The MIDA requires a four-step process (Figure 5.1) starting with the preparation of the manuscript and ending with searchable data accessible through the MIDA search engine and the Myaamia online dictionary.

To begin the process, the transcriptionist receives high-resolution (300–600 dpi) digital scans of each page of a document from the archival library and adds page and line numbers directly to the images for reference, in preparation for upload (Figure 5.2). Each page of the document, including blank pages, is numbered, and the text is numbered every ten (10) lines.

A spreadsheet is provided for the transcriptionist, organized to follow the layout of the original document (Figure 5.3). In this case, the transcriptionist enters the keyword (if available), the original French or English, and the original Miami-Illinois entries. Each entry carries its corresponding page, line, and phrase number for reference to the original document. A new spreadsheet was required approximately every 15 pages to keep the spreadsheets usable, as they grew quite large.
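The row layout described above can be sketched as a small data structure. This is a minimal illustration only, not the project's actual spreadsheet schema: the field names, the `Row` class, and the helper functions are hypothetical, and the 15-page limit is the approximate figure given in the text.

```python
from dataclasses import dataclass
from itertools import groupby

# Approximate figure from the text; the real cut-off varied.
PAGES_PER_SHEET = 15


@dataclass(frozen=True)
class Row:
    """One transcribed entry (hypothetical field names)."""
    page: int       # page number added to the scan
    line: int       # line number on that page
    phrase: int     # phrase number within the line
    keyword: str    # keyword, empty if the manuscript gives none
    gloss: str      # original French or English
    original: str   # original Miami-Illinois transcription


def sheet_index(page, pages_per_sheet=PAGES_PER_SHEET):
    """Return the 0-based spreadsheet index for a 1-based page number."""
    return (page - 1) // pages_per_sheet


def split_into_sheets(rows, pages_per_sheet=PAGES_PER_SHEET):
    """Group rows into spreadsheets of at most `pages_per_sheet` pages each."""
    ordered = sorted(rows, key=lambda r: (r.page, r.line, r.phrase))
    return {
        idx: list(group)
        for idx, group in groupby(
            ordered, key=lambda r: sheet_index(r.page, pages_per_sheet)
        )
    }


# Placeholder rows; the page, line, and phrase values are invented.
rows = [
    Row(16, 3, 1, "", "gloss-c", "form-c"),
    Row(1, 10, 1, "keyword-a", "gloss-a", "form-a"),
    Row(15, 2, 2, "", "gloss-b", "form-b"),
]
sheets = split_into_sheets(rows)
```

Keeping the page, line, and phrase numbers on every row is what allows any entry to be traced back to its place in the scanned original, regardless of which spreadsheet it ends up in.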

The spreadsheets were created in a web-based service that allows access to the documents from anywhere and, critically, allows multiple users to edit the same spreadsheet simultaneously. This was important so that collaborators who are not always in the same location can not only do their own work but also assist each other with reading the original text and translations.

The transcription and French translation steps are always done in spreadsheet form for reasons of efficiency. Figure 5.4 illustrates a few lines from a partially completed spreadsheet. Final transcription work from the LeBoullenger document alone resulted in approximately 25,000 spreadsheet lines of data.

Once the transcription and translation of the French text are complete, the spreadsheet data is uploaded into the MIDA database for further work. This work consists of providing, on the MIDA website, English translations and analyses of the original Miami-Illinois data transcribed from the LeBoullenger document. There are three primary parts to this task. The first step is filling in the contemporary spelling of the Miami-Illinois words. As mentioned above, the data in the Illinois dictionaries is recorded in an inconsistent writing system, which fails to mark all the phonemic contrasts of the language. Thus, it is necessary to re-transcribe the Miami-Illinois data into the modern, phonemic orthography. We also provide a place for supporting evidence to show how these phonemicizations are decided upon. There is also a cognate field, discussed earlier, where one enters cognate words drawn from the sister languages as well as original forms of the words drawn from other Miami-Illinois sources.

The second step consists of filling in corrected English glosses for the Illinois data. While it is essential to provide the literal English translations of the French glosses, often the original glosses are just as imprecise as the original transcriptions of the Illinois words. Thus, it is necessary to provide revised English translations, informed by our actual knowledge of Miami-Illinois grammar and data elsewhere in the corpus. For example, LeBoullenger’s form [ouris],” literally “little mouse,” though it is clear from the modern records that this word in fact means “chipmunk.” As a more subtle example, LeBoullenger glosses the imperative (nissahanto) as “abats cette perche,” or “knock down that pole,” though it is clear from the structure of this verb, and from related forms recorded elsewhere, that its actual meaning is more like “knock it down! (by instrument)”; that is, this verb explicitly indicates that the action is accomplished by some kind of tool, and it can refer to any kind of standing object being knocked down, not just poles.

Finally, the third step is to provide the grammatical analysis of the Miami-Illinois words by breaking them down into their constituent parts, including their stems and stem components. Such data enables users to search for all words in the manuscripts that share certain morphemes, so as to compare their usage across dozens or even hundreds of different words. Figures 5.5 and 5.6 show screenshots of a typical search and its results based on these last steps.
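The kind of morpheme-based lookup that this third step makes possible can be sketched with an inverted index. This is a minimal illustration under stated assumptions, not a description of how MIDA is implemented: the function names, the entry identifiers, and the morpheme labels are all hypothetical placeholders rather than real analyses.

```python
from collections import defaultdict


def build_morpheme_index(analyses):
    """Map each morpheme to the set of entry ids whose analysis contains it.

    `analyses` maps an entry id to the list of morphemes in that entry.
    """
    index = defaultdict(set)
    for entry_id, morphemes in analyses.items():
        for morpheme in morphemes:
            index[morpheme].add(entry_id)
    return index


def entries_sharing(index, morpheme):
    """Return the sorted ids of all entries analyzed as containing `morpheme`."""
    return sorted(index.get(morpheme, set()))


# Placeholder analyses; ids and morpheme labels are invented for illustration.
analyses = {
    "entry-001": ["STEM-A", "FINAL-X"],
    "entry-002": ["STEM-B", "FINAL-X"],
    "entry-003": ["STEM-A", "FINAL-Y"],
}
index = build_morpheme_index(analyses)
shared = entries_sharing(index, "FINAL-X")
```

Entries that have not yet been analyzed simply contribute nothing to the index, so a lookup of this kind remains usable while the analysis of the corpus is still incomplete.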

Needless to say, completion of all three of these categories for each entry in the manuscript is an extremely time-consuming project. Some words submit to a very simple, obvious analysis, while just as many words are of a more obscure origin and require extensive research before their entries can be even partially analyzed. A significant number of words resist analysis entirely. Given the size of the LeBoullenger manuscript as well as that of the other equally large or larger Jesuit dictionaries that will be added to MIDA in the course of time, it is clear that an exhaustive analysis of all the data in all of these manuscripts is a process that will span decades, long after the process of keying in the data and translating the French glosses is complete.

As noted above, the transcriptions and their French translations were originally recorded in spreadsheets. Translation and linguistic work required a more sophisticated database to support storage of stems, morphemes, and cognate information as well as comprehensive search capability to locate other entries in the corpus for cross-reference. These additional fields are added to MIDA after the spreadsheet data has been uploaded.

The MIDA contains an advanced search function allowing the user to search within or among manuscripts and by any data field. Figure 5.7 shows the field menu of the search function.

The MIDA also supports many additional features, including:

  • No login needed for general users (search only);
  • Accounts and login required for editors (search and update);
  • Comprehensive search engine for general users and researchers;
  • A feature that logs changes made to records by editors;
  • Administrative tools for spreadsheet data import/export, account management, and account creation.
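The change-logging feature listed above can be illustrated with a short sketch. This is an assumption about one way such a log could work, not MIDA's actual design: the `Record` and `Change` classes, the field names, and the editor name are hypothetical. The example values reuse the "little mouse" and "chipmunk" glosses discussed earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    """One logged edit to a single field of a record."""
    editor: str
    field_name: str
    old: str
    new: str
    at: datetime


@dataclass
class Record:
    """A database entry plus the history of edits made to it."""
    fields: dict
    history: list = field(default_factory=list)

    def update(self, editor, field_name, new_value):
        """Set a field and log the change; return False if nothing changed."""
        old_value = self.fields.get(field_name, "")
        if old_value == new_value:
            return False
        self.history.append(
            Change(editor, field_name, old_value, new_value,
                   datetime.now(timezone.utc))
        )
        self.fields[field_name] = new_value
        return True


# Hypothetical field name and editor; the glosses come from the text above.
record = Record({"english_gloss": "little mouse"})
changed = record.update("editor-1", "english_gloss", "chipmunk")
repeated = record.update("editor-1", "english_gloss", "chipmunk")
```

Storing the old value alongside the new one means that a revised gloss never erases the literal translation it replaced, which matters when analyses are revisited over many years.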
