Methods for Generating Multilingual Test Items

Three AIG item modelling methods can be used to generate items in multiple languages. The methods are differentiated by how the content of each language is structured in the item model.

Language-Dependent Item Modelling

The first method applies the three-step process described in Chapters 2 to 4 separately to each language group. In step 1, the SME identifies the content required for the generated items by developing a cognitive model. This model should be common to all language groups. In step 2, the SME develops an item model that specifies where the content is placed in each generated item. The content in this model is expressed in the SME's native language. In step 3, computer-based algorithms place the content from the cognitive model into the template of the item model, thereby generating large numbers of new items in that language.

Hence one strategy for scaling item development in a multilingual context is to use AIG for each language group, which means that each language requires its own item model. The SME who wanted a bank with half a million medical education items would require approximately 400 item models (i.e., if each item model generated, on average, 1,248 medical items, as described in Chapter 1), which could be created at a rate of 33 per month. In a multilingual context, this rate of production would need to double for two languages.

The strength of this AIG method is that this rate of production is a significant improvement over the traditional multilingual item development approach. The weakness is that the items will not necessarily be equivalent because each item model is language dependent: AIG is used to scale the development process by creating item models for each language group, but no attention is directed at comparing the content in the item models across the language groups. For this reason, the first method can be described as language-dependent item modelling.
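The three steps and the production figures above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the templates, the variable names (`age`, `symptom`), the content values, and the helper functions are all hypothetical. Only the figures 400, 1,248, and 33 come from the text.

```python
from itertools import product

# Step 1 (hypothetical content): values identified by the SME for each
# variable in the cognitive model, expressed in each language.
content = {
    "en": {"age": ["25", "40", "65"], "symptom": ["a fever", "a cough"]},
    "fr": {"age": ["25", "40", "65"], "symptom": ["de la fièvre", "une toux"]},
}

# Step 2 (hypothetical item models): one template per language, written
# independently. Nothing here checks that the two stems mean the same thing.
item_models = {
    "en": "A {age}-year-old patient presents with {symptom}. "
          "What is the most likely diagnosis?",
    "fr": "Quel est le diagnostic le plus probable chez un patient "
          "de {age} ans qui présente {symptom} ?",
}


def generate_items(template, variables):
    """Step 3: place every combination of content values into the item model."""
    names = list(variables)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(variables[name] for name in names))
    ]


items = {lang: generate_items(item_models[lang], content[lang])
         for lang in item_models}

# Production figures quoted in the text.
MODELS_NEEDED = 400
ITEMS_PER_MODEL = 1_248
MODELS_PER_MONTH = 33


def bank_size(models=MODELS_NEEDED, per_model=ITEMS_PER_MODEL):
    """Total items generated if every model yields the average number of items."""
    return models * per_model


def required_monthly_rate(languages, rate=MODELS_PER_MONTH):
    """Item models per month needed to keep the same schedule for several languages."""
    return rate * languages


if __name__ == "__main__":
    print(len(items["en"]), "English items;", len(items["fr"]), "French items")
    print("Bank size:", bank_size())
    print("Models per month for two languages:", required_monthly_rate(2))
```

Because each template is authored separately, the sketch produces the same number of items in both languages but offers no guarantee that paired items are equivalent in meaning, which is the weakness described above.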

When construct equivalence is a testing requirement, language-independent items considered to be parallel across the linguistic groups are needed. Construct equivalence includes both conceptual and functional equivalence (Hambleton, 2005). In AIG, conceptual equivalence requires that the same cognitive model be used across language groups, while functional equivalence means that the same content in the item model is used across language groups. Hence the second AIG method uses the three-step process, except that the content in the item models must be developed so that it is equivalent across languages. Items are generated using the same cognitive model (conceptual equivalence) with parallel item models (functional equivalence).

Successive-Language Item Modelling

One way to create equivalent content in the item models is with the successive item development approach (van de Vijver & Poortinga, 2005). With successive development, the content in the item model is first developed by a monolingual SME in the source language for use in the same monolingual and monocultural context. The content in the item model is then translated, either by the original SME or by a translator, for use with examinees in a different language and/or culture.

The quality of the target language translation can be evaluated in one of two ways. The first approach involves translating the source language item model content into the target language a second time. This second translation can be conducted by humans or by a statistical machine translation service, such as Google Translate. A translator then evaluates the equivalence of the source and translated target item model versions. The second approach involves re-translating the translated item model content back into the source language using a different translator. This back-translation can also be conducted by humans or by a statistical machine translation service. The comparability of meaning between the original and back-translated item models can be assessed by the original SME, an independent reviewer, or a committee of reviewers (Brislin, 1986; Hambleton & Bollwark, 1991).

The strength of the second AIG method, like the first, is that the rate of production is a significant improvement over the traditional multilingual item development approach. The second method also has the benefit of generating items that are intended to be equivalent across language groups because the content in each item model is judged to be language independent using the successive development process. In other words, AIG is used to scale the item development process by creating an item model in one language that, in turn, is used to create the equivalent item model in another language. These item models are used together to generate multilingual test items.

The weakness of this method is that it relies on a single baseline language, the source language, for establishing equivalency. That is, a source language is used to establish meaning in the item model, and then the target language item model is created to be equivalent in meaning to the source. The assumption is that the source language can be used to produce the same meaning for all target languages. The second AIG method can be described as successive-language item modelling.
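The idea of generating from a source item model and its translated counterpart can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure used by the authors: the two templates, the content values, and the functions `placeholders` and `generate_parallel_items` are hypothetical. The check on variable names is only a structural safeguard; equivalence of meaning still depends on the translation review described above.

```python
from itertools import product
from string import Formatter


def placeholders(template):
    """Return the set of variable names used in an item model template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Hypothetical source-language item model, developed first by the SME.
source_model = ("A {age}-year-old patient presents with {symptom}. "
                "What is the most likely diagnosis?")
# Hypothetical target-language item model, produced by translating the source.
target_model = ("Un patient de {age} ans présente {symptom}. "
                "Quel est le diagnostic le plus probable ?")

# Shared cognitive model: the same variables in both languages, with values
# aligned by position so that each source value has a translated counterpart.
content = {
    "age": {"source": ["25", "40"], "target": ["25", "40"]},
    "symptom": {"source": ["a fever", "a cough"],
                "target": ["de la fièvre", "une toux"]},
}


def generate_parallel_items(source, target, content):
    """Generate (source item, target item) pairs from the two item models."""
    if placeholders(source) != placeholders(target):
        raise ValueError("source and target item models use different variables")
    names = sorted(content)
    index_ranges = [range(len(content[name]["source"])) for name in names]
    pairs = []
    for combo in product(*index_ranges):
        src = {n: content[n]["source"][i] for n, i in zip(names, combo)}
        tgt = {n: content[n]["target"][i] for n, i in zip(names, combo)}
        pairs.append((source.format(**src), target.format(**tgt)))
    return pairs


pairs = generate_parallel_items(source_model, target_model, content)

if __name__ == "__main__":
    for source_item, target_item in pairs:
        print(source_item)
        print(target_item)
        print()
```

Every generated pair is built from the same combination of content values, so each target item has a source item it is meant to match. The source language remains the single reference point, which mirrors the weakness noted above.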
