Step 2: Item Model Development
With the content identified and structured using the cognitive model in step 1, this content must now be positioned within a template that, in turn, will create the assessment tasks. This template is called an item model. Item models (Bejar, 1996, 2002; Bejar et al., 2003; LaDuca, Staples, Templeton & Holzman, 1986) have been described using different terms, including schemas (Singley & Bennett, 2002), blueprints (Embretson, 2002), templates (Mislevy & Riconscente, 2006), forms (Hively, Patterson & Page, 1968), frames (Minsky, 1974) and shells (Haladyna & Shindoll, 1989). Item models contain the components in an assessment task that can be manipulated for item generation. These components include the stem, the options and the auxiliary information. The stem contains context, content, item and/or the question the examinee is required to answer. The options include a set of alternative answers with one correct option and one or more incorrect options or distractors. Both stem and options are required for multiple-choice item models. Only the stem is created for constructed-response item models. Auxiliary information includes any additional content, in either the stem or the options, required to generate an item, including text, images, tables, graphs, diagrams, audio and/or video.
Types of Item Models
The principles, standards, guidelines and practices used for traditional item development (e.g., Case & Swanson, 2002; Downing & Haladyna, 2006; Haladyna & Rodriguez, 2013; Rodriguez, this volume; Schmeiser & Welch, 2006) currently provide the foundational concepts necessary for creating item models. A literature on item model development is also beginning to emerge (e.g., Gierl et al., 2008; Gierl & Lai, 2013b), and some illustrative examples are available (e.g., Bejar et al., 2003; Case & Swanson, 2002; Gierl et al., 2008; Gierl & Lai, 2013b). Two types of item models can be created for AIG: 1-layer and n-layer item models (Gierl & Lai, 2012a).
1-layer item model
The goal of item generation using the 1-layer item model is to produce new test items by manipulating a relatively small number of elements in the model. We use the item model element as the unit of analysis in our description because it is the most specific variable in the cognitive model that is manipulated to produce new items. 1-layer item modeling currently dominates practical applications in AIG. Often, the starting point is to use a parent item. The parent can be found by reviewing items from previous test administrations, by drawing on a bank of existing test items, or by creating the parent item directly. The parent item for the infection and pregnancy example was presented in Figure 21.2. The parent item highlights the underlying structure of the model, thereby providing a point of reference for creating alternative items. Then, an item model is created from the parent by identifying elements that can be manipulated to produce new items.
One disadvantage of using a 1-layer item model for AIG is that relatively few elements can be manipulated. The manipulations are limited because the number of potential elements in a 1-layer item model is relatively small (i.e., the number of elements is fixed to the total number of elements in the stem). Unfortunately, by restricting the element manipulations to a small number, the generated items may have the undesirable quality of appearing too similar to one another. These items are often described as isomorphic. In our experience, generated isomorphic items from 1-layer models are referred to pejoratively by many test developers as “clones,” “ghost” items or “Franken-items.” Isomorphic items are often perceived to be simplistic and easy to produce.
One early attempt to address the problem of generating isomorphic items was described by Gierl et al. (2008). They developed a taxonomy of 1-layer item model types. The purpose of this taxonomy was to provide test developers with design guidelines for creating item models that yield diverse types of generated items. Gierl et al.’s strategy for promoting diversity was to systematically combine and manipulate those elements in the stem and options typically used for item model development. According to Gierl et al., the elements in the stem can function in four different ways. Independent indicates that the elements in the stem are unrelated to one another. Hence, a change in one stem element will not affect the other stem elements. Dependent indicates that all elements in the stem are related to one another. A change in one stem element will affect the other stem elements. Mixed includes independent and dependent elements in the stem, where at least one pair of stem elements is related. Fixed represents a constant stem format with no variation. The elements in the options can function in three different ways. Randomly selected options refer to the manner in which the distractors are selected, presumably, from a list of possible alternatives. The distractors in this case are selected randomly. Constrained options mean that the keyed option and the distractors are generated according to specific constraints, such as algorithms, rules, formulas or calculations. Fixed options occur when both the keyed option and distractors are fixed and therefore do not change across the generated items. A matrix of 1-layer item model types can then be produced by crossing the four different elements in the stem and the three different elements in the options. Gierl et al. claimed that the taxonomy is useful because it provides the guidelines necessary for designing diverse 1-layer item models by outlining their structure, function, similarities and differences. It can also be used to ensure that test developers do not design item models where the same elements are constantly manipulated or where the same item model structure is frequently used.
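The crossing of stem and option types described above can be illustrated with a short Python sketch. The list names and the label strings are assumptions made for illustration only; the sketch simply enumerates the 4 x 3 cells of the matrix and takes no position on whether every cell yields a usable item model.

```python
from itertools import product

# Labels follow the taxonomy described in the text; the variable names
# are illustrative assumptions, not part of the original taxonomy.
STEM_TYPES = ["independent", "dependent", "mixed", "fixed"]
OPTION_TYPES = ["randomly selected", "constrained", "fixed"]

# Cross every stem type with every option type to form the matrix cells.
taxonomy = list(product(STEM_TYPES, OPTION_TYPES))

for stem_type, option_type in taxonomy:
    print(f"{stem_type} stem / {option_type} options")
```

The cell ("mixed", "constrained") corresponds to an item model with both independent and dependent stem elements and rule-generated options.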
Figure 21.4 contains an example of the 1-layer item model based on the Figure 21.3 parent item. For this 1-layer item model, the stem contains two integers (GESTATION PERIOD; AGE) and two strings (TYPE OF INFECTION; ALLERGY). Using the Gierl et al. (2008) taxonomy described earlier, this item model would be described as a mixed stem with constrained options. The GESTATION PERIOD integer and the TYPE OF INFECTION and ALLERGY string elements in the stem are dependent because the values they assume will depend on the combination of content in the item model. The AGE integer, however, is free to vary with all combinations of items and is therefore independent of the other elements. Because both independent and dependent elements are included, the stem is mixed. The options are constrained by the combination of integer and string values specified in the stem, regardless of the AGE element.
Figure 21.4 The 1-layer item model for the infection and pregnancy example.
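A minimal Python sketch of a mixed-stem, constrained-option 1-layer item model follows. It is not the model in Figure 21.4: the stem wording, the element values, the names `STEM`, `DEPENDENT_COMBINATIONS`, `AGES` and `generate_items`, and the keyed answers are all hypothetical placeholders with no clinical meaning. The sketch only shows how dependent elements can be listed as allowed combinations while an independent element varies freely.

```python
from itertools import product

# Hypothetical stem; ALLERGY carries its own leading space so it can be empty.
STEM = ("A {age}-year-old pregnant female at {gestation} weeks gestation"
        "{allergy} presents with signs and symptoms consistent with {infection}.")

# Dependent elements: only the listed combinations are allowed, and each
# combination determines the keyed option (constrained options).
# All values are placeholders, not clinical content.
DEPENDENT_COMBINATIONS = [
    {"gestation": 10, "infection": "infection A", "allergy": "",
     "key": "treatment X"},
    {"gestation": 10, "infection": "infection A",
     "allergy": " with a drug allergy", "key": "treatment Y"},
    {"gestation": 30, "infection": "infection B", "allergy": "",
     "key": "treatment Z"},
]

# Independent element: AGE varies freely and never affects the key.
AGES = [24, 28, 32]


def generate_items():
    """Cross the independent element with each allowed dependent combination."""
    items = []
    for age, combo in product(AGES, DEPENDENT_COMBINATIONS):
        stem = STEM.format(age=age, gestation=combo["gestation"],
                           allergy=combo["allergy"],
                           infection=combo["infection"])
        items.append({"stem": stem, "key": combo["key"]})
    return items


items = generate_items()
print(len(items))
print(items[0]["stem"])
```

Listing the dependent elements as whole combinations, rather than crossing them freely, is one simple way to keep infeasible stems from being generated.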
n-layer item models
The second type of item model can be described as multiple- or n-layer (Gierl & Lai, 2012a). The goal of AIG using the n-layer item model is to produce items by manipulating a relatively large number of elements at two or more levels in the model. Much like 1-layer item modeling, the starting point for the n-layer model is to use a parent item. But unlike the 1-layer model, where the manipulations are constrained to a linear set of generative operations using a small number of elements at a single level, the n-layer model permits manipulations of a nonlinear set of generative operations using elements at multiple levels. As a result, the generative capacity of the n-layer model is high. The concept of n-layer item generation is adapted from the literature on syntactic structures of language (e.g., Higgins, Futagi & Deane, 2005). Language is often structured hierarchically, meaning that content or elements are often embedded within one another. This hierarchical organization can also be used as a guiding principle to generate large numbers of meaningful test items. The n-layer item model is therefore a flexible template for expressing different syntactic structures, thereby permitting the development of many different but feasible combinations of embedded elements. The n-layer structure can be described as a model with multiple layers of elements, where each element can be varied simultaneously at different levels to produce different items.
A comparison of the 1-layer and n-layer item model is presented in Figure 21.5. For this example, the 1-layer model can provide a maximum of four different values for element A. Conversely, the n-layer model can provide up to 64 different values by embedding the same four values for elements C and D within element B. Because the maximum generative capacity of an item model is the product of the ranges in each element (Lai, Gierl & Alves, 2010), the use of an n-layer item model will always increase the number of items that can be generated relative to a 1-layer structure.
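The arithmetic behind the comparison can be checked with a short Python sketch. It assumes, as one reading of the example, that the n-layer model has three elements with four values each; the function name `generative_capacity` is introduced here for illustration only.

```python
from math import prod


def generative_capacity(ranges):
    """Maximum number of items: the product of the range of each element."""
    return prod(ranges)


# 1-layer model: a single element (A) with four values.
one_layer = generative_capacity([4])

# n-layer model, assuming three elements (B, C, D) with four values each.
n_layer = generative_capacity([4, 4, 4])

print(one_layer, n_layer)
```

This is an upper bound: constraints that rule out infeasible combinations of element values would reduce the number of items actually generated.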
One important advantage of using an n-layer item model is that more elements can be manipulated simultaneously, thereby expanding the generative capacity of the model. Another important advantage is that the generated items will likely appear to be quite different from one another because more content in the model is manipulated. Hence, n-layer item modeling can help address the problem of cloning that concerns some test developers because large numbers of systematic manipulations are occurring in each model, thereby promoting heterogeneity in the generated items. The disadvantage of using an n-layer structure is that the models are complex and therefore challenging to create. Also, the effect of embedding elements, while useful for generating large numbers of diverse items, will make it challenging to predict the psychometric characteristics of the generated items using precalibration statistical methods.
Figure 21.5 A comparison of the elements in a 1-layer and n-layer item model.
An n-layer infection and pregnancy item model is presented in Figure 21.6. This example illustrates how the structure of the item can be manipulated to produce more diverse generated items. In addition to manipulating the integer and string values, as with the 1-layer example, we now embed the integers and strings within one another to facilitate the generative process. For the n-layer example, two layers are used. The first layer is sentence structure. The first sentence states, “A [[AGE]]-year-old pregnant female at [[GESTATION PERIOD]] weeks gestation[[ALLERGY]] presents with clinical and radiological signs and symptoms consistent with [[TYPE OF INFECTION]].” The second sentence states, “Suppose a pregnant woman[[ALLERGY]] was admitted with signs consistent with [[TYPE OF INFECTION]]. She was [[GESTATION PERIOD]] weeks into her term.” The second layer includes the same elements specified in the 1-layer model (see Figure 21.4), namely, Type of Infection, Allergy and Gestation Period. In sum, by introducing layered elements, more diverse items can be generated because the 1-layer model is a subset of the n-layer model.
Figure 21.6 An n-layer item model for the infection and pregnancy example.
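The two-layer structure can be sketched in Python as follows. The two sentence structures are taken from the text, with the double-bracket elements rewritten as format fields; the element values and the names `SENTENCE_STRUCTURES`, `ELEMENT_COMBINATIONS`, `AGES` and `generate_stems` are hypothetical placeholders and do not reproduce Figure 21.6.

```python
from itertools import product

# Layer 1: sentence structures, as quoted in the text.
SENTENCE_STRUCTURES = [
    ("A {age}-year-old pregnant female at {gestation} weeks gestation{allergy} "
     "presents with clinical and radiological signs and symptoms consistent "
     "with {infection}."),
    ("Suppose a pregnant woman{allergy} was admitted with signs consistent "
     "with {infection}. She was {gestation} weeks into her term."),
]

# Layer 2: element values embedded in each sentence structure.
# Placeholder values only, not clinical content.
ELEMENT_COMBINATIONS = [
    {"gestation": 10, "allergy": "", "infection": "infection A"},
    {"gestation": 30, "allergy": " with a drug allergy",
     "infection": "infection B"},
]
AGES = [24, 32]


def generate_stems():
    """Embed every layer-2 combination in every layer-1 sentence structure."""
    stems = []
    for template, age, combo in product(SENTENCE_STRUCTURES, AGES,
                                        ELEMENT_COMBINATIONS):
        # str.format ignores unused keyword arguments, so the second
        # structure, which has no AGE element, simply drops the age.
        stems.append(template.format(age=age, **combo))
    # Remove duplicates (from the unused AGE) while preserving order.
    return list(dict.fromkeys(stems))


stems = generate_stems()
for stem in stems:
    print(stem)
```

Restricting `SENTENCE_STRUCTURES` to its first entry recovers a 1-layer model, which is one way to see the 1-layer model as a subset of the n-layer model.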