What Is Documentation to Support Test Score Interpretation and Use?
In current practice, the documentation for a testing program generally serves multiple purposes: for example, to (a) describe the test in sufficient detail to enable test users to select tests and interpret and use test scores appropriately, and (b) provide technical information on the psychometric characteristics of tests. Different stakeholders in a testing program—for example, test developers and test users—bring diverse information needs and purposes to the table. Testing program documentation must provide a sufficiently comprehensive and thorough description of the test to inform stakeholders who select and use tests that are appropriate for a given purpose (AERA et al., 2014). Becker and Pomplun (2006) also point out that testing program documentation constitutes a form of insurance against potential legal challenges to a testing program, as such evidence is used to defend against claims of bias and unfairness. In addition, such documentation should provide explicit guidance about appropriate and intended interpretations of test performances, as well as inappropriate and unintended uses of test results. Finally, testing program documentation also provides the basis for validity arguments (Kane, 2006, 2013, this volume), an idea that we explore in more detail in later sections.
Many types of information are communicated in testing program documentation. For example, the Standards (2014) chapter about supporting documentation for tests explicitly proposes information on “the nature and quality of the test, the resulting scores, and the interpretations based on the test scores” (AERA et al., 2014, p. 123). Becker and Pomplun (2006) identify evidence that they argue ought to be included in all technical reports: test and item level information, including evidence related to item performance; scaling and equating results; and information pertaining to reliability and measurement error. Considering all the design, development and implementation decisions that are made during the life of a test, a more comprehensive list of topics to document might include the technical topics in those two lists, plus others closely related to interpreting and using test scores: explicit statements of intended score interpretations and uses; identification of targeted test takers; description of test development procedures and tools, as might be used in a principled design approach (e.g., evidence-centered design [ECD]; see Mislevy & Haertel, 2006; Riconscente & Mislevy, this volume); description of quality control procedures in item development and throughout the test administration, scoring, analysis and reporting process; statements of appropriate test preparation activities and evidence of their efficacy; and test security requirements and procedures. Much of this content typically is not included in technical reports, as we illustrate ahead. We discuss four types of documentation in subsequent sections: technical reports; other technical documents; test administration manuals; and score reports, interpretive guides and score interpretation training materials. Finally, we propose a new document to support score interpretation and use, the interpretation/use argument (IUA) report.