Lauress L. Wise and Barbara S. Plake

The 2014 Standards for Educational and Psychological Testing (AERA, APA & NCME, 2014; hereafter referred to simply as the Standards) provide consensus standards for the development, use and evaluation of tests. Chapter 4 of the revised Standards focuses on test design and development, but important principles for test development are found throughout the Standards, particularly in the first three foundational chapters on validity, reliability and fairness. The 2014 version of the Standards elevates fairness to a foundational principle, along with validity and reliability/precision. The chapter on test development, Chapter 4, has been expanded to include test design as well as development. It emphasizes the need for test design to support both the validity of interpretations of test scores for intended uses and principles of fairness in access to the test and in the interpretation and use of test scores.

Chapter 4 provides an overarching standard,¹ describing the goals and intent of each of the specific standards concerning test design and development:

Standard 4.0

Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population.

The implications of the revised 2014 Standards for test design and development are discussed in two sections of this chapter. The first section covers issues with test design, beginning with the need for a clear description of the purpose(s) and intended use(s) of the test. It goes on to describe ways that the design of the test should provide evidence of the validity, reliability and fairness needed to support these purposes and intended uses. The second section reviews standards for the processes for developing a test consistent with the previously articulated design. It covers item and test form development, test administration instructions, specification and monitoring of scoring processes, scaling and equating, and score reporting.

