Methods to Assess Reliability

Reliability is an index of random measurement error. Reliability coefficients are highest when there is no error (r = ±1.0) and lowest when the measure is all error (r = 0.0). Does the instrument distinguish between two or more behaviors with a reasonable level of confidence? Before an instrument is used to collect baseline and follow-up program data, its reliability must be documented. For a few instruments, reliability has been calculated with many different groups; for most, however, it has not. There are several approaches to assessing instrument reliability, and two factors determine which applies: the type of instrument (observer or external source vs. self-report) and when the instrument is applied (same time vs. different times). Table 4.9 shows the types of reliability.

Table 4.9 Types of Reliability

                                Time Measure Applied
Type of Measure                 Same Time                    Different Times
Observer or external source     Inter-rater reliability      Test-retest: stability
Self-report                     Internal consistency         Test-retest: stability

Inter-Rater Reliability

If two observers collect data at the same time, reliability can be estimated by having the two observers rate the same performance of a task, skill, or behavior: inter-observer or inter-rater reliability. This documents whether two people are seeing and interpreting the same responses or behaviors in the same way at the same time. Because both observers should be measuring the same actions, an instrument with perfect inter-observer reliability would produce r = 1.0; error variability lowers the coefficient from 1.0. The Pearson correlation is a common technique for measuring reliability. Because most observation scales use nominal or ordinal rating categories, however, Cohen's kappa is the accepted statistical technique (Cohen, 1975). Kappa corrects the simple percentage of agreement between two observers for agreement expected by chance. Forms of kappa have been developed to weight deviations from exact agreement and to handle more than two observers.
