Measures serve different purposes within intervention research. They answer questions about intervention efficacy and effectiveness, for whom the intervention works and under what conditions, and how an intervention impacts outcomes. Measures also provide insight into the clinical significance (see Chapter 17) and cost-effectiveness of an intervention and the feasibility of implementing it on a broader scale. Today, a myriad of measures is available, and those included in intervention trials range from biomarkers to performance metrics and subjective evaluations.

Outcome measures generally provide evidence about the efficacy or effectiveness of an intervention. They are used as a barometer to judge the strength of the evidence supporting the impact of an intervention. They may also provide information about other aspects of an intervention such as feasibility, cost-effectiveness, and participant satisfaction with a treatment approach. Outcome measures also provide information about the clinical meaningfulness of the research findings or the impact an intervention has on an individual’s functioning with respect to everyday activities. Studies may include a variety of outcome measures such as clinical outcomes, quality-of-life metrics, satisfaction or usability ratings, or cost or resource utilization metrics. The choice of outcome measures depends on the stakeholders, the research questions, the target population, and the intended use of the evidence. In all cases, outcome measures must be clearly defined and unambiguous; the manner in which a measure is operationalized has broad implications for how it is assessed.

In randomized controlled trials (RCTs), choices have to be made as to what constitutes primary versus secondary outcome measures. In the Resources for Enhancing Alzheimer’s Caregivers Health II (REACH II) trial (Belle et al., 2006), the primary outcome measures were related to the five areas of caregiver risk/components of the intervention, but additional (secondary) measures were included that assessed variables such as use of formal support services, religiosity, and the caregiver’s perception of the caregiving experience. Primary outcomes are generally considered the critical outcomes with respect to decision making and are few in number. Secondary outcomes are often used to gather effect size data for subsequent trials or used as mediating variables to help explain the effects of an intervention.

In addition, data are sometimes collected on measures that characterize treatment populations to examine moderator effects of an intervention. Moderating variables help explain for whom the intervention works and under what conditions. This provides information on the external validity of the intervention. One might gather data on the ethnic/cultural affinity of the target population to examine whether the effects of an intervention vary as a function of ethnicity or culture. For example, we found that religious coping mechanisms and age moderated the effects of the REACH II intervention for African American and Hispanic dementia caregivers (Lee, Czaja, & Schulz, 2010).
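The logic of a moderator effect can be sketched with a small example. The block below is a minimal illustration, assuming a binary moderator and a simple difference-in-means comparison; the records, subgroup labels, and function names are hypothetical and are not drawn from REACH II or any other trial. In practice, moderation is usually tested with an interaction term in a regression model.

```python
from statistics import mean

# Hypothetical toy records: (treated, subgroup, outcome).
# Values are invented for illustration and do not come from any trial.
RECORDS = [
    (1, "A", 8.0), (1, "A", 10.0), (0, "A", 4.0), (0, "A", 6.0),
    (1, "B", 6.0), (1, "B", 7.0), (0, "B", 5.0), (0, "B", 6.0),
]


def treatment_effect(records, subgroup):
    """Mean outcome of treated minus control participants within one subgroup."""
    treated = [y for t, g, y in records if g == subgroup and t == 1]
    control = [y for t, g, y in records if g == subgroup and t == 0]
    return mean(treated) - mean(control)


def moderation_contrast(records, group_a, group_b):
    """Difference between the two subgroup-specific treatment effects."""
    return treatment_effect(records, group_a) - treatment_effect(records, group_b)


effect_a = treatment_effect(RECORDS, "A")
effect_b = treatment_effect(RECORDS, "B")
contrast = moderation_contrast(RECORDS, "A", "B")
print(effect_a, effect_b, contrast)
```

A contrast near zero would suggest that the intervention works similarly in both subgroups; a large contrast suggests that the subgroup variable moderates the effect, subject to an appropriate significance test.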

Similarly, one might gather data on attitudinal variables or self-efficacy to determine if these variables mediate the relationship between a treatment and an outcome. Mediating variables help explain how or why an intervention results in a change in an outcome measure, the mechanism of change. For example, computer anxiety has been found to mediate the relationship between age and technology uptake (Czaja, Charness, Fisk, et al., 2006).

The literature also suggests that changes in self-efficacy mediate the relationship between physical activity interventions and changes in physical activity behaviors (Lewis, Marcus, Pate, & Dunn, 2002). Baranowski, Cerin, & Baranowski (2009) maintain that change in desired outcomes is contingent upon changes in mediator variables (e.g., self-efficacy) and that interventions will be “successful” to the extent that mediator variables are targeted by the intervention at the appropriate levels. However, the selection of the appropriate mediators is key and must be guided by theory (see Chapter 4).
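One common way to quantify a mediated effect is the product-of-coefficients approach, sketched below. This is an illustrative assumption rather than a method described in this chapter: the data are invented toy values, and the function names are hypothetical. Path a is the effect of treatment on the mediator, path b is the effect of the mediator on the outcome controlling for treatment, and the indirect effect is a × b.

```python
def _centered_cross_product(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Ordinary least squares estimates for a single-mediator model.

    x: treatment indicator, m: mediator, y: outcome.
    Returns paths a, b, direct effect c', total effect c, and indirect effect a*b.
    """
    sxx = _centered_cross_product(x, x)
    smm = _centered_cross_product(m, m)
    sxm = _centered_cross_product(x, m)
    sxy = _centered_cross_product(x, y)
    smy = _centered_cross_product(m, y)

    a = sxm / sxx                       # treatment -> mediator
    total = sxy / sxx                   # treatment -> outcome (total effect)
    denom = sxx * smm - sxm ** 2        # determinant for the two-predictor model
    b = (smy * sxx - sxm * sxy) / denom       # mediator -> outcome, given treatment
    direct = (sxy * smm - sxm * smy) / denom  # treatment -> outcome, given mediator
    return {"a": a, "b": b, "direct": direct, "total": total, "indirect": a * b}


# Hypothetical toy data: treatment (0/1), mediator (e.g., self-efficacy), outcome.
treatment = [0, 0, 0, 1, 1, 1]
mediator = [1, 2, 3, 3, 4, 5]
outcome = [3, 6, 9, 10, 13, 16]

paths = mediation_paths(treatment, mediator, outcome)
print(paths)
```

With least squares estimates, the total effect equals the direct effect plus the indirect effect. A real analysis would also test the indirect effect, for example with bootstrapped confidence intervals.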

Sometimes, measures are also used to screen study participants with respect to inclusion/exclusion criteria. Screening measures might be related to an individual’s characteristics (e.g., age, gender, sexual preference), living conditions (e.g., community dwelling), health status, or relationship status (e.g., spouse). These types of measures are used to operationalize a study’s inclusion/exclusion criteria. For example, if an exclusion criterion was cognitive impairment, the screening measure might be a score of 26 or less on the Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975), which is commonly used as a screen for general cognitive status.
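Operationalizing a screening criterion amounts to writing down an explicit decision rule. The sketch below encodes the example above (exclusion for a Mini-Mental State Examination score of 26 or less); the function name and the default cutoff parameter are hypothetical, and a real protocol would specify its own cutoff.

```python
MMSE_MIN, MMSE_MAX = 0, 30  # possible score range of the MMSE


def is_excluded_for_cognitive_impairment(mmse_score, cutoff=26):
    """Return True if the score meets the exclusion criterion (score <= cutoff).

    The default cutoff of 26 follows the example in the text; it is a
    study-specific choice, not a fixed standard.
    """
    if not MMSE_MIN <= mmse_score <= MMSE_MAX:
        raise ValueError(f"MMSE score must be between {MMSE_MIN} and {MMSE_MAX}")
    return mmse_score <= cutoff


# Hypothetical screening scores for three candidates.
for score in (24, 26, 29):
    status = "excluded" if is_excluded_for_cognitive_impairment(score) else "eligible"
    print(score, status)
```

Stating the rule this explicitly makes the boundary case (a score of exactly 26) unambiguous for everyone who administers the screen.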

Studies may also include process measures that evaluate aspects of the intervention that are related (or not) to an outcome such as therapeutic alliance or the ingredients of an intervention. Other types of measures, which are particularly relevant today with the emphasis on implementation of evidence-based treatments, are those related to treatment implementation such as staff-training requirements, delivery characteristics, and indices of treatment fidelity and clinical significance. As noted, most studies include a variety of measures (see Table 14.1 for examples). Sometimes, measures may be used as mediating variables (e.g., perceived social support) and in other studies as outcome variables depending on the goals of the study.

TABLE 14.1 Examples of Measures and How They Might Be Used in Behavioral Intervention Research

Examples of Measures: Description of Measure, grouped by Role of Measure

  • Mini-Mental State Examination (MMSE) (Folstein et al., 1975): Cognitive status
  • Wide Range Achievement Test (WRAT) (Wilkinson, 1993): General reading level
  • Snellen Test (Berson, 1993): Basic visual acuity

  • Demographic Questionnaire (Czaja et al., 2006a): Age, education, occupational and socioeconomic status, culture/ethnicity, living arrangements
  • Technology Experience Questionnaire (Czaja et al., 2006b): Use of general technology and use/breadth of experience with computer technology and the Internet
  • General Health Perceptions Scale (Ware & Sherbourne, 1992): Self-reported physical health
  • Ten-Item Personality Inventory (TIPI) (Gosling, Rentfrow, & Swann, 2003): Personality traits

  • New General Self-Efficacy Scale (Chen, Gully, & Eden, 2001): Belief in one's overall competence across a wide variety of situations
  • Computer Attitudes (Jay & Willis, 1992): Three dimensions of computer attitudes (comfort, efficacy, and interest)
  • Family Caregiving Factors Inventory (Shyu, 2000): Caregiver's expectations of the caregiving role
  • Perceived Stress Scale (Cohen, Kamarck, & Mermelstein, 1983): Degree to which situations in one's life are perceived as stressful

  • Center for Epidemiological Studies-Depression Scale (Radloff, 1977): Depressive symptoms
  • Functional Health and Well-Being (SF-36; Ware & Sherbourne, 1992): Health-related quality of life
  • Revised Memory and Behavior Problem Checklist (Teri et al., 1992; Zarit, Orr, & Zarit, 1985): Caregiver burden
  • The Community Health Activities Model Program for Seniors (CHAMPS) Questionnaire (Stewart et al., 2001): Self-reported physical activity

  • Instrumental Activities of Daily Living (Lawton & Brody, 1969): Competence in higher order everyday activities (e.g., food preparation, money management)
  • Katz Index of Independence in Activities of Daily Living (Katz ADL) (Katz, Ford, Moskowitz, Jackson, & Jaffe, 1963): Competence in basic activities of daily living (e.g., bathing, toileting)

  • Schizophrenia Cognition Rating Scale (SCoRS) (Keefe, Poe, Walker, Kang, & Harvey, 2006): Ratings of cognitive function
  • Independent Living Skills Inventory (ILSI) (Menditto et al., 1999): Clinical rating scale of real-world, everyday functioning

  • Measures of task performance (task specific): Task completion time, number and types of errors, accuracy
  • Short Physical Performance Battery (SPPB) (Bean, Vora, & Frontera, 2004; Nelson et al., 2004): Timed measures of standing, balance, walking speed, and ability to rise from a chair
  • Everyday Problem Solving Test (Marsiske & Willis, 1995): Ability to solve problems in several domains (e.g., medication use, financial management, transportation)
  • Measures of behavioral patterns (e.g., sensing data): Sleep activity, communication patterns, movement patterns

Physiological Indices

  • Weight, BMI, brain imaging, EEG, cortisol, heart rate, cholesterol

  • Cost of Care Index (Kosberg & Cairl, 1980): Perceived worthiness of providing care
  • Health care costs: Number and type of insurance claims, medication costs, hospitalizations, outpatient visits, preventative health visits, use of services

Note: The role categories are not mutually exclusive—some measures may be used as mediators or outcome measures depending on the study goals.

For example, in a recently completed trial that examined the efficacy of a software application (PRISM) on outcomes related to social connectivity and quality of life among older adults at risk for social isolation (Czaja et al., 2015), the assessment battery included screening measures to evaluate cognitive status (Mini-Mental State Examination; Folstein et al., 1975), measures to characterize the sample/potential moderator variables (e.g., educational level), potential mediating variables (e.g., measures of component cognitive abilities), and primary (e.g., social support, loneliness) and secondary outcomes (e.g., computer proficiency). The trial also included measures of usability and perceived usefulness of the technology as well as interview data that captured more in-depth perceptions of the PRISM system.

Generally, the selection of the appropriate outcome measures for an intervention trial should be based on (a) the theoretical constructs or models guiding the intervention; (b) the research topic, questions, and hypotheses; (c) the psychometric properties of the measures; (d) the assurance that change in a measure is meaningful with respect to the target population and the intervention being evaluated; and (e) the previous literature. As noted in Chapter 4, the intervention evaluated in the REACH II trial was based on a stress process model that suggested multiple factors contribute to caregiver burden and distress. The intervention was multicomponent and addressed five areas of caregiver risk: depression and emotional well-being, burden, self-care and healthy behaviors, care recipient problem behaviors, and social support. The primary outcome measures chosen for the trial were also linked to these areas of risk and to the components of the intervention (Table 14.1).

Other important selection criteria for outcome measures are related to the feasibility and cost of collecting the data, the resources available with respect to data collection and analysis, and participant burden. For example, in cognitive aging research, the use of brain imaging is commonly employed to gather data on brain functioning or activity relative to behavior or cognitive operations. Collection of this type of data is feasible only if the appropriate imaging equipment is available, if there are sufficient funds available, and if someone on the research team has the requisite knowledge to collect and analyze the data. Finally, the choice of measures may vary according to stage in the intervention pipeline. In the initial development of an intervention, the measures may reflect responses from a focus group about the content of the intervention, whereas further in the pipeline, where a study is conducted to evaluate the efficacy of the intervention, the measures may reflect some psychosocial construct or change in behavior.

As noted, in behavioral intervention research, decisions must be made about outcome measures as well as measures that may be used for screening, as mediating or moderator variables, and to assess issues relevant to treatment implementation and clinical significance. The selection of measures requires knowledge of (a) the relevant intervention literature and theories/models, (b) the psychometric properties (e.g., reliability and validity) of the measures, (c) the practical aspects/constraints of administering the measures, (d) the appropriateness of the measures for the target population (e.g., some measures may be culturally biased), (e) the currency of the measures (e.g., measures that assess attitudes toward technology may lose relevance if they ask questions about technology that is no longer available), and (f) associated effect sizes to help guide calculations about sample size and also guide understanding of the practical importance of a particular finding.
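To illustrate how an expected effect size guides sample size, the sketch below uses the standard normal-approximation formula for comparing two group means with equal group sizes: n per group = 2(z for 1 - alpha/2 + z for power)^2 / d^2, where d is the standardized mean difference. The function name and default values are assumptions chosen for illustration; the approximation gives slightly smaller numbers than t-based calculations, and a real trial would also account for attrition and design features.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided, two-group
    comparison of means, using the normal approximation.

    effect_size_d: expected standardized mean difference (Cohen's d).
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2)


# Smaller expected effects require substantially larger samples.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Because the required sample grows with the inverse square of the effect size, halving the expected effect roughly quadruples the number of participants needed.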

In the following sections, we provide a basic review of types of measures and the general criteria for measure selection. We proceed with a word of caution: the choice of measures for a study can be overwhelming; in most treatment domains, there are large numbers of measures available. It is also sometimes difficult to find consensus about which measure or measures are optimal with respect to answering a research question. For example, in the cognitive literature, there is a wide variety of measures and techniques available for measuring various aspects of cognition; however, researchers do not always agree on the best approach or how best to measure abilities such as working memory or attention. In this regard, there have been attempts to harmonize measures across studies. One example in the United States is the National Institutes of Health (NIH) Toolbox for the Assessment of Neurological and Behavioral Function, which includes a set of measures to assess cognitive, emotional, motor, and sensory functions.

The measurement literature is constantly evolving, and advances in technology such as imaging techniques, sensing devices, and wearable technologies allow different ways to capture changes in behavior. As will be discussed later in this chapter, developments in technology are also changing the way outcome data are collected. For example, computer adaptive testing (CAT) allows assessments to be specifically targeted to an individual, and computer-assisted telephone interviewing (CATI) is a telephone-interviewing technique in which the interviewer is guided by a software application. Measures can also become obsolete because they are cumbersome, are no longer relevant, or have been superseded by an improved method for assessing the construct of interest. For example, in our Center for Research on Aging and Technology Enhancement (CREATE), we have to update our measure of technology experience to ensure that it is up to date with respect to current technologies; telephone answering machines are becoming obsolete whereas smartphones are becoming ubiquitous. This underscores the need to keep abreast of the current literature within any particular area.
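As a rough illustration of how computer adaptive testing targets an assessment to an individual, the sketch below selects the next item by maximum information under a Rasch (one-parameter logistic) model. The model choice, the item bank, and all function names are assumptions made for this example; the chapter does not specify how any particular CAT system is implemented.

```python
from math import exp


def rasch_probability(theta, difficulty):
    """Probability of a correct/endorsed response under the Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability level theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


def next_item(theta, item_bank, administered):
    """Pick the not-yet-administered item that is most informative at theta.

    item_bank maps item names to difficulty values; returns None when the
    bank is exhausted.
    """
    candidates = {name: b for name, b in item_bank.items() if name not in administered}
    if not candidates:
        return None
    return max(candidates, key=lambda name: item_information(theta, candidates[name]))


# Hypothetical item bank: item name -> difficulty on the same scale as theta.
ITEM_BANK = {"easy": -1.0, "medium": 0.2, "hard": 1.5}

first = next_item(0.0, ITEM_BANK, administered=set())
second = next_item(0.0, ITEM_BANK, administered={first})
print(first, second)
```

Under this model, the most informative item is the one whose difficulty lies closest to the current ability estimate, which is why each respondent can receive a different, shorter sequence of items.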
