# Measurement Over Time

So far we have addressed issues related to cross-sectional designs, in which one might want to evaluate the effects of a treatment versus an active placebo condition (e.g., attention control) in three different older ethnic/cultural groups. Longitudinal data analyses differ from cross-sectional analyses in that each subject has a set of observations measured repeatedly over time, and these observations are intercorrelated. As a result, standard regression methods that ignore this correlation would yield inefficient estimates of the beta weights and potentially inaccurate conclusions.

One of the most common experimental designs is a “pre-post” two-group study in which a single baseline health status measurement is obtained, an intervention is administered to the treatment group but not to the placebo group, and a single follow-up measurement is then collected from participants in both groups. In this design, change in the outcome(s) is associated with the intervention exposure, and the two groups can be compared to see whether the change in the outcome differs between participants who are actively treated and control group participants. In other longitudinal designs, follow-up measurements could be made at more time points, for example, at baseline and then at 6, 12, and 18 months after the intervention exposure. This would necessitate a 3 (Groups) X 2 (Intervention) X 4 (Time) design. If there were multiple measures for the DV (e.g., depression, social isolation), multivariate approaches such as multivariate analysis of variance (MANOVA) would be employed. Some investigators might also consider transforming and standardizing the individual depression measures to the same scale to create an average composite measure. The analysis of this design would generate an *F* value for group, intervention, and time. Although a statistically significant *F* value (*p* < .05 or *p* < .01) would indicate overall main effects of group and intervention across all measurement occasions, as well as general changes in scores over time, it would provide little information on whether the effect of the intervention differed by group or how such differences might manifest themselves over time. The interaction terms are therefore of primary interest: the two-way Intervention X Time interaction tests whether the intervention produced different effects over time, and the three-way Group X Intervention X Time interaction tests whether there were group differences in the intervention effect over time.
Following a statistically significant interaction, one could examine the actual mean differences and control for multiple comparisons by using procedures such as Tukey’s HSD test, the Scheffé procedure, or the more conservative Bonferroni procedure.
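
To make the logic of controlling for multiple comparisons concrete, the following is a minimal sketch of the Bonferroni procedure only (not Tukey’s HSD or the Scheffé procedure). The function name `bonferroni` and the unadjusted *p* values are hypothetical and chosen purely for illustration; they are not results from any study.

```python
from itertools import combinations


def bonferroni(p_values, alpha=0.05):
    """Multiply each p-value by the number of comparisons (capped at 1.0)
    and flag those that remain at or below alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Four measurement occasions give six pairwise comparisons.
occasions = ["baseline", "6 months", "12 months", "18 months"]
pairs = list(combinations(occasions, 2))

# Hypothetical unadjusted p-values, one per pair (illustrative numbers only).
raw_p = [0.004, 0.012, 0.030, 0.200, 0.008, 0.650]

adjusted, reject = bonferroni(raw_p)
for pair, p_adj, flag in zip(pairs, adjusted, reject):
    label = "significant" if flag else "not significant"
    print(pair, round(p_adj, 3), label)
```

With six comparisons, each unadjusted *p* value is in effect judged against .05 / 6, so only the smallest values survive the correction.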

Let’s suppose that a researcher is collecting interview data and, owing to the limited number of interviewers, not all participants can be interviewed on the same day or in the same week. Thus, there would be unequal spacing between measurement occasions for participants in the study (e.g., Person A was followed up at 6 months after the baseline measurement; Person B was followed up at 7 months after the baseline measurement). This is a common scenario in field-based intervention research, in which the timing of follow-up assessments cannot be fully controlled. Participants may not be able to schedule follow-ups at the precise moment an investigator desires owing to personal issues (e.g., hospitalization, work demands, busy schedules) or practical issues (e.g., weather-related conditions interfering with testing). Let’s also say that participants get better at using the intervention material with each day of use; thus, the amount of time each participant has had access to the intervention material is important. In these instances, where there are three or more measurement occasions, growth curve models, which are a special case of multilevel models, may be a useful framework for analyzing change with longitudinal data. In contrast to approaches such as repeated-measures ANOVA, growth curve models make use of all available data from an individual, correct for unreliability of measurement, and, most importantly, emphasize each individual’s trajectory rather than average group values at each occasion (Duncan, Duncan, & Strycker, 2013). Simulation studies have found growth curve models to be statistically more powerful in detecting group differences in change than analysis of covariance (ANCOVA) models. Conceptually, these models involve estimating individual regressions of the DV over time and then, at the next level, adding predictors of the regression parameters of the individual trajectories.
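
This two-level idea can be written compactly. The notation below is one common convention for a linear growth curve model and is offered as a sketch; the symbols are assumptions introduced here for illustration.

$$
\begin{aligned}
\text{Level 1 (within person):}\quad & y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{time}_{ti} + e_{ti} \\
\text{Level 2 (between persons):}\quad & \pi_{0i} = \beta_{00} + \beta_{01} X_i + r_{0i} \\
& \pi_{1i} = \beta_{10} + \beta_{11} X_i + r_{1i}
\end{aligned}
$$

Here $y_{ti}$ is the DV for person $i$ at occasion $t$, and $\mathrm{time}_{ti}$ is the elapsed time at that occasion, which may differ across persons (e.g., 6 months for Person A and 7 months for Person B). The terms $\pi_{0i}$ and $\pi_{1i}$ are person $i$’s own intercept and rate of change, $X_i$ is a person-level predictor such as intervention condition, and $e_{ti}$, $r_{0i}$, and $r_{1i}$ are residuals. In this notation, $\beta_{11}$ expresses how the rate of change differs as a function of $X_i$.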

Growth or change trends within individuals, including polynomial trends, can be modeled over time, which is not possible in classic models that use average trends. For instance, in the Personalized Reminder Information and Social Management System (PRISM) study, an online platform was designed to decrease social isolation in older adults. If we were interested in determining the effect of the intervention from the point at which it was initiated (i.e., installation of the PRISM system in the participant’s home, or the one-on-one training session on how to use the “folder” of information for the control group), then time could be coded as a continuous variable. In this case, growth curve models could be used to estimate the average rate of change in the DV, as well as individual trajectories of the rate of change for each participant in the study. Next, variables such as residence type could be used as predictors or moderators of the intervention effect on the DV.
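
The following is a deliberately simplified, two-stage sketch of this idea: a separate least-squares line is fitted to each participant using continuous time, and the resulting slopes are then summarized overall and by residence type. It is not a full growth curve model, which would estimate both levels simultaneously and weight each individual by the precision of his or her data. The participant records, the residence labels, and the helper `ols_line` are all hypothetical and are not PRISM data.

```python
from statistics import mean


def ols_line(times, scores):
    """Least-squares intercept and slope for one participant's trajectory."""
    t_bar, y_bar = mean(times), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, scores))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


# Hypothetical records: months since the intervention began (unequally
# spaced across participants) and a social isolation score at each occasion.
participants = {
    "A": {"residence": "independent", "times": [0, 6, 12, 18],
          "scores": [20.0, 17.0, 14.0, 11.0]},
    "B": {"residence": "independent", "times": [0, 7, 13, 18],
          "scores": [22.0, 18.5, 15.5, 13.0]},
    "C": {"residence": "assisted", "times": [0, 6, 12, 19],
          "scores": [21.0, 19.5, 18.0, 16.25]},
    "D": {"residence": "assisted", "times": [0, 5, 12, 18],
          "scores": [19.0, 17.75, 16.0, 14.5]},
}

# Stage 1: one rate of change (slope per month) for each participant.
slopes = {pid: ols_line(p["times"], p["scores"])[1]
          for pid, p in participants.items()}

# Stage 2: average rate of change overall and within each residence type.
average_slope = mean(list(slopes.values()))
by_residence = {}
for pid, p in participants.items():
    by_residence.setdefault(p["residence"], []).append(slopes[pid])
mean_slope_by_residence = {k: mean(v) for k, v in by_residence.items()}

print(slopes)
print(average_slope)
print(mean_slope_by_residence)
```

Because time enters as each participant’s actual elapsed months, unequal spacing between occasions poses no difficulty for the per-person fits.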

The benefits of longitudinal designs are not without costs. One challenge of longitudinal studies is properly accounting for covariates that themselves change across measurement occasions, also known as “time-varying” covariates. For instance, researchers may use baseline measures of a health indicator as a covariate in their analyses. Activities of daily living (ADLs), which refer to the basic tasks of everyday life, such as eating, dressing, and toileting, are often used as covariates. However, older adults may show steep declines in ADLs during the study span, which in turn may impact outcomes such as social isolation and depression; a measure taken only at baseline would miss this change. It is thus necessary to make informed decisions about which variables to collect at each data collection wave.