Over the last 15 years, the question of how to interpret change in patient-reported outcomes has received considerable attention. Interpretability refers to the clinical significance of increases or decreases on a particular scale or measure over time. For instance, if I score a 30 on the Beck Depression Inventory (BDI-II), we know that I have scored in the middle of the scale, since the BDI-II has 63 total points. But imagine that two months later I score 42. What does this 12-point increase mean from a clinical point of view? Should my drug regimen change? If so, how should it change? PROMs that have been developed using classical test theory (CTT) provide only ordinal-level information, i.e., we know that someone who scores 42 is more depressed than someone who scores 30, but we do not know the degree of that difference. PROMs are thus difficult to interpret.
This difficulty has led to the development of methods to enhance their interpretability. One popular method is the identification of a minimal important difference (MID). An MID is the smallest change in respondent scores that represents clinical, as opposed to merely statistical, significance and that would, ceteris paribus, warrant a change in a patient’s care (Jaeschke et al. 1989). One popular method for determining a measure’s MID is to map changes in respondent outcomes onto some kind of control. Such methods are referred to as “anchor-based” approaches. The idea is to determine the minimal amount of change that is noticeable to patients and to use this unit of change as the MID. The method asks the control group of patients to rate the extent of their symptom change over the course of an illness or intervention on a transition rating index (TRI). TRIs are standardized questionnaires that ask patients questions such as “Do you have more or less pain since your first radiotherapy treatment?” Typically patients are given seven possible answers ranging from “no change” to “a great deal better” (Fayers and Machin 2007). Those who indicate minimal change, i.e., those who rate themselves as just “a little better” than before the intervention, become the patient control group. The mean change score of this group is used as the MID for the PROM.
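The anchor-based calculation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are hypothetical values invented for the example, not data from any study, and the function name `anchor_based_mid` is my own. I also assume a scale such as the BDI-II, on which higher scores indicate more severe symptoms, so that improvement appears as a negative change score.

```python
from statistics import mean

# Hypothetical records, invented for illustration only:
# (baseline PROM score, follow-up PROM score, answer to the transition question)
records = [
    (30, 27, "a little better"),
    (28, 24, "a little better"),
    (35, 30, "a little better"),
    (40, 25, "a great deal better"),
    (22, 22, "no change"),
    (31, 30, "no change"),
]


def anchor_based_mid(records, anchor="a little better"):
    """Return the mean change score of respondents whose transition answer equals the anchor."""
    # Change score = follow-up minus baseline; negative means fewer symptoms
    # on a scale where higher scores indicate more severe symptoms.
    changes = [
        follow_up - baseline
        for baseline, follow_up, answer in records
        if answer == anchor
    ]
    if not changes:
        raise ValueError(f"no respondents gave the anchor answer {anchor!r}")
    return mean(changes)


# The "a little better" group serves as the patient control group.
mid = anchor_based_mid(records)
print(f"Estimated MID (mean change in the anchor group): {mid}")
```

Note that the calculation treats everyone who answers “a little better” as interchangeable: their change scores are simply averaged, whatever each respondent meant by that answer.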
This approach of acquiring an MID via a patient control group assumes that respondents who rate their symptom change as “a little better” on a transition question should ceteris paribus also have comparable change scores on the PROM. Put differently, similarities in respondent answers to transition questions ought to underwrite similarities in respondents’ magnitude of change over the course of an intervention or illness. But qualitative data from interviews with patients suggest that this assumption is ill-founded (Taminiau-Bloem et al. 2011; Wyrwich and Tardino 2006). Whether one understands the magnitude of change over the course of an illness as large or small is a matter of interpretation. As I have argued elsewhere, respondents’ answers to TRIs ought to be understood against the background of what makes for a good quality of life, e.g., the magnitude of change to which the answer “a little better” refers depends heavily on the significance that, say, worry has within the respondent’s vision of the good (McClimans 2011). Thus, it is possible to have an outcome that indicates a large magnitude of change and to interpret this change as minimal.
Consider Cynthia Chauhan, a patient advocate during the deliberations on the FDA guidelines for the use of PROMs in labeling claims. In response to the deliberations, Chauhan cautioned those present, “...not to lose the whole person in your quest to give patient-reported outcomes free-standing autonomy.” (Chauhan 2007). To make her point, she discussed the side effects of a drug called bimatoprost, which she uses to forestall blindness from glaucoma. One of the side effects of bimatoprost is to turn blue eyes brown. Chauhan has “sapphire blue” eyes, in which, she says, she has taken some pride. As she speaks of her decision to take the drug despite its consequences, she notes that doing so will affect her identity in that she will soon no longer be the sort of person she has always enjoyed being, i.e., she will no longer have blue eyes. Moreover, she points out that although the meaning that taking this drug has for her is not quantified on any outcome measure, it nonetheless affects her quality of life (Chauhan 2007).
We can imagine that, even if the bimatoprost is only minimally successful and Chauhan’s resulting change score from the PROM is low, she will nonetheless have experienced a significant change—she will not be the same person she was before. But this significance is tied to the place that her blue eyes had in her understanding of herself and what she took to be a good life; ceteris paribus we would not expect a brown-eyed person to summarize their experience in the same way. Thus, it would not be surprising if Chauhan’s answer to the transition question was “quite a bit,” while the magnitude of her change score was minimal.
I suggest that examples such as this illustrate that our understanding of clinical significance ought to be closely linked to our understanding of the construct, given the cohort of respondents at whom the measure is targeted. To put this point slightly differently, understanding change in PROMs requires that researchers have a grip on what quality of life or perceived health status means in the context of a particular PROM and the population it serves. In other words, we need a theory of the construct that the PROM aims to measure, i.e., a collection of sentences, propositions, statements, or beliefs and their logical consequences, and these can include statistical and general laws.