One of the most frequently cited and studied context effects concerns the impact of asking about personal relationships, such as dating frequency or marital happiness, before asking evaluative subjective well-being questions (e.g. Strack, Martin and Schwarz, 1988; Schwarz, Strack and Mai, 1991; Schuman and Presser, 1981; Smith, 1982, cited in Tourangeau, Rasinski and Bradburn, 1991). Some of these studies find correlational effects - i.e. stronger associations between responses about personal relationships and overall life satisfaction when the question about personal relationships is asked first - but not directional effects, i.e. no differences in mean scores or in the percentage of very happy respondents (e.g. Tourangeau, Rasinski and Bradburn, 1991; Schwarz, Strack and Mai, 1991). Other studies, however, have found evidence of mean score differences, such that answering questions about marital satisfaction induces higher overall happiness ratings, but produces no change in the correlation between marital and life satisfaction (Schuman and Presser, 1981; Smith, 1982).
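The distinction between directional and correlational effects can be made concrete with a minimal sketch. It assumes a split-ballot design in which each respondent gives one domain rating and one overall life rating; the function names and the ratings are hypothetical illustrations, not data from any of the studies cited.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def order_effects(specific_first, general_first):
    """Compare two question-order conditions.

    Each argument is a list of (domain_rating, life_rating) pairs.
    Returns the difference in mean life ratings (directional effect)
    and the difference in domain-life correlations (correlational effect).
    """
    d1, l1 = zip(*specific_first)
    d2, l2 = zip(*general_first)
    return {
        "directional": mean(l1) - mean(l2),
        "correlational": pearson_r(d1, l1) - pearson_r(d2, l2),
    }


# Hypothetical ratings: the mean life rating is the same in both
# conditions, but the domain-life correlation differs.
specific_first = [(2, 3), (4, 5), (6, 7), (8, 9)]
general_first = [(2, 7), (4, 5), (6, 6), (8, 6)]
effects = order_effects(specific_first, general_first)
```

With these illustrative inputs the directional component is zero while the correlational component is positive, which is the pattern described for studies that find correlational but not directional effects.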
Variability across results, both in terms of the direction of effects and their magnitude, appears to be quite persistent. For example, following the procedure used by Strack, Martin and Schwarz (1988), Pavot and Diener (1993a) found much weaker context effects among single-item life evaluation measures, and no effect of context on the multi-item Satisfaction with Life Scale. Schimmack and Oishi (2005) performed a meta-analysis of all known studies exploring the effect of item order on the relationship between an overall life satisfaction measure and a domain-specific satisfaction measure. Sixteen comparisons from eight different articles were examined. Overall, the meta-analysis indicated that item-order effects were statistically significant (z = 1.89, p < 0.02), but the average effect size was in the “weak to moderate range” (d = 0.29, r = 0.15). Like Tourangeau et al., the authors also noted that the results were extremely variable across studies, with effect sizes ranging from d = 1.83 to -0.066. Further empirical investigation by Schimmack and Oishi reaffirmed that overall item-order effects were small or non-significant, but also that it is difficult to make a priori predictions about when item-order effects will emerge. However, they did find that priming an irrelevant or unimportant domain (such as weather) produced no item order effects, and similarly, priming a highly important and chronically accessible domain (such as family) also failed to produce item order effects.
The implication of Schimmack and Oishi’s findings is that item order effects should be most likely when preceding questions concern domains of life that are relevant to an individual’s overall life satisfaction, but that are chronically accessible to the individual in only a weak way. For example, satisfaction with recreation might be relevant to overall life satisfaction, but might not be something that always springs to mind when individuals make life satisfaction judgments. Asking a question about recreation prior to one about overall life satisfaction might make recreation-related information more accessible and more salient, thus strengthening the relationship between this and overall life satisfaction. Schimmack and Oishi (2005) tested this hypothesis in relation to housing satisfaction, but failed to show significant order effects. However, this may be because of large individual differences in how important, relevant, and chronically accessible housing information is. For example, Schimmack et al. (2002) found that housing was highly relevant for some individuals and irrelevant for others.
Tourangeau et al. also speculate that some of the variability among findings may be due to the introduction given to the questions that immediately precede the satisfaction questions (e.g. questions about marital status) and to whether marital happiness/satisfaction is one of many other domains assessed alongside overall life satisfaction (because the effect may be reduced if there are several domain-specific items preceding the overall judgment).
Although the picture provided by this work is a complicated one, the available evidence on item-order effects suggests that, to ensure some consistency of results, general life evaluation questions should precede questions about specific life domains, particularly when only a small number of domains are considered. Furthermore, if demographic questions (such as marital status) are asked before evaluative subjective well-being questions, there should be some introductory text to act as a buffer. Specific instructions to respondents to include or exclude certain domains from overall life evaluations (e.g. “aside from your marriage” or “including your marriage”), however, are not recommended, as these can also influence the pattern of responding in artificial ways (because overall evaluations of life would be expected to incorporate information such as marital satisfaction).
Although most of the work on question order has focused on evaluative subjective well-being judgements, the UK Office for National Statistics (ONS, 2011b) has reported an effect of question order on multiple-item positive and negative affect questions. In a split-sample randomised trial using national data (N = 1 000), the ONS found that asking negative affect questions first produced lower scores on positive affect items - and this effect was significant (at the p < 0.05 level) for adjectives such as relaxed, calm, excited and energised. Conversely, when positive affect questions were asked first, the mean ratings for negative affect questions were generally higher - except in the case of pain - and this increase was statistically significant for the adjectives worried and bored.
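As an illustration of how a split-sample comparison of this kind might be evaluated, the sketch below applies a large-sample z test to the difference in mean ratings between two question-order groups. This is an assumed, generic procedure rather than the method the ONS used, and the function name and ratings are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def split_sample_z(group_a, group_b):
    """Large-sample z test for a difference in mean ratings.

    group_a and group_b hold the ratings given by respondents who were
    randomly assigned to different question orders. Returns the z
    statistic and a two-sided p-value from the normal approximation.
    """
    se = sqrt(variance(group_a) / len(group_a)
              + variance(group_b) / len(group_b))
    z = (mean(group_a) - mean(group_b)) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical positive-affect ratings (0-10 scale) under two orders.
positive_first = [7, 8, 6, 9, 7, 8] * 50
negative_first = [6, 7, 5, 8, 6, 7] * 50
z, p = split_sample_z(positive_first, negative_first)
```

The normal approximation is reasonable only for samples of several hundred respondents per group; with small groups a t-based test would be the more cautious choice.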
On the issue of how many subjective well-being questions to ask within a survey module, Strack, Schwarz and Wänke (1991) found that asking questions about two closely related constructs could produce distortions in the data. These authors examined the correlations between evaluative life satisfaction and happiness questions administered: i) in two separate and seemingly unrelated questionnaires; and ii) concurrently in the same questionnaire, with a joint lead-in that read, “Now, we have two questions about your life”. The correlation between the measures dropped significantly from r = 0.96 in condition i) to r = 0.75 in condition ii). Strack et al. infer that respondents in condition ii) were more likely to provide different answers to the two questions because they were applying the conversational principle of non-redundancy. Specifically, respondents may assume that two similar questions asked in the same survey must require different responses because asking the same question twice would be redundant.
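A drop in correlation between two conditions can be tested with Fisher's z transformation, sketched below on the assumption that the two conditions involve different respondents. The sample size of 50 per condition is a hypothetical placeholder, since the group sizes are not reported here, and the function name is illustrative.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations.

    r1, r2 are Pearson correlations from separate groups of size n1, n2.
    Returns the z statistic and a two-sided p-value.
    """
    z1, z2 = atanh(r1), atanh(r2)           # Fisher transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Correlations as reported above; n = 50 per condition is an assumption.
z_stat, p_value = compare_independent_correlations(0.96, 50, 0.75, 50)
```

Because the standard error depends on the group sizes, the same pair of correlations may or may not differ significantly in smaller samples.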