Notes

1. For example, failures in memory can potentially be reduced by appropriate probing and time-marking (e.g. when responding to questions about affect experienced yesterday, respondents may benefit from a reminder such as “yesterday was Tuesday”), whereas failures in communication may be reduced through appropriate translation procedures and pre-testing.

  • 2. This may be particularly problematic in the context of official national household surveys, where respondents may find it difficult to understand why government might want to collect information about just one day.
  • 3. Correlations of r = 0.62 and 0.77 (both significant at the p < 0.001 level) were obtained between frequency judgements and experienced affect, whereas for intensity measures, correlations were r = 0.54 and r = 0.59 for positive affect intensity (p < 0.001); and just r = 0.34 and 0.13 for negative affect intensity (p < 0.030 and 0.23), respectively.
  • 4. A large number of response options will only lead to greater scale sensitivity if respondents actually use all of the available options.
  • 5. Respondents were asked to consider a shop or restaurant known to them and rate overall quality (extremely bad to extremely good) and a range of sub-categories, such as competence of staff, promptness of service, range of choice, etc. After completing the measures, respondents were then asked to rate the scales used on three criteria: “ease of use”, “quick to use” and “allowed you to express your feelings adequately”.
  • 6. Using a multi-trait, multi-method design, Kroh found validity estimates around 0.89 for the 11-point scale, 0.80 for the 7-point scale and 0.70 for the magnitude scale. In terms of reliability, the magnitude scale performed better, with 0.94 reliability across traits, compared to 0.83 for the 11-point scale and 0.79 for the 7-point measure. However, Kroh reports that the open-ended magnitude scale was generally more problematic, taking longer to complete and apparently reducing respondent motivation. Thus, on balance, and in particular due to the evidence on validity, Kroh recommends the use of 11-point scales.
  • 7. Responses to the question: “In general, how happy are you with your life as a whole?”
  • 8. In 1991, the scale anchors ranged from “not at all satisfied” to “completely satisfied” - which could imply a unipolar scale. In 1992, this was changed to an unambiguously bipolar format: “completely dissatisfied” to “completely satisfied”. This switch could also be partly responsible for the stronger skew in the 1992 and 1993 data: indicating complete dissatisfaction could be a much stronger statement than indicating the absence of satisfaction.
  • 9. I.e. respondents who are less motivated, more fatigued or more cognitively burdened - and who may therefore be seeking the first satisfactory answer available, rather than giving detailed consideration to every response option.
  • 10. Cues in this context refer to aspects of the survey that send signals to respondents about the information they may need to provide in their answers. For example, survey source can send a signal about the likely content of a survey, as well as the answers that might be expected.
  • 11. The transition question used was: “Now thinking about your personal life, are you satisfied with your personal life today?” Including this transition question reduced the impact of the preceding political questions from 0.6 of a rung to less than 0.1 of a rung.
  • 12. There remains a (currently unexplored) risk that opening the survey with subjective well-being questions could then produce a context effect for other self-report and particularly subjective items, especially if the drive for consistency underpins some of these response patterns. Although there are a priori grounds to expect context to have less of an impact on more domain-specific judgements (Schwarz and Strack, 2003), it will still be important to investigate whether adding subjective well-being questions to the beginning of a survey has any detectable impact on other responses.
  • 13. Schimmack and Oishi differentiate between temporarily accessible information, brought to mind as a result of study context (for example, item order), and chronically accessible information, which is available to individuals all the time and may be the default information used to generate overall life satisfaction judgments.
  • 14. This implies assimilation when Negative Affect questions are asked first, but a contrast effect when Positive Affect questions are asked first. The ONS plan to run this test again to increase the sample size as well as to investigate order effects among single-item headline measures of evaluative, eudaimonic and affective subjective well-being. This will produce some helpful insights.
  • 15. Contributing 21% to the variation in coefficient size when comparing CAPI and CASI survey methods.
  • 16. Self-administered questionnaires consistently evidenced the highest amount of variance attributable to method effects (28%, 27% and 25% respectively), whereas the variance explained by method effects was much lower in the case of both PAPI (14%, 13%, 13%) and CAPI (11%, 9% and 10%). Reliability estimates for each of these satisfaction measures were broadly similar (e.g. for the life satisfaction scale, reliability was 0.79, 0.80 and 0.80 for SAQ, PAPI and CAPI respectively), with the exception of health satisfaction (which varied from 0.79 in SAQ measures, to 0.83 in PAPI and 0.88 in CAPI). There was, however, an overall statistically significant difference in reliability between SAQ and CAPI, with CAPI producing higher estimates of reliability.
  • 17. One challenge in interpreting this result, however, is that the study employed emotion rating scales that asked respondents to indicate their current seasonal level, compared to how they generally feel. This question phrasing may have encouraged respondents to reflect on seasonal and other contrasts.
  • 18. Happiness was measured through a simple question, with only three response options: “Taking all things together, how would you say things are these days - would you say that you’re very happy, pretty happy, or not too happy these days?”
  • 19. For example, “The magnitude of the modelled effect of a change in weather circumstances from half-cloudy to completely sunny is comparable to that associated with more than a factor of ten increase in household income, more than a full-spectrum shift in perceived trust in neighbours, and nearly twice the entire benefit of being married as compared with being single” (p. 26).
  • 20. These were short-term affect measures, in which respondents were asked to rate the intensity of feelings experienced the previous day on a 0-6 scale.
  • 21. The authors found no effects of weather on positive affect, but small significant effects of temperature, wind and sunlight on negative affect - with warmer temperatures increasing, and both wind power and sunlight decreasing, negative affect. There was also a small but significant effect of sunlight on tiredness.
  • 22. In this context, a “balanced” scale is a multiple-item scale that includes an equal number of positive- and negatively-framed questions - so, for example, an affect measure that contains equal numbers of positive and negative affect items. An “unbalanced” scale contains a disproportionate number of questions framed in a positive or negative way. So, for example, an unbalanced eudaimonia scale might have a greater number of positively-framed items (such as “Most days I get a sense of accomplishment from what I do”), relative to the number of negatively-framed items (such as “When things go wrong in my life it generally takes me a long time to get back to normal”).
  • 23. Diener and Biswas-Diener’s (2009) Psychological Well-Being Scale and Huppert and So’s (2009) Flourishing Index.
  • 24. I.e. a dispositional tendency towards experiencing negative affect.
  • 25. Data were drawn from the Pew Global Attitudes Survey from 47 different nations across all continents (N = 45 239 interviews in 2007) and examined for the extent to which Likert scale extremes were used. Minkov’s resulting “polarisation” measure is highest when 50% of respondents choose the positive extreme (e.g. very good) and 50% the negative extreme (e.g. very bad).
  • 26. For example, a tendency for more extreme responding in a country where the majority of respondents score above the scale midpoint would result in a higher overall mean value because the positive extreme would be emphasised more often than the negative.
  • 27. The ambivalence index constructed by these authors is described as capturing “the degree to which the respondent sees the true and false-key items as making opposite claims” (p. 935), and is a possible proxy for differences in dialectical thinking - i.e. the ability to tolerate holding what Western cultures might regard as contradictory beliefs.
  • 28. All of these correlations were significant at the 0.01 level, two-tailed.
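
The arithmetic behind note 26 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than anything drawn from the studies cited: the 0-10 scale, the ten hypothetical responses, and the `apply_extreme_style` function, which models an extreme response style simply by doubling each response's distance from the scale midpoint and clipping at the scale ends.

```python
from statistics import mean

# Assumed 0-10 response scale (illustrative only).
SCALE_MIN, SCALE_MAX = 0, 10
MIDPOINT = (SCALE_MIN + SCALE_MAX) // 2  # 5


def apply_extreme_style(response, stretch=2):
    """Hypothetical response-style model: move a response away from the
    midpoint by a factor of `stretch`, clipped to the ends of the scale."""
    stretched = MIDPOINT + stretch * (response - MIDPOINT)
    return min(SCALE_MAX, max(SCALE_MIN, stretched))


# Hypothetical sample in which most respondents score above the midpoint.
baseline = [3, 4, 6, 6, 7, 7, 7, 8, 8, 9]

# The same underlying views, reported with a more extreme response style.
extreme = [apply_extreme_style(r) for r in baseline]

print(mean(baseline))  # 6.5
print(extreme)         # [1, 3, 7, 7, 9, 9, 9, 10, 10, 10]
print(mean(extreme))   # 7.5
```

Because eight of the ten hypothetical respondents sit above the midpoint, pushing every answer outwards moves more responses up than down, and the mean rises from 6.5 to 7.5 even though the underlying views are unchanged. In a sample distributed symmetrically around the midpoint, the same style would leave the mean unaffected.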
 