Taken as a whole, test-retest scores for measures of subjective well-being are lower than those for commonly collected demographic and labour market statistics such as education and income. These variables generally show test-retest scores in the region of 0.9 (Krueger and Schkade, 2008), although one would of course expect education and income to vary less over very short time periods. However, the scores for measures of subjective well-being are higher than those found for some more cognitively challenging economic concepts and for variables that one would expect to change more over time. For example, an analysis of expenditure information found test-retest values of 0.6 over a period of one week (Carrin et al., 2009).
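As an illustration of what a test-retest score summarises, the sketch below computes the Pearson correlation between two administrations of the same question to the same respondents. It assumes that the test-retest score is reported as a simple correlation coefficient; the function name `retest_correlation` and the response values are hypothetical and are not taken from any of the studies cited.

```python
from math import sqrt


def retest_correlation(wave1, wave2):
    """Pearson correlation between two administrations of the same question."""
    if len(wave1) != len(wave2) or len(wave1) < 2:
        raise ValueError("need two equally long lists with at least two respondents")
    n = len(wave1)
    mean1 = sum(wave1) / n
    mean2 = sum(wave2) / n
    # Cross-products and sums of squares of deviations from each wave's mean
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(wave1, wave2))
    ss1 = sum((a - mean1) ** 2 for a in wave1)
    ss2 = sum((b - mean2) ** 2 for b in wave2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("correlation is undefined when a wave has no variation")
    return cov / sqrt(ss1 * ss2)


# Made-up 0-10 life satisfaction answers from eight hypothetical respondents,
# interviewed twice. These are illustrative values, not survey data.
first_interview = [7, 8, 5, 9, 6, 4, 8, 7]
second_interview = [6, 8, 6, 9, 5, 5, 7, 8]

r = retest_correlation(first_interview, second_interview)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 indicates that respondents are ranked in much the same way on both occasions; lower values reflect some mixture of measurement error and genuine change between interviews.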
In general, a value of 0.7 is considered the cut-off for an acceptable level of internal consistency reliability, that is, reliability assessed by comparing results across different measures of the same construct (Nunnally and Bernstein, 1994; Kline, 2000). By this criterion, the more reliable multi-item measures of subjective well-being, such as the Satisfaction with Life Scale, exhibit high reliability, although they are not as reliable as demographic statistics or educational status. The case for single-item measures is more ambiguous, although the analysis of Lucas and Donnellan, which has the best measures and largest dataset of any of the studies considered here, suggests that single-item measures of life satisfaction also have an acceptable degree of reliability. At the level of country averages, the reliability of life satisfaction measures is generally well above the threshold for acceptable reliability.
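For a multi-item scale, internal consistency is commonly summarised with Cronbach's alpha, which can then be compared with the 0.7 cut-off. The sketch below is a minimal illustration of that calculation, assuming alpha is the statistic of interest; the helper names, the constant `ACCEPTABLE_CUTOFF` and the response values for a hypothetical five-item scale are all invented for the example and are not drawn from the studies cited.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents
    item_variances = [
        sample_variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Variance of each respondent's total score
    total_variance = sample_variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


ACCEPTABLE_CUTOFF = 0.7  # threshold discussed in the text

# Made-up answers (1-7) from six hypothetical respondents to five items.
scale_responses = [
    [6, 6, 5, 6, 5],
    [3, 4, 3, 3, 2],
    [5, 5, 6, 5, 4],
    [7, 6, 7, 6, 6],
    [2, 3, 2, 3, 2],
    [4, 5, 4, 4, 3],
]

alpha = cronbach_alpha(scale_responses)
print(f"alpha = {alpha:.2f}; acceptable: {alpha >= ACCEPTABLE_CUTOFF}")
```

Alpha rises when the items move together across respondents, which is why a well-constructed multi-item scale can clear the cut-off more easily than a single question can be shown to.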
Measures of affect would be expected to have lower reliability than evaluative measures, simply because moods change more frequently. The available evidence is generally consistent with this expectation, while still suggesting that affect measures are reliable enough for use. There is less evidence on measures of eudaimonia. Although the Diener/Wirtz Psychological Well-being Scale performs relatively well, this result cannot necessarily be generalised to other measures of eudaimonic well-being, and further research is warranted in this area.