Philosophy of science: generalisability and evidence in context

The philosophy of science provides a final perspective to help develop a conceptualisation of appropriate evidence. Some authors in this discipline have also identified that technical language on ‘hierarchies of evidence’ can obscure the political nature of policymaking (cf. Goldenberg 2006), but there is a particularly strong strand of work in this field that addresses the concepts of causality and generalisability in evidence production as well. This captures the points noted in Chapter 2 in particular about the need to distinguish between the internal and external validity of evaluation studies - i.e. showing that an intervention worked in one place does not necessarily mean that it works always and everywhere. As Cartwright explains: ‘For policy and practice we do not need to know “it works somewhere”. We need evidence for “it-will-work-for-us”’ (2011, p. 1401).

Figure 6.2 Evidence may be constructed in ways more or less useful for policy goals.

It was further explained in Chapter 2 that the generalisability assumed with most clinical trials arises from pre-existing knowledge about shared features of human biochemistry or anatomy, which provide the mechanisms through which clinical interventions produce a result. Yet the nature of the social world can be quite different, with interventions often working through alternative mechanisms in differing contexts (Cartwright 2011; Cartwright and Hardie 2012; Worrall 2010). This can be a particular challenge for the application of the method of meta-analysis to social concerns. Meta-analyses typically combine findings from multiple experiments to use a larger sample size than any one trial in order to have greater certainty of effect. It is a key tool promoted by the EBP movement, with a graphic presenting the results of a meta-analysis (on the use of corticosteroids in women about to give birth prematurely) even serving as the logo of the Cochrane Collaboration, the highly regarded global initiative that reviews evidence to guide clinical practice.

Yet systematically reviewing experimental trials and combining their results into a single point estimate typically relies on the assumption that the identical mechanism of effect exists across trials. Corticosteroids work in a pregnant woman through the same mechanisms in Tampa as in Timbuktu. Yet this is not necessarily the case for many social interventions. Chapter 2 illustrated this with the example of Robert Martinson’s work that reviewed all published English-language reports on prisoner rehabilitation from a 20-year period and found ‘no clear pattern to indicate the efficacy of any particular method of treatment’ (1974, p. 49). Over two decades later, Pawson and Tilley (1997) noted that this was the most cited paper in the history of evaluation research and was widely interpreted as a conclusion that ‘nothing works’ for prison reform. Yet they explain that this study created ‘an impossibly stringent criterion for “success”’ (1997, p. 9), requiring studies to show impact across all included populations. Instead, they argue that it is more important to consider what works for whom in which circumstances, noting that the more useful conclusion to draw from Martinson’s review is that ‘most things have been found sometimes to work’ (1997, p. 10).
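The dependence of a pooled estimate on a shared mechanism of effect can be sketched numerically. The short Python example below is only an illustration under stated assumptions: it uses standard fixed-effect (inverse-variance) pooling together with Cochran's Q and the I² statistic, and every effect size and standard error in it is hypothetical, invented for the sketch rather than drawn from Martinson's review or any other study cited here. The function name `pool_fixed_effect` is likewise an illustrative choice.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Fixed-effect (inverse-variance) pooling of per-study effect estimates.

    Returns the pooled estimate, its standard error, Cochran's Q and I^2.
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations of each study from the pooled value.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of variation attributed to heterogeneity rather than chance.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i_squared


# HYPOTHETICAL data: three trials in which the same mechanism operates everywhere.
shared_mechanism = pool_fixed_effect([-0.30, -0.28, -0.32], [0.10, 0.10, 0.10])

# HYPOTHETICAL data: three trials in which the intervention helps in one
# context, does nothing in another and harms in a third.
mixed_mechanisms = pool_fixed_effect([-0.60, 0.00, 0.60], [0.10, 0.10, 0.10])

for label, result in [("shared", shared_mechanism), ("mixed", mixed_mechanisms)]:
    pooled, pooled_se, q, i_squared = result
    print(f"{label}: pooled={pooled:.2f} se={pooled_se:.3f} Q={q:.2f} I2={i_squared:.0%}")
```

In the first hypothetical set the pooled value summarises the individual trials well. In the second, the pooled value sits near zero with a small standard error, which could be misread as ‘nothing works’, even though the high I² signals that the trials are not estimating one common effect.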

We can return to the case of HIV prevention described above to see another example of this. Just as it was noted how the construction of evidence may be important to guide HIV prevention efforts, we can also recognise that the socially determined nature of many HIV risk practices means that many interventions might only work sometimes, requiring direct consideration of the mechanisms by which interventions have their effect in different contexts. A useful illustration of this comes in the form of recent experimental trials evaluating the provision of cash transfers to prevent HIV in African settings. Johnston has argued that cash transfers are a current fashion in Africa, ‘liked by almost everyone, seemingly effective and potentially cheap’ (2015, p. 409), yet she explains how evaluation of these programmes has shown particularly mixed results for HIV prevention. In some interventions, cash transfers resulted in lower HIV incidence (fewer new infections) compared to control groups; in others, they resulted in lower incidence for some but not all intervention sub-groups; and in yet others, there was no significant difference for any groups (Johnston 2015). No doubt one of the main reasons for this is the simple fact that money will be used in different ways by people in different settings, which will only occasionally affect their HIV risk behaviours.

So, for instance, if women are reliant on selling sex to make ends meet, a cash transfer could conceivably reduce this high-risk practice. Yet in situations where having access to cash alternatively leads to broader social and sexual networking, this might inadvertently increase risk (Parkhurst 2010). Indeed, if a cash transfer was given to a population group known to routinely pay money for sex (such as travelling businessmen in some contexts), this could have the opposite effect to that intended on HIV risk behaviour.

A meta-analysis of cash transfers for HIV prevention might seem a good idea, but any point estimate of impact found by combining the included studies would be of little use. Systematically reviewing the literature may also be useful, but the lessons to learn from such a review would not come from assuming that the mechanism of effect is the same in all people. It is an oversimplified and erroneous question to ask ‘do cash transfers work?’. Instead, reviews would need to look within studies to explore how the intervention was delivered and how it brought about an effect. Methods such as ‘process evaluation’ have grown in use to help investigate some of these elements (in particular, looking at which participants received particular components of interventions; Saunders, Evans and Joshi 2005) and alternatives such as ‘realist evaluation’ methods have developed to study mechanisms of intervention effect in different social contexts (Kazi 2003; Pawson and Tilley 1997). The evidence required for these sorts of questions does not necessarily make up the core features of experimental trials; rather, it supplements impact evaluations with additional methods such as in-depth qualitative or ethnographic analyses, local surveys or quantitative sub-group analyses to generate evidence about mechanisms of effect. Such forms of evidence often rank low in existing hierarchies, but they may be particularly appropriate to the central policy question of: ‘Will it work here?’

Returning to our simple graphics, Figure 6.3 illustrates how bodies of evidence may be more or less relevant to the context addressed by the policy decision. In particular, the figure shows that there can be much evidence ranking highly on hierarchies in terms of quality that may not be applicable locally and, as such, may not be appropriate for the given policy needs.
