Realist evaluation


Realist evaluation is a form of theory-driven evaluation, but is set apart by its explicit philosophical underpinnings.

Pawson and Tilley (1997) developed the first realist evaluation approach, although other interpretations have been developed since. Pawson and Tilley argued that in order to be useful for decision makers, evaluations need to identify ‘what works in which circumstances and for whom?’, rather than merely ‘does it work?’.

The complete realist question is: “What works, for whom, in what respects, to what extent, in what contexts, and how?”. In order to answer that question, realist evaluators aim to identify the underlying generative mechanisms that explain ‘how’ the outcomes were caused and the influence of context. 

The realist understanding of how programmes work

Realist philosophy (Pawson and Tilley use the term ‘scientific realism’) considers that an intervention works (or not) because actors make particular decisions in response to the intervention (or not). The ‘reasoning’ of the actors in response to the resources or opportunities provided by the intervention is what causes the outcomes.

Strictly speaking, the term ‘generative mechanism’ refers to the underlying social or psychological drivers that ‘cause’ the reasoning of actors. For example, a parenting skills programme may have achieved different outcomes for fathers and mothers. The mechanism generating different ‘reasoning’ by mothers and fathers may relate to dominant social norms about the roles and responsibilities of mothers and fathers. Additional mechanisms may be situated in psychological, social or other spheres.

Context matters: firstly, it influences ‘reasoning’ and, secondly, generative mechanisms can only work if the circumstances are right. Going back to our example, there may be different social beliefs about the roles and responsibilities of mothers and fathers in different cultures, which may affect how parents respond to the parenting programme. Whether parents can put their new learning into practice will depend on a range of factors – perhaps the time they have available, their own beliefs about parenting, or their mental health. Finally, the context may provide alternative explanations of the observed outcomes, and these need to be taken into account during the analysis.

Undertaking a realist evaluation

Developing the initial programme theory

Realist evaluation starts with theory and ends with theory. In other words, the purpose of a realist evaluation is as much to test and refine the programme theory as it is to determine whether and how the programme worked in a particular setting.

The programme theory describes how the intervention is expected to lead to its effects and in which conditions it should do so. The initial programme theory may be based on previous research, knowledge, experience, and the assumptions of the intervention designers about how the intervention will work. The difference between realist and other kinds of programme theory-based evaluation approaches is that a realist programme theory specifies what mechanisms will generate the outcomes and what features of the context will affect whether or not those mechanisms operate. Ideally, these elements (mechanisms, outcomes, context) are made explicit at the evaluation design stage, as this allows the data collection to be designed to focus on testing the different elements of the programme theory.

Choosing the evaluation methods

Realist evaluation is method-neutral (i.e., it does not impose the use of particular methods).

As with any evaluation, the choice of data collection and analysis methods and tools should be guided by the types of data that are needed to answer the evaluation questions, or more specifically, to test the initial programme theory in all its dimensions.

Usually, both quantitative and qualitative data are collected in a realist evaluation, often with quantitative data being focused on context and outcomes and qualitative data on generative mechanisms. Because the realist analysis uses mainly intra-programme comparisons (i.e., comparisons across different groups involved in the same programme) to test the initial theory, a realist evaluation design does not need to construct comparison groups. Rather, the refined programme theory is tested in a different context in a subsequent study. Often the case study design is used, whereby case selection is typically purposive, as the cases should enable ‘testing’ of the initial programme theory in all its dimensions.

Using a realist data analysis approach

Realist data analysis is driven by the principles of realism: realist evaluation explains change brought about by an intervention by referring to the actors who act and change (or not) a situation under specific conditions and under the influence of external events (including the intervention itself). The actors and the interventions are considered to be embedded in a social reality that influences how the intervention is implemented and how actors respond to it (or not). The context-mechanism-outcome (CMO) configuration is used as the main structure for realist analysis.  

In the first phase of analysis, data are organised in relation to the initial programme theory – that is, whether the data relate to what was done (the intervention activities) or to context, mechanism, outcome and (groups of) actors. Qualitative data are coded and appropriate methods for analysing quantitative data applied. The data on outcomes are disaggregated by sub-groups (which were selected on the basis of the programme theory).
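As a rough illustration of the disaggregation step, the sketch below groups outcome scores by sub-group and summarises each group. The sub-group labels follow the parenting programme example above; the scores, the `records` layout and the `disaggregate` helper are hypothetical and invented purely for illustration, not drawn from any study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (sub_group, outcome_score).
# The values are invented for illustration only.
records = [
    ("mothers", 4.0),
    ("mothers", 5.0),
    ("fathers", 2.0),
    ("fathers", 3.0),
    ("fathers", 4.0),
]


def disaggregate(records):
    """Group outcome scores by sub-group and summarise each group."""
    groups = defaultdict(list)
    for sub_group, score in records:
        groups[sub_group].append(score)
    # Report group size and mean outcome for each sub-group.
    return {
        group: {"n": len(scores), "mean": mean(scores)}
        for group, scores in groups.items()
    }


summary = disaggregate(records)
for group, stats in summary.items():
    print(group, stats)
```

In a real analysis the sub-groups would be those named in the programme theory, and a difference in outcomes between them is a pattern to be explained, not an explanation in itself.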

Once patterns of outcomes are identified, the mechanisms generating those outcomes can be analysed, provided the right kinds of data are available. The contexts in which particular mechanisms did or did not ‘fire’ can then be determined. Contexts may relate to the sub-groups for whom outcomes were generated and/or to other stakeholders, processes of implementation, organisational, socio-economic, cultural and political conditions.

The analytic process is not necessarily sequential, but should result in a set of ‘context-mechanism-outcome’ (CMO) statements: “In this context, that particular mechanism fired for these actors, generating those outcomes. In that context, this other mechanism fired, generating these different outcomes.”
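One possible way to keep CMO statements in a consistent form during analysis is to record each configuration as a small structured record. The sketch below is only an illustration: the `CMO` class, its field names and the `statement` method are assumptions made here, not part of any established realist toolkit. The example values paraphrase the first CMO of the Ghana hospital study described later in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMO:
    """One context-mechanism-outcome configuration (hypothetical structure)."""
    context: str
    mechanism: str
    actors: str
    outcome: str

    def statement(self) -> str:
        # Render the configuration in the sentence form used for CMO statements.
        return (
            f"In the context of {self.context}, the mechanism of "
            f"{self.mechanism} fired for {self.actors}, "
            f"generating {self.outcome}."
        )


# Values paraphrased from the first CMO of the Ghana hospital example.
cmo = CMO(
    context="limited margins of freedom regarding recruitment, salary scales, "
            "promotion and firing",
    mechanism="reciprocity",
    actors="staff who perceive high levels of management support",
    outcome="extra role behaviours such as working late",
)

print(cmo.statement())
```

Recording configurations this way makes it easier to line up competing CMOs side by side when judging which offers the most plausible explanation, although that judgement remains an analytic one.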

The last phase of the analysis consists of determining which CMO configuration(s) offer the most robust and plausible explanation of the observed pattern of outcomes. The resulting CMO configuration is then compared with the initial programme theory, which is modified (or not) in light of the evaluation findings.

Using the findings from realist evaluation

Both generative mechanisms and programme theories can be considered at different levels of abstraction, from very specific (particular individuals within particular programmes) to quite abstract (across different kinds of programmes). Pawson and Tilley argued that ‘middle range theories’* (MRT) are most useful. MRTs are specific enough to generate particular propositions to test and general enough to apply across different situations. Typically, MRTs develop over time based on the accumulation of insights acquired through a series of studies allowing gradual specification of the realist findings. All kinds of theory in realist evaluation – programme theory, theories about particular mechanisms, CMOs, and formal theory – are most useful if developed at a middle level of abstraction.

Because realist evaluation uses the idea of generative causality (i.e. mechanisms only fire when the context is conducive), realists are modest in their claims, stating that an evaluation cannot produce universally applicable findings. At best, evaluation can make sense of the complex processes underlying programmes by formulating plausible explanations ex-post. It can indicate the conditions in which the intervention works (or not) and how it does so. This realistic specification allows decision makers to assess whether interventions that proved successful in one setting may be so in another setting, and assists programme planners in adapting interventions to suit specific contexts.

*A middle range theory is understood as “theory that lies between the minor but necessary working hypotheses …and the all-inclusive systematic efforts to develop a unified theory that will explain all the observed uniformities of social behavior, social organization and social change” [Merton, R.K. (1968). Social theory and social structure. New York: The Free Press, p. 39]. In essence, “middle range” refers to the degree of abstraction and can refer to programme theory, generative mechanisms, CMOs, formal theories etc.


Hospital management practices in Ghana

We present an example of a realist study of existing practices within the health care system in Ghana; it is not an evaluation of an intervention per se. Hence, we refer to the middle range theory (MRT) rather than the programme theory.

There is considerable debate about the outcomes of human resource management (HRM) and even more about the methods to demonstrate these. In general, proximal outcomes can be described in three categories: (1) improved staff availability; (2) improved staff attitudes and affects (commitment, job satisfaction); and (3) better staff behaviour (higher task performance and organisational citizenship behaviour, lower absenteeism).

In a series of case studies of regional and district hospitals in Ghana, we assessed the role of hospital management practices in organisational performance. The MRT was developed through: an exploratory visit to the central regional hospital during which staff were interviewed; a literature review of human resource management and hospital performance, which led to the concept of ‘high commitment management’; and a review of previous research on organisational culture.

We formulated the initial MRT as follows:

“Hospital managers of well-performing hospitals deploy organisational structures that allow decentralisation and self-managed teams and stimulate delegation of decision-making, good flows of information and transparency. Their HRM bundles combine employment security, adequate compensation and training. This results in strong organisational commitment and trust. Conditions include competent leaders with an explicit vision, relatively large decision-making spaces and adequate resources.”

We initially selected ‘organisational commitment’ and ‘trust’ as proximal outcomes of human resource management, because our initial literature review indicated that these outputs are often found to explain the effect of high commitment management. Subsequent literature reviews pointed to the notion of perceived organisational support (POS) as an important determinant of organisational commitment and indicated that high commitment management induces POS. POS refers to the perception of operational staff that they are effectively supported and recognised by the organisation. This provided a potential explanation in the form of a causal pathway between the management practices and organisational commitment.

The study started with a case study of a well-performing hospital, where the management style and its effects were assessed on the basis of the initial middle range theory. Interviews with management team members and operational staff combined with quantitative data analysis allowed us to identify two distinct sets of practices according to their key mechanism:

The first CMO can be summarised as follows: a hospital management team can attain higher organisational commitment if their management practices act upon economic exchange and social exchange. When the staff perceive high levels of management support or perceived organisational support (intervention), they will develop extra role behaviours, such as working late, organisational citizenship behaviours, etc. (outcomes) on the basis of reciprocity (mechanism) – even in hospitals with limited margins of freedom regarding recruitment, salary scales, promotion and firing (context).

The second CMO configuration can be summarised as ‘keeping up standards of excellence through organisational culture’. Building upon the ‘human capital’ within their staff and in an organisational context of relatively good resource availability and well-trained professionals (context), the managers developed a comprehensive set of both ‘hard’ and ‘soft’ management practices that responded to the needs of the staff and enabled them to maintain their professional standards (intervention). This contributed to a strong positive organisational culture (immediate outcome) that contributed to and maintained the performance standards at a high level (intermediate outcome) and to organisational performance (distal outcome).

In the next round of case studies, we tested the refined MRT in other hospitals. These case studies confirmed the importance of perceived organisational support (POS), but also indicated that perceived supervisor support (PSS) is an important determinant that could act as a substitute for POS and that equally triggers reciprocal behaviour. PSS was found to be operating in the case of non-professional cadres like maintenance and administrative personnel. For more details, we refer to Marchal et al. (2010a) and Marchal et al. (2010b).


Advice for CHOOSING this approach (tips and traps)

A realist evaluation design is well suited to assess how interventions in complex situations work because it allows the evaluator to deconstruct the causal web of conditions underlying such interventions.

A realist evaluation yields information that indicates how the intervention works (i.e., generative mechanism) and the conditions that are needed for a particular mechanism to work (i.e., specification of contexts) and, thus, it is likely to be more useful to policymakers than other types of evaluation.

As with any evaluation, the scope of the realist evaluation needs to be set within the boundaries of available time and resources. Using a realist approach to evaluation is not necessarily more resource- or time-intensive than other theory-based evaluations, but it can be more expensive than a simple pre-post evaluation design.

Advice for USING this approach (tips and traps)

Larger scale or more complicated realist evaluations are ideally carried out by interdisciplinary teams, as this usually allows for a broader consideration of likely mechanisms. However, realist evaluation can also be undertaken by single practitioners and in small-scale evaluations.

If the programme theory/MRT is made explicit together with the main actors, it can lead to a better, shared understanding of the intervention. This in turn could improve ownership and lead to more context-appropriate interventions.

Developing the causal theory may also contribute to a better definition of what needs to be evaluated and, thus, what the key evaluation questions are.

Allow sufficient time for assessing the interactions between intervention, actors and context.


Image source: Mobius Transform of a regular circle packing, by fdecomite on Flickr



