Impact evaluation, like many areas of evaluation, is under-researched. Doing systematic research about evaluation takes considerable resources, and is often constrained by the availability of information about evaluation practice. Much of the work undertaken in evaluation is not readily visible (see the recent comments by Drew Cameron on an earlier blog post which provide details about the considerable effort involved in a study of impact evaluations in development).
Therefore we don’t have a good collective understanding of the state of practice, or a body of knowledge about when particular methods, designs, approaches and tools are most appropriate and effective. Recommendations for evaluation practice are rarely based on evidence about when these recommendations would be useful and appropriate - and when they would not be.
As we engage in a range of events across the International Year of Evaluation we would like to ask if we can use the momentum to start building both a formal agenda and cumulative knowledge about doing impact evaluation better. Could this be part of the discussions about a global evaluation agenda (being discussed in a series of events in EvalYear, and through the European Evaluation Society's website)?
Some of the questions to ask are:
- What would be important areas to research?
- How can we set priorities among these rather than just creating a long wish list?
- How do we take the work forward, collectively, in a manner that makes a real difference?
In 2013, we were fortunate to be involved in a symposium held by the Centre for Development Impact at the Institute of Development Studies (IDS) around the theme ‘Impact Innovation and Learning: Towards a Research and Practice Agenda for the Future’. Proceedings from this event have now been published and made available as two special issues of the IDS Bulletin: ‘Rethinking Impact Evaluation for Development’ and ‘Towards Systemic Approaches to Evaluation and Impact’. Together, the papers discuss how the new purpose of development cooperation might change the way we approach evaluation, and what sorts of innovation might be needed to meet the new challenges. For example, what can be done to better understand what works in development? In particular, how can we address causal inference when we expect single interventions to have less and less direct influence but the call for accountability continues to increase with tighter aid budgets and more pronounced time pressures?
Our paper ‘Developing a research agenda for impact evaluation in development’ discusses why it would be useful to have a research agenda and some suggestions about the process that would be needed to develop it in a formal and collaborative way. We set out a framework for organising possible research topics and research questions, and discuss methods for conducting this research and processes for developing the research agenda.
What needs to be researched?
The way we think about development interventions, their impact, and impact evaluation has important implications for what a research agenda might consider. In an earlier blog in January, we argued for using a broad definition of both impact and impact evaluation. That is, impact evaluation should include all evaluations that address impact, not only those using particular designs or methods; and impacts should include indirect and unintended impacts as well as direct and intended impacts.
Within this broad scope, research on impact evaluation, then, needs to look at:
(1) the enabling environment - policies, guidelines, guidance, formal and informal requirements and resources for impact evaluation
(2) practice - how impact evaluation is actually undertaken – including not only methods and designs for causal inference and measurement, but also methods and processes for the wider evaluation effort, including management structures, processes for identifying and engaging primary intended users, methods for deciding on the values to be used (criteria, standards and synthesis), and strategies to support use of findings. In fact, methods and processes along the range of tasks described in the BetterEvaluation Rainbow Framework.
(3) products - the reports and other artefacts produced by impact evaluations, including innovative approaches such as video reports and infographics
(4) the impacts of impact evaluation – both intended uses and other impacts, including negative ones. This needs to draw on the extensive research on evaluation use and influence (for example, Mel Mark and Gary Henry’s paper on mechanisms of evaluation influence) and also pay attention to possible negative impacts such as loss of trust, the damage done to communities through intrusive, time-consuming data extraction, goal displacement and data corruption.
Some research will focus on only one of these aspects, but particularly useful research would link the various aspects, building evidence that could be used to develop contingent recommendations about the types of enabling environment, practices and products that are likely to produce beneficial impacts, and how to achieve them.
Different types of research are needed:
- Descriptive research to document what is being done, developing typologies and identifying patterns. For example, how is ‘impact evaluation’ defined in official guidelines? What are the strategies used to engage intended beneficiaries in the evaluation process?
- Causal research to identify the factors that produce these patterns. For example, what are the formal and informal barriers and enablers for conducting and using impact evaluation? In what ways and to what extent does the involvement of intended users in the evaluation process increase actual use of the findings?
- Evaluative research to compare the actual performance to explicit standards of performance. For example, how accurately do evaluation reports summarise findings and make defensible conclusions? What is the relative value-for-money of different approaches to impact evaluation?
How should we research impact evaluation?
There are problems with two commonly used methods to research impact evaluation: syntheses of journal articles; and surveys of evaluators and others involved in evaluation.
Published journal articles represent a tiny fraction of evaluation practice. Much evaluation practice is not documented, or not made publicly available, or not published in academic journals. Although journals try hard to engage evaluation practitioners and managers, articles are still disproportionately written by academics and graduate students, who have greater incentives and capacity to do so, whose experience is more likely to reflect an external evaluator perspective, and whose work is more likely to involve smaller-scale evaluations. There are also geographic and linguistic biases in the types of evaluation practice that are visible in published journal articles.
Surveys have the usual problems with the representativeness of the sample, response rate and the ability to capture the rich contextual information needed to make sense of practice, products, and impacts. Some surveys about evaluation have had both very low response rates (less than 3% in one case) and a demonstrably different sample profile from the population of organisations being researched.
Instead, research into impact evaluation needs to include a range of approaches including:
- Documenting conference presentations and discussions – these can provide rich and varied descriptions of practice, especially where formats are used that encourage disclosure of gaps and dilemmas. Our conferences need to make better use of formats such as roundtables and think tanks to gather rich documentation about the messy details of real impact evaluation practice.
- Simulations and experiments – where these can be made adequately realistic. It is easier, for example, to test comprehension of main messages from impact evaluation reports than to test the influence of those reports on decisions, since simulations cannot reproduce the level of responsibility and consequence found in real life. A recent example explored people’s response to a policy brief.
- Systematic, rich case studies – these seem likely to provide the most useful evidence about the enabling environment, practice, products and impact – and the linkages between them. These could be written retrospectively (as in the writeshop cases developed as part of the BetterEvaluation project), concurrently with existing evaluations, or concurrently during trials of new methods or of the translation of methods to new settings (as in the Methods Lab research project on impact evaluation). Comparative case studies would be particularly useful (BetterEvaluation's new brief for UNICEF on comparative case studies, written by Delwyn Goodrick, provides guidance on this).
- Annotated examples – these could be short narratives about impact evaluation that are crowd-sourced to provide examples which are both illustrative and can be used to build and test theories about how evaluation works in practice, especially if done iteratively, searching for disconfirming evidence.
This mix of research methods would provide a better view of current and potential evaluation practice.
Here are some examples of how we might use these methods to research particular aspects of impact evaluation:
| Aspect of impact evaluation | Research question | Possible research approach |
| --- | --- | --- |
| Choose what to evaluate | What investments and activities are the subjects of evaluation? On the basis of what evidence are decisions made about other investments and activities? | Review of formal records of impact evaluations (where available); survey of evaluators |
| | What opportunities exist for funding public interest impact evaluation rather than only funder-controlled evaluation? | Review of public interest research examples |
| Develop key evaluation questions | What are effective processes to identify potential key evaluation questions and prioritise these to a feasible list? | Detailed documentation and analysis of meeting processes to negotiate questions |
| Supporting use | How can the utility of an evaluation be preserved when the primary intended users leave before it is completed? | Identify, document and analyse existing examples |
| Reporting findings | How can reports communicate clearly without oversimplifying the findings, especially when there are important differences in results? | User testing of alternative reports |
| | What procurement processes support effective selection and engagement of an evaluation team and effective management of the evaluation project? | Interviews with evaluation managers, evaluators, and contract managers about their processes, the impact of these and the rationale for them; development of a typology of issues and options |
| Evaluation capacity development | Why does so much evaluation fail to be informed by what is known about effective evaluation? | Interviews with evaluation managers about their knowledge, sources of knowledge about evaluation, and views on barriers to using this knowledge |
BetterEvaluation, as an international platform for generating and sharing information about choosing and using evaluation methods and processes, can play an important role in this research in a number of ways:
- documenting tacit knowledge - including developing retrospective or concurrent accounts of the process of an evaluation (as in the first round of writeshop cases), working with the creators of new approaches to document the details of the process (for example, Outcomes Harvesting with Ricardo Wilson-Grau, and Collaborative Outcomes Reporting with Jess Dart), documenting the adaptation of a method, process or approach to a new setting
- curating existing knowledge - for example, providing ready access to existing research such as databases of impact evaluations
- building new knowledge about which methods and processes are suitable for which situations - including documenting the expert design process used to develop an evaluation design, developing a database about the contexts in which particular designs have been used and their level of success
- supporting the translation and adaptation of good practices to new contexts - rather than, as Michael Patton warned against in a recent guest blog, simply transferring something identified as 'best practice.'
How should a research agenda for impact evaluation be developed?
To be effective, a research agenda cannot simply be a wish list developed by researchers, nor an ambit claim developed by a self-selected group. It needs to be inclusive, transparent and defensible. To maximise uptake of the findings, it needs to encompass strategies and processes for engaging intended end users in the research process, including in the process of identifying and deciding research priorities. Our paper draws some lessons from the development of a national evaluation agenda for HIV about how this might be done - including the need for time and resources to identify and engage a range of stakeholders, support their capacity development and build consensus so different perspectives can be adequately heard and accommodated.
Do you agree we need more research into impact evaluation? What are your thoughts on issues and methods for this research? How might we work to develop a collaborative formal research agenda for impact evaluation?