Hackathon

We’re excited to be involved in the 2020 IPDET Hackathon – a week-long event in which hundreds of people from around the world bring together their skills, knowledge and inspiration to find creative solutions to the challenges of our times.

Or, as the IPDET Hackathon sign-up info states:

Unlike tech-hackathons which aim at merely technical solutions, the IPDET Evaluation Hackathon is open to all possible solutions which might help to empower the field of evaluation: seeking for relevant concepts, elaborated strategies, applicable prototypes, innovative methods etc.

The BetterEvaluation team is looking forward to supporting participants as they tackle the challenges. Our team will be available to participants via the Hackathon Slack channels to answer questions about using the BetterEvaluation website as a resource. In this blog we also share key ideas and suggested reading to get participants thinking about some of the issues that come up in the challenges. BetterEvaluation’s CEO, Professor Patricia Rogers, will be part of the judging panel for the Hackathon.

As well as engaging with the Hackathon throughout this week, we’re eagerly awaiting the ideas and solutions that teams come up with. Our mission at BetterEvaluation is to work collaboratively with our global community to create, share and support use of knowledge about how to better plan, manage, conduct and use evaluation. We encourage Hackathon participants to consider contributing to BetterEvaluation after the event has concluded. This might be through providing a resource recommendation for something that was particularly useful or enlightening while working on the challenges, or it might be some innovative method or process that should be included on the site.

Cross-cutting themes that run across some of the Hackathon Challenges

There are ten Hackathon Challenges that participants can choose to address. When reading through them, we noticed a few underlying themes that run through several of the challenges, so we thought we'd share some key resources that may be useful in thinking about these themes.

1. What does it mean to do evaluation better?

[Figure: Venn diagram with the words utility, validity, propriety and feasibility]

At BetterEvaluation we think a lot about what it means to do evaluation better. It’s summed up in our overarching guiding principles of doing no harm and doing maximum good. When people think about 'evaluation quality' they often focus on data collection methods but there's much more to ensuring evaluation quality than that. Comprehensive 'evaluation quality' means paying attention to the quality of every step of the evaluation process, not just the accuracy of data collection. Drawing on the Joint Committee Program Evaluation Standards, we pay attention to a number of dimensions of quality for evaluation:

Utility – “Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.” (John Tukey, 'The future of data analysis', Annals of Mathematical Statistics 33 (1), 1962, page 13.)

Validity – thinking about the whole suite of evaluative tasks, from managing through to reporting (see the BetterEvaluation Rainbow Framework for the range of evaluation tasks), as well as how data are collected, analysed and reported with validity.

Feasibility – thinking about practical issues especially in times when usual methods of data collection and engagement are not possible.

Propriety – thinking about the ethical issues, keeping in mind the wider impacts of the process of doing evaluation, as well as the issues involved in data collecting and reporting.

We recommend having a look at:

The Joint Committee standards (discussed above) – Originally developed for educational evaluation but now used more widely.
Our page on Evaluation Standards – This has links for a number of different evaluation standard statements.
The BetterEvaluation Manager’s guide – This guide walks users through the process of commissioning or managing an evaluation, from start to finish. It’s a good way to get an introduction to a range of issues related to evaluation quality – beyond looking at just data collection.

2. What is innovation in evaluation?

Many of the challenges we face in evaluation require innovative responses. But what do we mean by innovation? At BetterEvaluation we have a particular focus on finding and sharing examples of innovative practice, in addition to coverage of more traditional methods, processes, designs and approaches.

When existing evaluation tools, processes, methods and systems are not enough to meet current challenges, you need to draw on innovations in evaluation.

In essence, the Hackathon challenges ask participants to innovate. However, innovation does not always mean inventing something new. It can involve translation from another setting, bricolage (combining) of existing elements, or systematisation of some existing good practices:

Table 1. Types of innovation

Invention	New technology or new process
Transfer or translation	Bringing in an idea from another setting or another purpose, and possibly adapting it
Bricolage	Gathering together existing elements in a new way
Systematisation	Documenting and making explicit and systematic some existing practices

We recommend checking out our briefing paper on Innovations in Evaluation, developed for the joint UNICEF, BetterEvaluation and EVAL-SDGs webinar. This brief opens up some of the issues and questions about why and how to adopt innovations in evaluation, and discusses how innovations can be useful in addressing eight long-standing challenges in evaluation.

3. How do we infer causality in a complex world?

As the world and the interventions we evaluate become increasingly complicated and complex, simple causal models and simple approaches to causal inference become less frequently appropriate. However, this doesn’t mean that drawing causal inference should be put in the ‘too hard’ basket.

As a starting point, it’s important to be clear about the actual causal question being asked. Many people focus on the question of attribution (‘Did the intervention cause the effect?’). However, this question becomes inappropriate if the effect has multiple causes or is not homogeneous. Elliot Stern and colleagues’ ground-breaking work on ‘Broadening the range of evaluation methods’ for DFID helps to think about which causal questions might be relevant when, and what methods and approaches are likely to be appropriate for answering these.
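To make the point about non-homogeneous effects concrete, here is a minimal sketch in Python. The subgroups and outcome figures are entirely made up for illustration and do not come from any real evaluation. It shows how an average effect of zero can hide a positive effect in one group and a negative effect in another, which is one reason a simple attribution question may mislead.

```python
from statistics import mean

# Hypothetical records (invented numbers, for illustration only):
# (subgroup, received_intervention, change_in_outcome)
records = [
    ("urban", True, 6), ("urban", True, 4),
    ("urban", False, 1), ("urban", False, -1),
    ("rural", True, -4), ("rural", True, -6),
    ("rural", False, 1), ("rural", False, -1),
]


def average_effect(rows):
    """Difference in mean outcome change between treated and untreated rows."""
    treated = [change for _, got_it, change in rows if got_it]
    untreated = [change for _, got_it, change in rows if not got_it]
    return mean(treated) - mean(untreated)


# Average effect across everyone, then separately for each subgroup.
overall = average_effect(records)
by_group = {
    group: average_effect([r for r in records if r[0] == group])
    for group in ("urban", "rural")
}

print("Overall average effect:", overall)
print("Effect by subgroup:", by_group)
```

In this invented data set the overall comparison suggests the intervention made no difference, while the subgroup comparison suggests it helped in one context and harmed in another. Questions about contribution, explanation and transportability are designed to surface exactly this kind of pattern.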

We recommend reading the working paper, Broadening the range of designs and methods for impact evaluations, by Elliot Stern, Nicoletta Stame, John Mayne, Kim Forss, Rick Davies, Barbara Befani (2012) for the UK Department for International Development (DFID). In it, they describe how theory-based, case-based and participatory options can be used in impact evaluations. These designs show promise to reinforce existing impact evaluation practice, including experimental and statistical designs, when dealing with complex programmes.

You might also find it useful to check out the report Choosing appropriate methods for impact evaluation, which further develops these ideas (as summarised in the following table).

Table 2. Types of Intended Use

Intended use	Typical evaluation question	Conditions	Relevant methods and designs
Attribution	Did the intervention cause the impact(s)?	A single cause; a small number of effects; either a homogeneous effect or knowledge of relevant contextual factors	RCTs, regression discontinuity, propensity scores
Apportioning	To what extent can a specific impact be attributed to the intervention?	A single effect; large data sets on relevant contributing factors	Regression, econometrics, structural equation modelling
Contribution	Did the intervention make a difference?	Understanding of the different configurations that could produce the results (e.g. contextual factors, programme variations)	Contribution analysis, comparative case studies, process tracing, Bradford Hill criteria
Explanation	How has the intervention made a difference?	The development of a clear programme theory which sets out a change theory and an action theory (can be informed by stakeholder interviews, research literature, and theoretical frameworks); where possible, identification of potential 'active ingredients' in the programme and exploration of different combinations of what is delivered to test their relative effectiveness; requires homogeneity of effects as it only provides information about average effects	Multi-arm RCTs with 2-way or 3-way interactions designed to identify the 'active ingredient'
Generalisability or transportability	Is the intervention likely to work elsewhere? What is needed to make it work elsewhere?	Understanding of contextual factors that have affected the implementation and results; identification of alternative action theories which might be more suitable in different contexts, or even alternative change theories	Realist evaluation

4. How do we evaluate efficiency?

According to the OECD DAC revised criteria, the 'efficiency' criterion examines:

"HOW WELL ARE RESOURCES BEING USED?

The extent to which the intervention delivers, or is likely to deliver, results in an economic and timely way.

Note: “Economic” is the conversion of inputs (funds, expertise, natural resources, time, etc.) into outputs, outcomes and impacts, in the most cost-effective way possible, as compared to feasible alternatives in the context. “Timely” delivery is within the intended timeframe, or a timeframe reasonably adjusted to the demands of the evolving context. This may include assessing operational efficiency (how well the intervention was managed).

Evaluating efficiency can be highly problematic – for example if efficiency is defined as number of outputs/direct cost, it can lead to invalid results and skewed incentives to produce a high number of outputs regardless of quality, or to produce outputs regardless of how well these lead to outcomes."
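As a small illustration of the skewed-incentive problem described above, the following Python sketch compares two hypothetical programmes. All names and figures are invented for illustration. It ranks the programmes first by outputs per unit of cost and then by outcomes per unit of cost, to show that the two definitions of efficiency can point in opposite directions.

```python
# Hypothetical programmes (invented figures, for illustration only).
# "workshops" stands in for an output; "participants_applying_skills"
# stands in for an outcome.
programmes = {
    "A": {"cost": 10_000, "workshops": 100, "participants_applying_skills": 50},
    "B": {"cost": 10_000, "workshops": 40, "participants_applying_skills": 200},
}


def per_1000_spent(count, cost):
    """Units delivered for every 1,000 currency units of direct cost."""
    return count * 1000 / cost


# Efficiency defined narrowly as outputs / direct cost.
outputs_per_1000 = {
    name: per_1000_spent(p["workshops"], p["cost"])
    for name, p in programmes.items()
}

# Efficiency defined in terms of outcomes / direct cost.
outcomes_per_1000 = {
    name: per_1000_spent(p["participants_applying_skills"], p["cost"])
    for name, p in programmes.items()
}

best_by_outputs = max(outputs_per_1000, key=outputs_per_1000.get)
best_by_outcomes = max(outcomes_per_1000, key=outcomes_per_1000.get)

print("Outputs per 1,000 spent:", outputs_per_1000)
print("Outcomes per 1,000 spent:", outcomes_per_1000)
print("Most 'efficient' by outputs:", best_by_outputs)
print("Most 'efficient' by outcomes:", best_by_outcomes)
```

With these made-up numbers, programme A looks better when only outputs are counted, while programme B looks better once outcomes are considered. Neither ratio says anything about quality, equity or sustainability, which would need to be assessed separately.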

We recommend reading OPM's approach to assessing value for money (King and Oxford Policy Management, 2018) as a starting point for thinking about efficiency in a wider context and for using program-specific criteria when evaluating efficiency. The guide sets out an 8-step process for evaluating value for money and draws on a number of existing evaluation methods and approaches, especially the use of contribution analysis to answer causal questions and rubrics to answer evaluative questions. It also identifies and addresses different types of value for money evaluation - economy, efficiency, cost-effectiveness, and equity - and different types of efficiency. In particular, it discusses ways to avoid erroneous conclusions about efficiency that arise from ignoring equity or sustainability issues.

5. How do we support engagement and participation in evaluation?

Engagement in and commitment to evaluation are an important part of making it better. Many stakeholders of an evaluation are understandably cautious, reluctant or cynical because of previous bad experiences with evaluation. We see better (more appropriate and more effective) participation in evaluation as an important part of making evaluation more valid and useful, especially in terms of addressing issues of equity (no one left behind).

We recommend reading through some of the information on our site around participation. Three of our approach pages may be especially useful:

Participatory Evaluation: an approach that involves the stakeholders of a programme or policy in the evaluation process. This involvement can occur at any stage of the evaluation process, from the evaluation design to the data collection and analysis and the reporting of the study.
Utilization-Focused Evaluation (UFE): UFE, developed by Michael Quinn Patton, is an approach based on the principle that an evaluation should be judged on its usefulness to its intended users. Therefore evaluations should be planned and conducted in ways that enhance the likely utilization of both the findings and of the process itself to inform decisions and improve performance.
Empowerment Evaluation: Empowerment evaluation, developed by David Fetterman, is a stakeholder involvement approach designed to provide groups with the tools and knowledge they need to monitor and evaluate their own performance and accomplish their goals.

Good luck with the Hackathon Challenges!

We hope the above ideas and resources are useful for you as you approach the different Hackathon Challenges. We're really looking forward to seeing what you come up with.

If you come across resources that would be helpful for others in the evaluation community while doing these challenges, or want to share some new options for doing evaluation on BetterEvaluation after the Hackathon, let us know. Together, let's build knowledge about how to do evaluation better.