Planning and managing evaluations of development research

This guidance has been prepared for those commissioning evaluations of development research projects and programming. “Commissioners” are those who ask for an evaluation and who are responsible for making sure that it happens and that it is well tailored to the needs of those who will use it.

It is a comprehensive guide that can be used for each step of large-scale, multi-stakeholder evaluation processes. It prompts you to consider documenting your decisions in a formal Terms of Reference (ToR) that all stakeholders can refer to. For simpler evaluations, you might skim or skip some sections of this guide. In those cases, the ToR may be less formal and it may be sufficient for you to document your decisions for yourself, your colleagues, and evaluators you decide to contract.

What is specific about evaluating research?

The International Development Research Centre (IDRC) in Canada primarily funds and facilitates global South-based research for development (R4D). Its mandate is: “To initiate, encourage, support, and conduct research into the problems of the developing regions of the world and into the means for applying and adapting scientific, technical, and other knowledge to the economic and social advancement of those regions.”

Evaluating research for development (R4D) includes several unique features when compared to evaluating other international development interventions. It is also different from evaluating other areas of research. These differences are described below along with tools to evaluate R4D programming that may be relevant to the evaluation you are commissioning.

Long, non-linear results chains: As shown below, three different projects may all aim to improve health. Building a hospital has a clear link with improved health. Training health care workers also has a plausible connection to improved health outcomes, though there are more links in the causal chain between that training and improved health. Finally, a research project that studies food consumption and nutrition in children has the ultimate aim of improving health, but there are many intermediary links in the causal chain before that research can make a difference to the health of the children. Evaluating research for development starts with deciding which results in that long causal chain you want to evaluate, and then assessing the contribution of research to the outcomes sought. To evaluate the results of research for development programming, one must accept that result pathways are typically non-linear, context is crucial, and complexity is the norm.

The types of outcomes of R4D are also different from those of other development interventions. They can include, for example, increased capacity of the individuals, organizations and networks doing the research and using the research. Outcome evaluations might focus on the influence of research on technological development, innovation, or policy and practice changes. They also include efforts to scale up the influence of research. The following resources can help evaluate R4D outcomes:

The Knowledge Translation Toolkit. Bridging the Know–Do Gap: A Resource for Researchers – gives an overview of what knowledge translation entails and how to use it effectively to bridge the “know–do” gap between research, policy, practice, and people. It describes underlying theories and provides strategies and tools to encourage and enable evidence-informed decision making.
Evaluating policy influence is the subject of the freely available book by Fred Carden: Knowledge to policy. Making the most of development research – The book starts from a sophisticated understanding about how research influences public policy and decision-making. It shows how research can contribute to better governance in at least three ways: by encouraging open inquiry and debate, by empowering people with the knowledge to hold governments accountable, and by enlarging the array of policy options and solutions available to the policy process.
The Overseas Development Institute has several useful guides for evaluating policy influence, for example the RAPID Outcome Mapping Approach (ROMA) and brief guides such as Monitoring and evaluation of policy influence and advocacy.
Tools to evaluate capacity development include the framework used in IDRC’s capacity development evaluation for individuals, organizations and networks and the European Centre for Development Policy Management (ECDPM) 5C framework.
In the Monitoring and Evaluation Strategy Brief, Douthwaite and colleagues give an overview of the monitoring and evaluation (M&E) system of the CGIAR Research Program on Aquatic Agricultural Systems (AAS) and describe how the M&E system is designed to support the program in achieving its goals. The brief covers: (1) the objectives of the AAS M&E system in keeping with the key program elements; (2) the theory drawn upon to design the M&E system; and, (3) the M&E system components.
CIRAD’s Impress project explores the impacts of international agricultural research. It describes the methodology used and presents several case studies.

Finally, evaluating research for development is also different from evaluating academic research. Typically, academic research evaluation is done through deliberative means (such as peer review) and analytics (such as bibliometrics). IDRC uses a holistic approach that treats scientific merit as a necessary but insufficient condition for judging research quality, and that recognizes the role of multiple stakeholders and potential users in determining the effectiveness of research (in terms of its relevance, use and impact). IDRC developed the Research Quality Plus (or RQ+) Assessment Framework, which consists of three components:

Key influences (enabling or constraining factors) either within the research endeavor or in the external environment including: (a) maturity of the research field; (b) intention to strengthen research capacity; (c) risk in the research environment; (d) risk in the political environment; and, (e) risk in the data environment.
Research quality dimensions and sub-dimensions which are closely inter-related including: (a) scientific integrity; (b) research legitimacy; (c) importance; and, (d) positioning for use.
Customizable assessment rubrics (or 'evaluative rubrics') that make use of both qualitative and quantitative measures to characterize each key influence and to judge the performance of the research study on the various quality dimensions and sub-dimensions.

The work of IDRC contributes to the wider ongoing debate about how to evaluate research quality and acknowledges valuable approaches of other organizations working in this area (see, for example, resources below). IDRC invites funders of research and researchers to treat the RQ+ framework as a dynamic tool for adaptation to their specific purposes.

Further information & Resources

Ofir Z, Schwandt T, Duggan C, McLean R (2016). RQ+ Research Quality Plus. A Holistic Approach to Evaluating Research. Ottawa: International Development Research Centre (IDRC) – This report describes IDRC’s approach and inaugural version of the RQ+ assessment framework for evaluating research quality. The report includes valuable lessons learned from the implementation of the framework and discusses a range of potential uses of the framework.
For information about bibliometrics, see: https://www.nature.com/articles/520429a and http://www.leidenmanifesto.org/. These describe some typical ways in which research quality is assessed and the problems with those approaches.

See also: http://www.researchtoaction.org/2013/08/altmetrics-and-the-global-south-increasing-research-visibility/ which includes Altmetrics and ImpactStory (sites that track the impact of written work by analysing the uptake of research within social media) and The Scholarly Communication in Africa Programme (SCAP) (an initiative seeking to increase the visibility and developmental impact of research outputs from universities in Southern Africa).

Steps in the commissioning process

Typically, commissioners define what is to be evaluated and decide who will be involved in the evaluation, what results are considered important, and what evidence is relevant. While commissioners generally rely on the expertise of evaluators to make decisions about the specific methods to be used, they often set the overall parameters (i.e., overall approach, budget, timeline) within which the evaluation is to take place.

Manager's guide to evaluation

The manager's guide supports decision-making throughout an evaluation. It covers planning, design, execution, reporting, and supporting the use of the evaluation's findings.

To evaluate or not?

With the exception of external program reviews, which are managed centrally by the Policy and Evaluation Division, evaluation at IDRC is strategic as opposed to routine. This means that evaluations managed by IDRC program staff and grantees are undertaken selectively and in cases where the rationale for undertaking the evaluation is clear.

Within IDRC’s decentralized evaluation system, program staff and grantees generally have jurisdiction over evaluation decisions at the project level and, to a certain extent, at the program level.

At the project level: project evaluations are normally conducted under the direction of program officers or project partners. Not all projects are routinely evaluated.

The decision to evaluate a project is usually motivated by one of the following “evaluation triggers”:

Significant materiality (investment);
High risk;
Novel / innovative approach;
High political priority and/or scrutiny;
Advanced phase or maturity of the partnership.
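The triggers above can be recorded as a simple checklist when screening projects. The following is a minimal sketch only, not an IDRC tool: the class name, field names, and the rule that a single trigger is enough to start a discussion are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ProjectTriggers:
    """Hypothetical yes/no flags, one per evaluation trigger listed above."""
    significant_materiality: bool = False
    high_risk: bool = False
    novel_approach: bool = False
    high_political_priority: bool = False
    mature_partnership: bool = False


def active_triggers(project: ProjectTriggers) -> list[str]:
    """Return the names of the triggers that apply to this project."""
    return [f.name for f in fields(project) if getattr(project, f.name)]


def worth_considering(project: ProjectTriggers) -> bool:
    """Assumption: one trigger is enough to open a discussion about evaluating."""
    return bool(active_triggers(project))


# Illustrative project: high risk and a novel approach, nothing else.
example = ProjectTriggers(high_risk=True, novel_approach=True)
print(active_triggers(example))
print(worth_considering(example))
```

A checklist like this only records which triggers apply. The decision to evaluate still depends on the judgment of program staff and partners and on whether the rationale for the evaluation is clear.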

At the program level: program-led evaluations are defined and carried out by a program in accordance with its needs. Program-led evaluations often focus on burning questions or learning needs at the program level. They strategically assess any important defined aspect within the program’s portfolio (e.g., projects, organizations, issues, or modalities). Program-led evaluations can be conducted internally, externally, or via a hybrid approach. The primary intended users are usually the program team or its partners (e.g., collaborating donors, project partners, or like-minded organizations).

When NOT to evaluate

In general, there are some circumstances in which undertaking a project or program evaluation is not advisable:

Constant change. When a project or program has experienced constant upheaval and change, evaluation runs the risk of being premature and inconclusive;
Some projects and programs are simply too young and too new to evaluate, unless the evaluation is designed as an accompaniment and/or developmental evaluation;
Lack of clarity and consensus on objectives. This makes it difficult for the evaluator to establish what s/he is evaluating;
Primarily for promotional purposes. Although “success stories” and “best practices” are often a welcome byproduct of an evaluation, it is problematic to embark on an evaluation if unearthing “success stories” is the primary goal. Evaluations should systematically seek to unearth both what worked and what did not.

A key question is: Realistically, do you have enough time to undertake an evaluation?

When the timelines for planning and/or conducting an evaluation are so tight that they compromise the credibility of the evaluation findings, it is better not to proceed and instead to focus on making the conditions more conducive.

Roles and responsibilities for evaluation

Within IDRC’s decentralized evaluation system, responsibility for conducting and using evaluation is shared:

Senior management actively promotes a culture of learning, creating incentives for evaluation and learning from failures and disappointing results. It allots resources for evaluation and incorporates evaluation findings into its decision-making.
Program staff and project partners engage in and support high-quality, use-oriented evaluations. They seek opportunities to build their evaluation capacities, think evaluatively, and develop evaluation approaches and methods relevant to development research.

IDRC resources

For IDRC's overall approach to evaluation, guiding principles, components, and roles within our decentralized system, read Evaluation at IDRC.

Further information & Resources

USAID Checklist for deciding to evaluate

Adherence to ethical principles and standards

Both research and evaluations supported by IDRC endeavor to comply with accepted ethical principles:

Respect for persons, animals, and the environment. When research/evaluation involves human participants, it should respect the autonomy of the individual.
Concern for the welfare of participants: researchers/evaluators should act to benefit or promote the wellbeing of participants (beneficence) and should do no harm (non-maleficence).
Justice: the obligation to treat people fairly, equitably, and with dignity.

IDRC-supported research and evaluation should also adhere to universal concepts of justice and equity while remaining sensitive to the cultural norms and practices of the localities where the work is carried out.

Participants must be informed that they are taking part in a research or evaluation study and that they have the right to refuse to participate or cease participation at any time without negative consequences.

The guiding principle of “first, do no harm” applies equally to staff, consultants, and beneficiaries during the evaluation process.

The American Evaluation Association (AEA) developed and adopted important guiding principles for evaluators:

Systematic inquiry: Evaluators conduct systematic, data-based inquiry.
Competence: Evaluators provide competent performance to stakeholders.
Integrity/Honesty: Evaluators display honesty and integrity in their own behavior, and attempt to ensure the honesty and integrity of the entire evaluation process.
Respect for people: Evaluators respect the security, dignity and self-worth of respondents, program participants, clients, and other evaluation stakeholders.
Responsibilities for general and public welfare: Evaluators articulate and take into account the diversity of general and public interests and values.

[Source: American Evaluation Association, 2004.]

Importance of cultural competence in evaluation

Cultural competence is sensitivity to, and understanding of, the cultural values of individuals and groups. Culture can be described as the socially transmitted pattern of beliefs, values, and actions shared by groups of people.

Cultural competence in evaluation is an essential competency that allows an evaluator to demonstrate an understanding of and sensitivity to cultural values. This ensures that an evaluation is respectful and responsive to those involved. Cultural competence helps you work effectively in cross-cultural settings.

A culturally competent perspective can promote effective collaboration. It can also ensure that cultural considerations are integrated into the entire evaluation process, from choosing the methodology and selecting the right surveys or data collection tools to reporting the data and findings.

To be culturally competent, a person should:

Value the differences between groups and individuals
Be knowledgeable about different cultures
Be aware of the interaction between cultures
Be knowledgeable about negative perceptions or stereotypes a group may face
Be able to adapt, as needed, to adequately reach diverse groups

[Source: The University of Minnesota. Find more details on Cultural Competence in Evaluation]

IDRC resources

IDRC Corporate Principles on Research Ethics

Acknowledgements

The guide was developed with funding from the International Development Research Centre (IDRC) in Canada by: Dr Greet Peersman and Professor Patricia Rogers [content] and Nick Herft [design] of the BetterEvaluation (BE) Project, Australian and New Zealand School of Government (ANZSOG), Melbourne, Australia with input from IDRC staff.

We would like to thank the content reviewers: Farid Ahmad, Head Strategic Planning, Monitoring and Evaluation, International Centre for Integrated Mountain Development (ICIMOD), Kathmandu, Nepal & Vanessa Hood, Evaluation Lead, Strategy & Planning, Sustainability Victoria, Melbourne, Australia.