impact evaluations – Monitoring and Evaluation NEWS

PROCESS TRACING: Oxfam’s Draft Protocol

Undated, but possibly 2012. Available as pdf

Background: “Oxfam GB has adopted a Global Performance Framework. Among other things, this framework involves the random selection of samples of closing or sufficiently mature projects under six outcome areas each year and rigorously evaluating their performance. These are referred to as Effectiveness Reviews. Effectiveness Reviews carried out under the Citizen Voice and Policy Influencing thematic areas are to be informed by a research protocol based on process tracing, a qualitative research approach used by case study researchers to investigate causal inference.”

Oxfam is seeking feedback on this draft. Please send your comments to PPAT@oxfam.org.uk

See also the related blog posting by Oxfam on the “AEA365 | A Tip-a-Day by and for Evaluators” website:

APC Week: Claire Hutchings and Kimberly Bowman on Advocacy Impact Evaluation

Rick Davies comment: While the draft protocol already includes six references on process tracing, I would recommend two more which I think are especially useful and recent:

Mahoney, James. “The Logic of Process Tracing Tests in the Social Sciences.” Sociological Methods & Research XX(X) (March 2, 2012): 1–28. doi:10.1177/0049124112437709. http://smr.sagepub.com/content/early/2012/02/29/0049124112437709.full.pdf
- Abstract: This article discusses process tracing as a methodology for testing hypotheses in the social sciences. With process tracing tests, the analyst combines preexisting generalizations with specific observations from within a single case to make causal inferences about that case. Process tracing tests can be used to help establish that (1) an initial event or process took place, (2) a subsequent outcome also occurred, and (3) the former was a cause of the latter. The article focuses on the logic of different process tracing tests, including hoop tests, smoking gun tests, and straw in the wind tests. New criteria for judging the strength of these tests are developed using ideas concerning the relative importance of necessary and sufficient conditions. Similarities and differences between process tracing and the deductive nomological model of explanation are explored.
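For readers new to these tests, the sketch below may help. It is not taken from Mahoney's article or from the Oxfam protocol; it simply encodes the commonly cited typology in which a test is classified by whether passing it is necessary and/or sufficient for accepting a hypothesis. The function names are hypothetical, and the fourth category ("doubly decisive") comes from the wider process tracing literature rather than from the abstract above.

```python
def classify(necessary: bool, sufficient: bool) -> str:
    """Name a process tracing test by what passing it means for the hypothesis.

    necessary:  the hypothesis cannot hold unless the test is passed
    sufficient: passing the test is enough to accept the hypothesis
    """
    if necessary and sufficient:
        return "doubly decisive"   # from the wider literature, not the abstract
    if necessary:
        return "hoop"
    if sufficient:
        return "smoking gun"
    return "straw in the wind"


def implication(test_type: str, passed: bool) -> str:
    """Return the usual reading of a passed or failed test (illustrative wording)."""
    outcomes = {
        ("hoop", True): "hypothesis survives but is not confirmed",
        ("hoop", False): "hypothesis is eliminated",
        ("smoking gun", True): "hypothesis is confirmed",
        ("smoking gun", False): "hypothesis is weakened but not eliminated",
        ("straw in the wind", True): "hypothesis gains some support",
        ("straw in the wind", False): "hypothesis loses some support",
        ("doubly decisive", True): "hypothesis is confirmed",
        ("doubly decisive", False): "hypothesis is eliminated",
    }
    return outcomes[(test_type, passed)]


# Example: evidence that must be present if the hypothesis is true,
# but that would not prove it on its own, is a hoop test.
kind = classify(necessary=True, sufficient=False)
print(kind, "->", implication(kind, passed=False))
```

In practice the strength of any one test is a matter of degree rather than a yes/no property, which is part of what the article's "new criteria" address; the booleans here are a simplification.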

Goertz, Gary, and James Mahoney. A Tale of Two Cultures: Qualitative and Quantitative Research in the Social Sciences. Princeton University Press, 2012. http://books.google.com.au/books?id=3DZ6d0d2K3EC&printsec=frontcover&dq=goertz+mahoney&hl=en&sa=X&ei=o38VUYrmPMXUmAXGjoCYCQ&ved=0CDcQ6AEwAQ See Chapter 8: Causal mechanisms and process tracing

Addressing attribution of cause and effect in small n impact evaluations: towards an integrated framework

Howard White and Daniel Phillips, International Initiative for Impact Evaluation, Working Paper 15, May 2012 Available as MSWord doc

Abstract

With the results agenda in the ascendancy in the development community, there is an increasing need to demonstrate that development spending makes a difference, that it has an impact. This requirement to demonstrate results has fuelled an increase in the demand for, and production of, impact evaluations. There exists considerable consensus among impact evaluators conducting large n impact evaluations involving tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group. However, no such consensus exists when it comes to assessing attribution in small n cases, i.e. when there are too few units of assignment to permit tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group.

We examine various evaluation approaches that could potentially be suitable for small n analysis and find that a number of them share a methodological core which could provide a basis for consensus. This common core involves the specification of a theory of change together with a number of further alternative causal hypotheses. Causation is established beyond reasonable doubt by collecting evidence to validate, invalidate, or revise the hypothesised explanations, with the goal of rigorously evidencing the links in the actual causal chain.

We argue that, properly applied, approaches which undertake these steps can be used to address attribution of cause and effect. However, we also find that more needs to be done to ensure that small n evaluations minimise the biases which are likely to arise from the collection, analysis and reporting of qualitative data. Drawing on insights from the field of cognitive psychology, we argue that there is scope for considerable bias, both in the way in which respondents report causal relationships, and in the way in which evaluators gather and present data; this points to the need to incorporate explicit and systematic approaches to qualitative data collection and analysis as part of any small n evaluation.

BROADENING THE RANGE OF DESIGNS AND METHODS FOR IMPACT EVALUATIONS

Report of a study commissioned by the Department for International Development. Working Paper 38, April 2012. Available as pdf

(Copy of email) “All

I would like to draw your attention to this important and interesting report by Elliot Stern and colleagues, commissioned by Evaluation Department and Research Division through DFID’s policy research fund.

One of the main challenges we face in raising standards on evaluation in DFID is choosing the best methods and designs for impact evaluation and helping people to think through the practical choices involved. The central dilemma here is how to move towards more rigorous and scientific methods that are actually feasible and workable for the types of programme DFID and our partners fund. As the paper explains, we need approaches that stand up to academic scrutiny, encompass rigour and replicability and which offer a wide and flexible range of suitable methods in different contexts and a clear basis for selecting the best methods to fit the evaluation questions. One well-publicised and influential route advocated by economists in the US and elsewhere is to shift towards more experimental evaluation designs with a stronger focus on quantitative data. This approach has a major advantage of demonstrating and measuring impact in ways that are replicable and stand up to rigorous academic scrutiny. This has to be key for us in DFID as well. However, for many of our programmes it is not easily implemented and this paper helps us to look towards other approaches that will also pass the test of rigour.

This is clearly a difficult challenge, both theoretically and practically and we were lucky to get an exceptionally strong team of eminent experts in evaluation to review the context, theory and practice in this important area. In my view, what the paper from Elliot Stern and his colleagues provides that is valuable and new includes among other things:

a) An authoritative and balanced summary of the challenges and issues faced by evaluators in choosing methods for impact evaluation, making the case for understanding contributory causes, in which development interventions are seen as part of a package of factors that need to be analysed through impact evaluation.

b) A conceptual and practical framework for comparing different methods and designs that does not avoid the tough issues we confront with the actual types of programmes we fund in practice, as opposed to those which happen to be suitable for randomised control trials as favoured by researchers.

c) Guidance on which methods work best in which situations – for example, when experimental methods are the gold standard and when they are not – starting from the premise that the nature of the programme and the nature of the evaluation questions should drive the choice of methods and not the other way around.

We hope you will find the paper useful and that it will help to move forward a debate which has been central in evaluation of international development. Within DFID, we will draw on the findings in finalising our evaluation policy and in providing practical guidance to our evaluation specialists and advisers.

DFID would be interested to hear from those who would like to comment or think they will be able to use and build on this report. Please send any comments to Lina Payne (l-Payne@dfid.gov.uk). Comments will also be welcomed by Professor Elliot Stern (e.stern@lancaster.ac.uk) and his team who are continuing a programme of work in this area.

Regards

Nick York

Head of Evaluation Department” [DFID]

BEHIND THE SCENES: MANAGING AND CONDUCTING LARGE SCALE IMPACT EVALUATIONS IN COLOMBIA

by Bertha Briceño, Water and Sanitation Program, World Bank; Laura Cuesta, University of Wisconsin-Madison; Orazio Attanasio, University College London
December 2011, 3ie Working Paper 14, available as pdf

“Abstract: As more resources are being allocated to impact evaluation of development programs, the need to map out the utilization and influence of evaluations has been increasingly highlighted. This paper aims at filling this gap by describing and discussing experiences from four large impact evaluations in Colombia on a case study basis. On the basis of (1) learning from our prior experience in both managing and conducting impact evaluations, (2) desk review of available documentation from the Monitoring & Evaluation system, and (3) structured interviews with government actors, evaluators and program managers, we benchmark each evaluation against eleven standards of quality. From this benchmarking exercise, we derive five key lessons for conducting high quality and influential impact evaluations: (1) investing in the preparation of good terms of reference and identification of evaluation questions; (2) choosing the best methodological approach to address the evaluation questions; (3) adopting mechanisms to ensure evaluation quality; (4) laying out the incentives for involved parties in order to foster evaluation buy-in; and (5) carrying out a plan for quality dissemination.”

Can we obtain the required rigour without randomisation? Oxfam GB’s non-experimental Global Performance Framework

Karl Hughes, Claire Hutchings, August 2011. 3ie Working Paper 13. Available as pdf.

[found courtesy of @3ieNews]

Abstract

“Non-governmental organisations (NGOs) operating in the international development sector need credible, reliable feedback on whether their interventions are making a meaningful difference but they struggle with how they can practically access it. Impact evaluation is research and, like all credible research, it takes time, resources, and expertise to do well, and – despite being under increasing pressure – most NGOs are not set up to rigorously evaluate the bulk of their work. Moreover, many in the sector continue to believe that capturing and tracking data on impact/outcome indicators from only the intervention group is sufficient to understand and demonstrate impact. A number of NGOs have even turned to global outcome indicator tracking as a way of responding to the effectiveness challenge. Unfortunately, this strategy is doomed from the start, given that there are typically a myriad of factors that affect outcome level change. Oxfam GB, however, is pursuing an alternative way of operationalising global indicators. Closing and sufficiently mature projects are being randomly selected each year among six indicator categories and then evaluated, including the extent each has promoted change in relation to a particular global outcome indicator. The approach taken differs depending on the nature of the project. Community-based interventions, for instance, are being evaluated by comparing data collected from both intervention and comparison populations, coupled with the application of statistical methods to control for observable differences between them. A qualitative causal inference method known as process tracing, on the other hand, is being used to assess the effectiveness of the organisation’s advocacy and popular mobilisation interventions. 
However, recognising that such an approach may not be feasible for all organisations, in addition to Oxfam GB’s desire to pursue complementary strategies, this paper also sets out several other realistic options available to NGOs to step up their game in understanding and demonstrating their impact. These include: 1) partnering with research institutions to rigorously evaluate “strategic” interventions; 2) pursuing more evidence informed programming; 3) using what evaluation resources they do have more effectively; and 4) making modest investments in additional impact evaluation capacity.”
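As a rough illustration of what "controlling for observable differences" between intervention and comparison populations can mean, here is a minimal sketch. It is not Oxfam GB's actual method (the abstract does not specify one). It assumes a single observable characteristic (a hypothetical wealth stratum), invented outcome figures, and a simple stratified comparison: the difference in mean outcomes is computed within each stratum and then averaged, weighted by the number of intervention households in that stratum.

```python
from collections import defaultdict
from statistics import mean


def stratified_effect(records):
    """Estimate the effect on the intervention group by comparing like with like.

    records: iterable of (stratum, treated, outcome) tuples.
    Strata that lack either group are skipped, since no comparison is possible.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for stratum, treated, outcome in records:
        groups[stratum][treated].append(outcome)

    weighted_sum, total_weight = 0.0, 0
    for members in groups.values():
        if not members[True] or not members[False]:
            continue  # no overlap in this stratum
        difference = mean(members[True]) - mean(members[False])
        weight = len(members[True])  # weight by intervention households
        weighted_sum += weight * difference
        total_weight += weight

    if total_weight == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total_weight


# Invented data: (wealth stratum, in intervention group?, outcome score)
records = [
    ("poorer", True, 12.0), ("poorer", True, 14.0),
    ("poorer", False, 10.0), ("poorer", False, 12.0),
    ("richer", True, 20.0),
    ("richer", False, 18.0), ("richer", False, 18.0), ("richer", False, 18.0),
]

naive = (mean(o for _, t, o in records if t)
         - mean(o for _, t, o in records if not t))
adjusted = stratified_effect(records)
print(f"naive difference: {naive:.2f}, stratified difference: {adjusted:.2f}")
```

With these invented figures the naive comparison shows almost no difference, because the comparison group happens to be richer, while the stratified comparison does. This only deals with differences that are observed and measured; unobserved differences between the groups remain a threat, which is the concern the paper's title raises.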

Measuring Impact: Lessons from the MCC for the Broader Impact Evaluation Community

William Savedoff and Christina Droggitis, Center for Global Development, Aug 2011. Available as pdf (2 pages)

Excerpt:

“One organization that has taken the need for impact evaluation seriously is the Millennium Challenge Corporation. The first of the MCC programs came to a close this fiscal year, and in the next year the impact evaluations associated with them will begin to be published.

Politicians’ responses to the new wave of evaluations will set a precedent, either one that values transparency and encourages aid agencies to be public about what they are learning or one that punishes transparency and encourages agencies to hide findings or simply cease commissioning evaluations.”

Towards a Plurality of Methods in Project Evaluation: A Contextualised Approach to Understanding Impact Trajectories and Efficacy

Michael Woolcock, January 2009, BWPI Working Paper 73

Abstract
“Understanding the efficacy of development projects requires not only a plausible counterfactual, but an appropriate match between the shape of impact trajectory over time and the deployment of a corresponding array of research tools capable of empirically discerning such a trajectory. At present, however, the development community knows very little, other than by implicit assumption, about the expected shape of the impact trajectory from any given sector or project type, and as such is prone to routinely making attribution errors. Randomisation per se does not solve this problem. The sources and manifestations of these problems are considered, along with some constructive suggestions for responding to them.”

Michael Woolcock is Professor of Social Science and Development Policy, and Research Director of the Brooks World Poverty Institute, at the University of Manchester.

[RD Comment: Well worth reading, more than once]

PS: See also the more recent “Guest Post: Michael Woolcock on The Importance of Time and Trajectories in Understanding Project Effectiveness” on the Development Impact blog, 5th May 2011

Impact Evaluation in Practice

Paul J. Gertler, Sebastian Martinez, Patrick Premand, Laura B. Rawlings, Christel M. J. Vermeersch, World Bank, 2011

Impact Evaluation in Practice is available as downloadable pdf, and can be bought online.

“Impact Evaluation in Practice presents a non-technical overview of how to design and use impact evaluation to build more effective programs to alleviate poverty and improve people’s lives. Aimed at policymakers, project managers and development practitioners, the book offers experts and non-experts alike a review of why impact evaluations are important and how they are designed and implemented. The goal is to further the ability of policymakers and practitioners to use impact evaluations to help make policy decisions based on evidence of what works the most effectively.

The book is accompanied by a set of training material — including videos and power point presentations — developed for the “Turning Promises to Evidence” workshop series of the Office of the Chief Economist for Human Development. It is a reference and self-learning tool for policy-makers interested in using impact evaluations and was developed to serve as a manual for introductory courses on impact evaluation as well as a teaching resource for trainers in academic and policy circles.”

CONTENTS
PART ONE. INTRODUCTION TO IMPACT EVALUATION
Chapter 1. Why Evaluate?
Chapter 2. Determining Evaluation Questions
PART TWO. HOW TO EVALUATE
Chapter 3. Causal Inference and Counterfactuals
Chapter 4. Randomized Selection Methods
Chapter 5. Regression Discontinuity Design
Chapter 6. Difference-in-Differences
Chapter 7. Matching
Chapter 8. Combining Methods
Chapter 9. Evaluating Multifaceted Programs
PART THREE. HOW TO IMPLEMENT AN IMPACT EVALUATION
Chapter 10. Operationalizing the Impact Evaluation Design
Chapter 11. Choosing the Sample
Chapter 12. Collecting Data
Chapter 13. Producing and Disseminating Findings
Chapter 14. Conclusion

Learning how to learn: eight lessons for impact evaluations that make a difference

ODI Background Notes, April 2011. Author: Ben Ramalingam

“This Background Note outlines key lessons on impact evaluations, utilisation-focused evaluations and evidence-based policy. While methodological pluralism is seen as the key to effective impact evaluation in development, the emphasis here is not methods per se. Instead, the focus is on the range of factors and issues that need to be considered for impact evaluations to be used in policy and practice – regardless of the method employed. This Note synthesises research by ODI, ALNAP, 3ie and others to outline eight key lessons for consideration by all of those with an interest in impact evaluation and aid effectiveness”. 8 pages

The 8 lessons:
Lesson 1: Understand the key stakeholders
Lesson 2: Adapt the incentives
Lesson 3: Invest in capacities and skills
Lesson 4: Define impact in ways that relate to the specific context
Lesson 5: Develop the right blend of methodologies
Lesson 6: Involve those who matter in the decisions that matter
Lesson 7: Communicate effectively
Lesson 8: Be persistent and flexible

See also Ben’s Thursday, April 14, 2011 blog posting: When will we learn how to learn?

[RD comments on this paper]

1. The case for equal respect for different methodologies can be overstated. I feel this is the case when Ben argues that “First, it has been shown that the knowledge that results from any type of particular impact evaluation methodology is no more rigorous or widely applicable than the results from any other kind of methodology.” While it is important that evaluation results affect subsequent policy and practice, their adoption and use is not the only outcome measure for evaluations. We also want those evaluation results to have some reliability and validity, so that they will stand the test of time and be generalisable to other settings with some confidence. An evaluation could affect policy and practice without necessarily being of good quality, defined in terms of reliability and validity.

Nevertheless, I like Ben’s caution about focusing too much on evaluations as outputs and the need to focus more on outcomes, the use and uptake of evaluations.

2. The section of Ben’s paper that most attracted my interest was the story about the Joint Evaluation of Emergency Assistance to Rwanda, and how the evaluation team managed to ensure it became “one of the most influential evaluations in the aid sector”. We need more case studies of these kinds of events and then a systematic review of those case studies.

3. When I read various statements like this: “As well as a supply of credible evidence, effort needs to be made to understand the demand for evidence” I have an image in my mind of evaluators as humble supplicants, at the doorsteps of the high and mighty. Isn’t it about time that evaluators turned around and started demanding that policy makers disclose the evidence base of their existing policies? As I am sure has been said by others before, when you look around there does not seem to be much evidence of evidence based policy making. Norms and expectations need to be built up, and then there may be more interest in what evaluations have to say. A more assertive and questioning posture is needed.

Sound expectations: from impact evaluations to policy change

3ie Working paper # 12, 2011, by Center for the Implementation of Public Policies Promoting Equity and Growth (CIPPEC) Emails: vweyrauch@cippec.org, gdiazlangou@cippec.org

Abstract

“This paper outlines a comprehensive and flexible analytical conceptual framework to be used in the production of a case study series. The cases are expected to identify factors that help or hinder rigorous impact evaluations (IEs) from influencing policy and improving policy effectiveness. This framework has been developed to be adaptable to the reality of developing countries. It is aimed as an analytical-methodological tool which should enable researchers in producing case studies which identify factors that affect and explain impact evaluations’ policy influence potential. The approach should also enable comparison between cases and regions to draw lessons that are relevant beyond the cases themselves.

There are two different, though interconnected, issues that must be dealt with while discussing the policy influence of impact evaluations. The first issue has to do with the type of policy influence pursued and, aligned with this, the determination of the accomplishment (or not) of the intended influence. In this paper, we first introduce the discussion regarding the different types of policy influence objectives that impact evaluations usually pursue, which will ultimately help determine whether policy influence was indeed achieved. This discussion is mainly centered around whether an impact evaluation has had impact on policy. The second issue is related to the identification of the factors and forces that mediate the policy influence efforts and is focused on why the influence was achieved or not. We have identified and systematized the mediating factors and forces, and we approach them in this paper from the demand and supply perspective, considering as well, the intersection between these two.

The paper concludes that, ultimately, the fulfillment of policy change based on the results of impact evaluations is determined by the interplay of the policy influence objectives with the factors that affect the supply and demand of research in the policymaking process.

The paper is divided in four sections. A brief introduction is followed by an analysis of policy influence as an objective of research, specifically, impact evaluations. The third section identifies factors and forces that enhance or undermine influence in public policy decision making. The research ends up pointing out the importance of measuring policy influence and enumerates a series of challenges that have to be further assessed.”