Addressing attribution of cause and effect in small n impact evaluations: towards an integrated framework

Howard White and Daniel Phillips, International Initiative for Impact Evaluation, Working Paper 15, May 2012. Available as MS Word doc

Abstract

With the results agenda in the ascendancy in the development community, there is an increasing need to demonstrate that development spending makes a difference, that it has an impact. This requirement to demonstrate results has fuelled an increase in the demand for, and production of, impact evaluations. There exists considerable consensus among impact evaluators conducting large n impact evaluations involving tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group. However, no such consensus exists when it comes to assessing attribution in small n cases, i.e. when there are too few units of assignment to permit tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group.

We examine various evaluation approaches that could potentially be suitable for small n analysis and find that a number of them share a methodological core which could provide a basis for consensus. This common core involves the specification of a theory of change together with a number of further alternative causal hypotheses. Causation is established beyond reasonable doubt by collecting evidence to validate, invalidate, or revise the hypothesised explanations, with the goal of rigorously evidencing the links in the actual causal chain.

We argue that, properly applied, approaches which undertake these steps can be used to address attribution of cause and effect. However, we also find that more needs to be done to ensure that small n evaluations minimise the biases which are likely to arise from the collection, analysis and reporting of qualitative data. Drawing on insights from the field of cognitive psychology, we argue that there is scope for considerable bias, both in the way in which respondents report causal relationships, and in the way in which evaluators gather and present data; this points to the need to incorporate explicit and systematic approaches to qualitative data collection and analysis as part of any small n evaluation.
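To make concrete the kind of "test of statistical difference" that the abstract contrasts with small n analysis, here is a minimal sketch (ours, not the paper's) using hypothetical outcome data and a standard two-sample t-test from SciPy. With only a handful of units of assignment such a test loses power, which is precisely the small n problem the authors address.

```python
# Minimal sketch of a large n attribution test: comparing outcomes between a
# treatment group and a properly constructed comparison group. All data are
# hypothetical; the working paper does not prescribe any particular test.
from scipy import stats

treatment_outcomes = [62, 71, 68, 75, 66, 70, 73, 69, 64, 72]   # hypothetical outcome scores
comparison_outcomes = [58, 65, 61, 63, 60, 66, 59, 62, 64, 61]  # hypothetical comparison group

mean_t = sum(treatment_outcomes) / len(treatment_outcomes)
mean_c = sum(comparison_outcomes) / len(comparison_outcomes)
t_stat, p_value = stats.ttest_ind(treatment_outcomes, comparison_outcomes)

print(f"Estimated impact (difference in means): {mean_t - mean_c:.1f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # a small p suggests a statistically significant difference
```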


 


Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials

Laura Haynes, Owain Service, Ben Goldacre, David Torgerson. Cabinet Office, Behavioural Insights Team, 2012. Available as pdf

Contents
Executive Summary
Introduction
Part 1 – What is an RCT and why are they important?
What is a randomised controlled trial?
The case for RCTs – debunking some myths:
1. We don’t necessarily know ‘what works’
2. RCTs don’t have to cost a lot of money
3. There are ethical advantages to using RCTs
4. RCTs do not have to be complicated or difficult to run
Part 2 – Conducting an RCT: 9 key steps
Test
Step 1: Identify two or more policy interventions to compare
Step 2: Define the outcome that the policy is intended to influence
Step 3: Decide on the randomisation unit
Step 4: Determine how many units are required for robust results
Step 5: Assign each unit to one of the policy interventions using a robustly random method (see the sketch after this list)
Step 6: Introduce the policy interventions to the assigned groups
Learn
Step 7: Measure the results and determine the impact of the policy interventions
Adapt
Step 8: Adapt your policy intervention to reflect your findings
Step 9: Return to step 1
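To make the "Test" steps more concrete, here is a minimal sketch (ours, not the report's) of how Steps 4 and 5 might look in code: a rough sample-size calculation using the standard normal-approximation formula for comparing two proportions, followed by robustly random assignment of each unit to one of two policy interventions. All figures and unit identifiers are hypothetical.

```python
# Illustrative sketch of Steps 4 and 5 (not taken from the report itself).
# Step 4: approximate units needed per arm to detect a difference between two
#         proportions at 5% significance and 80% power (normal approximation).
# Step 5: robustly random assignment of units to the two policy interventions.
import math
import random

def sample_size_two_proportions(p1, p2):
    """Approximate units per arm to detect p1 vs p2 (5% significance, 80% power)."""
    z_alpha = 1.96   # two-sided 5% significance
    z_beta = 0.84    # 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Hypothetical example: the existing letter achieves a 30% response rate and we
# hope a redesigned letter lifts it to 35%.
n_per_arm = sample_size_two_proportions(0.30, 0.35)
print(f"Units required per arm: {n_per_arm}")

# Step 5: assign each unit to one of the two interventions at random.
units = [f"unit_{i:04d}" for i in range(2 * n_per_arm)]  # hypothetical unit IDs
random.seed(42)          # fixed seed so the assignment is reproducible and auditable
random.shuffle(units)
assignment = {u: ("intervention_A" if i % 2 == 0 else "intervention_B")
              for i, u in enumerate(units)}
```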

BROADENING THE RANGE OF DESIGNS AND METHODS FOR IMPACT EVALUATIONS

Report of a study commissioned by the Department for International Development. Working Paper 38, April 2012. Available as pdf

(Copy of email) “All

I would like to draw your attention to this important and interesting report by Elliot Stern and colleagues, commissioned by Evaluation Department and Research Division through DFID’s policy research fund.

One of the main challenges we face in raising standards on evaluation in DFID is choosing the best methods and designs for impact evaluation and helping people to think through the practical choices involved. The central dilemma here is how to move towards more rigorous and scientific methods that are actually feasible and workable for the types of programme DFID and our partners fund. As the paper explains, we need approaches that stand up to academic scrutiny, encompass rigour and replicability and which offer a wide and flexible range of suitable methods in different contexts and a clear basis for selecting the best methods to fit the evaluation questions. One well-publicised and influential route advocated by economists in the US and elsewhere is to shift towards more experimental evaluation designs with a stronger focus on quantitative data. This approach has a major advantage of demonstrating and measuring impact in ways that are replicable and stand up to rigorous academic scrutiny. This has to be key for us in DFID as well. However, for many of our programmes it is not easily implemented and this paper helps us to look towards other approaches that will also pass the test of rigour.

This is clearly a difficult challenge, both theoretically and practically and we were lucky to get an exceptionally strong team of eminent experts in evaluation to review the context, theory and practice in this important area. In my view, what the paper from Elliot Stern and his colleagues provides that is valuable and new includes among other things:

a) An authoritative and balanced summary of the challenges and issues faced by evaluators in choosing methods for impact evaluation, making the case for understanding contributory causes, in which development interventions are seen as part of a package of factors that need to be analysed through impact evaluation.

b) A conceptual and practical framework for comparing different methods and designs that does not avoid the tough issues we confront with the actual types of programmes we fund in practice, as opposed to those which happen to be suitable for randomised control trials as favoured by researchers.

c) Guidance on which methods work best in which situations – for example, when experimental methods are the gold standard and when they are not – starting from the premise that the nature of the programme and the nature of the evaluation questions should drive the choice of methods and not the other way around.

We hope you will find the paper useful and that it will help to move forward a debate which has been central in evaluation of international development. Within DFID, we will draw on the findings in finalising our evaluation policy and in providing practical guidance to our evaluation specialists and advisers.

DFID would be interested to hear from those who would like to comment or think they will be able to use and build on this report. Please send any comments to Lina Payne (l-Payne@dfid.gov.uk). Comments will also be welcomed by Professor Elliot Stern (e.stern@lancaster.ac.uk) and his team who are continuing a programme of work in this area.

Regards

Nick York

Head of Evaluation Department ” [DFID]

MDGs 2.0: What Goals, Targets, and Timeframe?

Jonathan Karver, Charles Kenny, and Andy Sumner. Center for Global Development, Working Paper 297, 2012. Available as pdf

Abstract
“The Millennium Development Goals (MDGs) are widely cited as the primary yardstick against which advances in international development efforts are to be judged. At the same time, the Goals will be met or missed by 2015. It is not too early to start asking ‘what next?’ This paper builds on a discussion that has already begun to address potential approaches, goals and target indicators to help inform the process of developing a second generation of MDGs or ‘MDGs 2.0.’ The paper outlines potential goal areas based on the original Millennium Declaration, the timeframe for any MDGs 2.0 and attempts to calculate some reasonable targets associated with those goal areas.”

Evaluating the impact of aid to Africa: lessons from the Millennium Villages

3 July 2012, 17:00–18:30 (BST, GMT+1) – Public event, Overseas Development Institute, and screened live online

Register to attend

“At the turn of the century, Jeffrey Sachs of Columbia University, in partnership with the United Nations, established integrated rural development projects, known as Millennium Villages in ten African countries. When they came to be evaluated in 2011, an intense row broke out between development experts about their impact and sustainability.

ODI and the Royal African Society are delighted to host Michael Clemens, who will argue that aid projects in Africa need much more careful impact evaluations that are transparent, rigorous, and cost-effective. Our panel of experts will also discuss the Millennium Villages project within the wider context of international aid to Africa, analysing other development models and questioning the impact of each one.”

Where do European Institutions rank on donor quality?

ODI Background Notes, June 2012. Author: Matthew Geddes

“This paper investigates how to interpret, respond to and use the evidence provided by recent donor quality indices, using the European Institutions as an example.

The debate on aid impact is longstanding and the tightening of budgets in the current financial crisis has led to a renewed focus on aid effectiveness, with the most recent iterations including three academic indices that rank the ‘quality’ of donors as well as the Multilateral Aid Review (MAR, 2011) by the UK Department for International Development (DFID).

These exercises are being used to assess donor comparative performance and foster international norms of good practice. The MAR is also being used to guide the allocation of DFID funds and identify areas of European Institution practice that DFID seeks to reform. This paper investigates how to interpret, respond to and use the evidence they provide, focusing on the European Institutions, major donors themselves and, taken together, DFID’s largest multilateral partner.

The paper presents scores for the European Institutions, reassesses this evidence and identifies issues that could make the evidence less robust, before  working through several examples to see how the evidence that the indices present might best be applied in light of these criticisms.

The paper concludes that the indices’ conflicting results suggest that the highly complex problem of linking donor practices to aid impact is probably not a problem best suited to an index approach. On their own the indices are limited in what they can be used to say robustly, and together, produce a picture which is not helpful for policy-makers, especially when being used to allocate aid funding.”
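As a purely hypothetical illustration of what "conflicting results" can look like in practice (not drawn from the paper), the sketch below compares two made-up donor-quality indices scoring the same five donors using a Spearman rank correlation; a value near zero or below means the two indices rank the same donors very differently, which is the kind of disagreement that limits their usefulness for allocation decisions.

```python
# Hypothetical comparison of two donor-quality indices. Donor names and scores
# are invented for illustration only.
from scipy import stats

donors = ["Donor A", "Donor B", "Donor C", "Donor D", "Donor E"]
index_1_scores = [0.72, 0.55, 0.81, 0.40, 0.63]  # hypothetical scores from index 1
index_2_scores = [0.50, 0.78, 0.44, 0.69, 0.58]  # hypothetical scores from index 2

rho, p_value = stats.spearmanr(index_1_scores, index_2_scores)
print(f"Spearman rank correlation between the two indices: {rho:.2f} (p = {p_value:.2f})")
```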

New Directions for Evaluation: Promoting Valuation in the Public Interest: Informing Policies for Judging Value in Evaluation

Spring 2012, Volume 2012, Issue 133, Pages 1–129. Buy here

Editor’s Notes – George Julnes

  1. Editor’s notes (pages 1–2)

Research Articles

  1. Managing valuation (pages 3–15) – George Julnes
  2. The logic of valuing (pages 17–28) – Michael Scriven
  3. The evaluator’s role in valuing: Who and with whom (pages 29–41) – Marvin C. Alkin, Anne T. Vo and Christina A. Christie
  4. Step arounds for common pitfalls when valuing resources used versus resources produced (pages 43–52) – Brian T. Yates
  5. When one must go: The Canadian experience with strategic review and judging program value (pages 65–75) – François Dumaine
  6. Valuing, evaluation methods, and the politicization of the evaluation process (pages 77–83) – Eleanor Chelimsky
  7. Valuation and the American Evaluation Association: Helping 100 flowers bloom, or at least be understood? (pages 85–90) – Michael Morris

LINKING MONITORING AND EVALUATION TO IMPACT EVALUATION

Burt Perrin, Impact Evaluation Notes No. 2, April 2012. Rockefeller Foundation and InterAction. Available as pdf

Summary

This is the second guidance note in a four-part series on impact evaluation developed by InterAction with financial support from the Rockefeller Foundation. This note, Linking Monitoring and Evaluation to Impact Evaluation, illustrates the relationship between routine M&E and impact evaluation – in particular, how both monitoring and evaluation activities can support meaningful and valid impact evaluation. M&E has a critical role to play in impact evaluation: identifying when and under what circumstances it would be possible and appropriate to undertake an impact evaluation; contributing essential data, such as baseline data of various forms and information about the nature of the intervention; and providing the information needed to interpret and apply the findings of an impact evaluation.

Contents
Introduction
1. How can monitoring and other forms of evaluation support impact evaluation?
1.1. Main characteristics of monitoring, evaluation, and impact evaluation
1.2. How M&E can contribute to impact evaluation
2. How to build impact evaluation into M&E thinking and practices
2.1. Articulate the theory of change
2.2. Identify priorities for undertaking impact evaluation
2.3. Identify information/data needs
2.4. Start with what you have
2.5. Design and implement the impact evaluation, analyze and interpret the findings
2.6. Use the findings
2.7. Review, reflect, and update
3. Engaging all parts of the organization
3.1. M&E: A core management function requiring senior management leadership and support
3.2. An active role for program staff is required
Summary
References and Other Useful Resources
Annex 1 – Contribution analysis

 

Integrated Monitoring: A Practical Manual for Organisations That Want to Achieve Results

Written by Sonia Herrero, inProgress, Berlin, April 2012. 43 pages. Available as pdf

“The aim of this manual is to help those working in the non-profit sector — non-governmental organisations (NGOs) and other civil society organisations (CSOs) — and the donors which fund them, to observe more accurately what they are achieving through their efforts and to ensure that they make a positive difference in the lives of the people they want to help. Our interest in writing this guide has grown out of the desire to help bring some conceptual clarity to the concepts of monitoring and to determine ways in which they can be harnessed and used more effectively by non-profit practitioners.

The goal is to help organisations build monitoring and evaluation into all their project management efforts. We want to demystify the monitoring process and make it as simple and accessible as possible. We have made a conscious choice to avoid technical language, and instead use images and analogies that are easier to grasp. There is a glossary at the end of the manual which contains the definitions of any terms you may be unfamiliar with. This manual is organised into two parts. The first section covers the ‘what’ and ‘why’ of monitoring and evaluation; the second addresses how to do it.”

These materials may be freely used and copied by non-profit organisations for capacity building purposes, provided that inProgress and authorship are acknowledged. They may not be reproduced for commercial gain.

Contents
Introduction
I. KEY ASPECTS OF MONITORING
1. What is Monitoring?
2. Why Do We Monitor and For Whom?
3. Who is Involved?
4. How Does it Work?
5. When Do We Monitor?
5. What Do We Monitor?
5.1 Monitoring What We Do
II. HOW DO WE MONITOR?
1. Steps for Setting Up a Monitoring System
2. How to Monitor the Process and the Outputs
3. How to Monitor the Achievement of Results
3.1 Define Results/Outcomes
3.2 Define Indicators for Results
4. Prepare a Detailed Monitoring Plan
5. Identify Sources of Information
6. Data Collection
6.1 Tools for Data Compilation
7. Reflection and Analysis
7.1 Documenting and Sharing
8. Learning and Reviewing
8.1 Learning
8.2 Reviewing
9. Evaluation
Conclusion
Glossary
References

Magenta Book – HM Treasury guidance on evaluation for Central Government (UK)

27 April 2011

“The Magenta Book is HM Treasury guidance on evaluation for Central Government, but will also be useful for all policy makers, including in local government, charities and the voluntary sectors. It sets out the key issues to consider when designing and managing evaluations, and the presentation and interpretation of evaluation results. It describes why thinking about evaluation before and during the policy design phase can help to improve the quality of evaluation results without needing to hinder the policy process.

The book is divided into two parts.

Part A is designed for policy makers. It sets out what evaluation is, and what the benefits of good evaluation are. It explains in simple terms the requirements for good evaluation, and some straightforward steps that policy makers can take to make a good evaluation of their intervention more feasible.

Part B is more technical, and is aimed at analysts and interested policy makers. It discusses in more detail the key steps to follow when planning and undertaking an evaluation and how to answer evaluation research questions using different evaluation research designs. It also discusses approaches to the interpretation and assimilation of evaluation evidence.

The Magenta Book will be supported by a wide range of forthcoming supplementary guidance containing more detailed guidance on particular issues, such as statistical analysis and sampling. Until these are available please refer to the relevant chapters of the original Magenta Book.

The Magenta Book is available for download in PDF format."
