UK centre of excellence for evaluation of international development

Prior Information Notice

DFID is planning to establish a Centre of Excellence to assist with our commitment to use high quality evaluation to maximise the impact of UK funded international development. DFID would like to consult with a wide range of research networks and experts in the field, and invite ideas and suggestions to help develop our thinking further before formally issuing invitations to tender to the market for this opportunity. There are two main channels for interested parties to contribute to this process:

1. Comments and views on the draft scope can be fed in through the DFID supplier portal by registering for this opportunity at https://supplierportal.dfid.gov.uk/selfservice/ and accessing the documentation.

2. DFID will hold bilateral discussions and/or information sharing sessions with interested parties depending on demand.

Please ensure all comments are fed in through the DFID portal by 31st August 2012. Once the consultation process is complete and the scope of the Centre of Excellence fully defined, DFID plans to run a competitive tender for this work. The target date for establishment of the Centre is mid-2013.

RD Comment: Why is this consultation process not more open? Why do participants have to register as potential suppliers, when many who might want to read and comment on the proposal would not necessarily want to become suppliers?

DFID How To Note: Reviewing and Scoring Projects

November 2011. Available as pdf.

“Introduction: This guidance is to help DFID staff, project partners and other stakeholders use the scoring system and complete the latest templates when undertaking an Annual Review (AR) or Project Completion Review (PCR, formerly known as Project Completion Report) for projects due for review from January 2012. This guidance applies to all funding types; however, separate templates are available for core contributions to multilateral organisations. The guidance does not attempt to cover in detail how to organise the review process, although some help is provided.

Contents:
Principal changes from previous templates
Introduction
What is changing?
What does it involve?
Using the logframe as a monitoring tool
If you don't have a logframe
Assessing the evidence base
The Scoring System
Updating ARIES
Transparency and Publishing ARs and PCRs
Projects below £1m approved prior to the new Business Case format
Multilateral Core Contributions
Filling in the templates / Guidance on the template contents
Completing the AR/PCR and entering information onto ARIES
Annex A:  Sample Terms of Reference ”

RD Comment: To my surprise, although this How To Note gives advice on how to assign weights to each output, it does not explain how these weights interact with output scores to generate a weighted achievement score for each output. Doing so would help explain why the weightings are being requested; at present they are requested but their purpose is left unexplained. A sketch of one possible calculation is given below.

The achievement scoring system is a definite improvement on the previous system. The focus is now on actual achievement to date rather than expected achievement by the end of the project, and the scale is evenly balanced, with its top and bottom representing over- and under-achievement respectively.
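As an illustration only, here is a minimal sketch of one way output weights could be combined with output scores to give an overall weighted achievement score. The outputs, weights, scores and formula are all assumptions made for the example, not DFID's published method.

```python
# Hypothetical illustration: combining output weights with output scores.
# The weighting formula and the values below are assumptions for the sake
# of the example, not DFID's published method.

outputs = [
    # (output name, weight as % of project effort, achievement score on an assumed 1-5 scale)
    ("Output 1", 50, 4),
    ("Output 2", 30, 2),
    ("Output 3", 20, 3),
]

total_weight = sum(weight for _, weight, _ in outputs)  # 100

# Each output's score contributes in proportion to its weight.
weighted_score = sum(weight * score for _, weight, score in outputs) / total_weight

print(f"Weighted achievement score: {weighted_score:.2f}")  # 3.20
```

A calculation along these lines would make clear why weights are collected at all: a heavily weighted, under-performing output pulls the overall score down more than a lightly weighted one.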

Models of Causality and Causal Inference

by Barbara Befani. An annex to BROADENING THE RANGE OF DESIGNS AND METHODS FOR IMPACT EVALUATIONS, the report of a study commissioned by the Department for International Development, April 2012, by Elliot Stern (Team Leader), Nicoletta Stame, John Mayne, Kim Forss, Rick Davies and Barbara Befani.

Introduction

The notion of causality has given rise to disputes among philosophers which still continue today. At the same time, attributing causation is an everyday activity of the utmost importance for humans and other species, one that most of us carry out successfully outside the corridors of academic departments. How do we do that? And what are the philosophers arguing about? This chapter attempts to provide some answers, by reviewing some of the notions of causality in the philosophy of science and “embedding” them in everyday activity. It also attempts to connect these with impact evaluation practices, without embracing any one approach to causation in particular, but stressing the strengths and weaknesses of each and outlining how they relate to one another. The point stressed throughout is that everyday life, social science and impact evaluation in particular all have something to learn from these approaches, each of which illuminates a single, separate, specific aspect of the relationship between cause and effect.

The paper is divided into three parts. The first addresses notions of causality that focus on the simultaneous presence of a single cause and the effect; alternative causes are rejected depending on whether they are observed together with the effect. The basic causal unit is the single cause, and alternatives are rejected in the form of single causes. This model includes multiple causality in the form of single independent contributions to the effect. The second part addresses notions of causality that focus on the simultaneous presence of multiple causes that are linked to the effect as a “block” or whole: the block can be either necessary or sufficient (or neither) for the effect, and single causes within the block can be necessary for the block to be sufficient (INUS causes). The third part discusses models of causality where simultaneous presence is not enough: in order to be defined as such, causes need to be shown to actively manipulate or generate the effect, and the focus is on how the effect is produced, how the change comes about. The basic unit here – rather than a single cause or a package – is the causal chain: fine-grained information is required on the process leading from an initial condition to the final effect.

The second type of causality sits in between the first and third: it is used when there is no fine-grained knowledge of how the effect is manipulated by the cause, yet the presence or absence of a number of conditions can still be spotted along the causal process, which is thus more detailed than the bare “beginning-end” linear representation characteristic of the successionist model.

 

RD Comment: I strongly recommend this paper.

For more on necessary and/or sufficient conditions, see this blog posting, which shows how different combinations of causal conditions can be visually represented and recognised using Decision Trees.
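To make the distinction between necessary, sufficient and INUS conditions more concrete, here is a minimal sketch in Python. The condition names and the causal model are my own invented illustration, not an example taken from Befani's annex or the blog posting above.

```python
# Hypothetical illustration of necessary and sufficient causal conditions.
# The condition names and the causal model are invented for the example.

from itertools import product

def outcome(funding: bool, training: bool, local_support: bool) -> bool:
    # Assumed model: funding is necessary for the outcome, and is sufficient
    # only in combination with training or local support. Training is then an
    # INUS condition: an insufficient but necessary part of the package
    # {funding, training}, which is itself unnecessary but sufficient.
    return funding and (training or local_support)

# Enumerating every combination of conditions, much as a decision tree does,
# makes the necessary conditions and sufficient packages visible.
for funding, training, local_support in product([False, True], repeat=3):
    print(f"funding={funding!s:5} training={training!s:5} support={local_support!s:5}"
          f" -> outcome={outcome(funding, training, local_support)}")

# No combination lacking funding produces the outcome (funding is necessary),
# while {funding, training} and {funding, local_support} are each sufficient.
```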

 

DFID’s Approach to Impact Evaluation – Part I

[From Development Impact: News, views, methods, and insights from the world of impact evaluation. Click here https://blogs.worldbank.org/impactevaluations/node/838 to view the full story.]
As part of a new series looking at how institutions are approaching impact evaluation, DI virtually sat down with Nick York, Head of Evaluation, and Gail Marzetti, Deputy Head, Research and Evidence Division.
Development Impact (DI): There has been an increasing interest in impact evaluation (defined as experimental/quasi-experimental analysis of program effects) in DFID. Going forward, what do you see as impact evaluation’s role in how DFID evaluates what it does? How do you see the use of impact evaluation relative to other methods?  
Nick York (NY): The UK has been at the forefront among European countries in promoting the use of impact evaluation in international development and it is now a very significant part of what we do – driven by the need to make sure our decisions and those of our partners are based on rigorous evidence. We are building prospective evaluation into many of our larger and more innovative operational programmes – we have quite a number of impact evaluations underway or planned, commissioned from our country and operational teams. We also support international initiatives, including 3ie, where the UK was a founder member and a major funder, the Strategic Impact Evaluation Fund with the World Bank on human development interventions, and NONIE, the network which brings together developing country experts on evaluation to share experiences on impact evaluation with professionals in the UN, bilateral and multilateral donors.
DI: Given the cost of impact evaluation, how do you choose which projects are (impact) evaluated?
NY:  We focus on those which are most innovative – where the evidence base is considered to be weak and needs to be improved – and those which are large or particularly risky. Personally, I think the costs of impact evaluation are relatively low compared to the benefits they can generate, or compared to the costs of running programmes using interventions which are untested or don’t work.   I also believe that rigorous impact evaluations generate an output – high quality evidence – which is a public good so although the costs to the commissioning organization can be high they represent excellent value for money for the international community. This is why 3ie, which shares those costs among several organisations, is a powerful concept.

AusAID: Establishment of Independent Evaluation Committee

[from AusAID website, 11 May 2012]

Foreign Minister Bob Carr has announced the establishment of an Independent Evaluation Committee (IEC) to strengthen the independence and credibility of the work of the Office of Development Effectiveness (ODE).

Chaired by Jim Adams—a former Vice President of the World Bank—the Committee will oversee ODE in assessing the effectiveness and evaluating the impact of the Australian aid program.

In its policy statement, An Effective Aid Program for Australia, the Government committed to establishing an IEC in response to a recommendation from last year’s Independent Review of Aid Effectiveness. This is part of the Government’s commitment to improve the aid program’s evaluation function in order to deliver more efficient and effective aid.

The IEC will be an advisory body with a whole of government mandate, providing independent expert evaluation advice to the Development Effectiveness Steering Committee (DESC), which provides advice to government on Overseas Development Assistance priorities and effectiveness.

It will also oversee the work of ODE in planning, commissioning, managing and delivering a high quality evaluation program. The IEC will provide advice on ODE’s evaluation strategy and work plan. It will also oversee ODE’s preparation of an annual evaluation summary and quality assurance report.

The IEC will meet four times a year, with the first meeting of the IEC scheduled for June 2012.

Who is on the IEC?

The Independent Evaluation Committee has three external members (including the chair) and one senior AusAID representative. External members are appointed by the Minister for Foreign Affairs, while the Director General of AusAID appoints the AusAID representative. Given the IEC’s whole of government mandate, a representative from the Department of Finance and Deregulation will be invited to attend meetings as an observer.

The external members are Jim Adams (Chair), Professor Patricia Rogers and Dr Wendy Jarvie. They contribute a mix of solid international development and aid effectiveness experience, high-level evaluation expertise and strong public sector experience to the IEC.


How will the IEC work?

The IEC will oversee the work program of ODE and report to the DESC. The terms of reference for the IEC set out its mandate, roles and responsibilities. The terms of reference were endorsed by the DESC before being approved by the Minister.


 

Addressing attribution of cause and effect in small n impact evaluations: towards an integrated framework

Howard White and Daniel Phillips, International Initiative for Impact Evaluation, Working Paper 15, May 2012. Available as MS Word doc.

Abstract

With the results agenda in the ascendancy in the development community, there is an increasing need to demonstrate that development spending makes a difference, that it has an impact. This requirement to demonstrate results has fuelled an increase in the demand for, and production of, impact evaluations. There exists considerable consensus among impact evaluators conducting large n impact evaluations involving tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group. However, no such consensus exists when it comes to assessing attribution in small n cases, i.e. when there are too few units of assignment to permit tests of statistical difference in outcomes between the treatment group and a properly constructed comparison group.

We examine various evaluation approaches that could potentially be suitable for small n analysis and find that a number of them share a methodological core which could provide a basis for consensus. This common core involves the specification of a theory of change together with a number of further alternative causal hypotheses. Causation is established beyond reasonable doubt by collecting evidence to validate, invalidate, or revise the hypothesised explanations, with the goal of rigorously evidencing the links in the actual causal chain.

We argue that, properly applied, approaches which undertake these steps can be used to address attribution of cause and effect. However, we also find that more needs to be done to ensure that small n evaluations minimise the biases which are likely to arise from the collection, analysis and reporting of qualitative data. Drawing on insights from the field of cognitive psychology, we argue that there is scope for considerable bias, both in the way in which respondents report causal relationships, and in the way in which evaluators gather and present data; this points to the need to incorporate explicit and systematic approaches to qualitative data collection and analysis as part of any small n evaluation.
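As a rough illustration of the "methodological core" the authors describe (a theory of change plus alternative causal hypotheses, each checked against evidence), here is a small sketch. The hypotheses, evidence judgements and decision rule are invented for the example and are not taken from the working paper.

```python
# Rough sketch of the small n logic described in the abstract: specify a
# theory of change and rival explanations, record whether the evidence
# validates each one, and only claim attribution if the whole causal chain
# is evidenced and the rivals are ruled out. All content here is invented.

from dataclasses import dataclass

@dataclass
class CausalClaim:
    statement: str
    evidence_supports: bool  # judgement after reviewing the evidence

theory_of_change = [
    CausalClaim("Training was delivered to health workers", True),
    CausalClaim("Health workers changed clinical practice", True),
    CausalClaim("Changed practice improved patient outcomes", True),
]

alternative_explanations = [
    CausalClaim("Outcomes improved because of a separate drug programme", False),
    CausalClaim("Outcomes improved because of seasonal factors", False),
]

# Attribution is claimed only if every link in the hypothesised causal chain
# is evidenced and every rival explanation has been invalidated.
attribution_supported = (
    all(link.evidence_supports for link in theory_of_change)
    and not any(alt.evidence_supports for alt in alternative_explanations)
)

print("Attribution claim supported:", attribution_supported)
```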


 


Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials

Laura Haynes, Owain Service, Ben Goldacre, David Torgerson. Cabinet Office, Behavioural Insights Team, 2012. Available as pdf.

Contents
Executive Summary
Introduction
Part 1 – What is an RCT and why are they important?
What is a randomised controlled trial?
The case for RCTs – debunking some myths:
1. We don't necessarily know 'what works'
2. RCTs don't have to cost a lot of money
3. There are ethical advantages to using RCTs
4. RCTs do not have to be complicated or difficult to run
PART II – Conducting an RCT: 9 key steps
Test
Step 1: Identify two or more policy interventions to compare
Step 2: Define the outcome that the policy is intended to influence
Step 3: Decide on the randomisation unit
Step 4: Determine how many units are required for robust results
Step 5: Assign each unit to one of the policy interventions using a robustly random method
Step 6: Introduce the policy interventions to the assigned groups
Learn
Step 7: Measure the results and determine the impact of the policy interventions
Adapt
Step 8: Adapt your policy intervention to reflect your findings
Step 9: Return to step 1
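As a minimal sketch of what steps 5 and 7 above boil down to, the following code randomly assigns units to two groups and compares mean outcomes. The data are simulated and the example is mine, not the report's.

```python
# Minimal sketch of the core of steps 5 and 7: random assignment to two
# policy interventions and a comparison of mean outcomes. The data are
# simulated; nothing here is taken from the report itself.

import random

random.seed(42)

units = list(range(200))              # e.g. 200 individuals or offices
random.shuffle(units)                 # robustly random assignment (step 5)
treatment, control = units[:100], units[100:]

def measured_outcome(unit: int, treated: bool) -> float:
    # Simulated outcome measure; the treatment adds a small average effect.
    return random.gauss(0.5 if treated else 0.4, 0.1)

treatment_outcomes = [measured_outcome(u, True) for u in treatment]
control_outcomes = [measured_outcome(u, False) for u in control]

# Step 7: estimate the impact as the difference in mean outcomes.
impact = (sum(treatment_outcomes) / len(treatment_outcomes)
          - sum(control_outcomes) / len(control_outcomes))
print(f"Estimated impact of the intervention: {impact:.3f}")
```

A real trial would of course also involve the sample size calculation of step 4 and a formal statistical test of the difference, not just a comparison of means.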

BROADENING THE RANGE OF DESIGNS AND METHODS FOR IMPACT EVALUATIONS

Report of a study commissioned by the  Department for International Development. Working Paper 38, April 2012. Available as pdf

(Copy of email) “All

I would like to draw your attention to this important and interesting report by Elliot Stern and colleagues, commissioned by Evaluation Department and Research Division through DFID’s policy research fund.

One of the main challenges we face in raising standards on evaluation in DFID is choosing the best methods and designs for impact evaluation and helping people to think through the practical choices involved. The central dilemma here is how to move towards more rigorous and scientific methods that are actually feasible and workable for the types of programme DFID and our partners fund. As the paper explains, we need approaches that stand up to academic scrutiny, encompass rigour and replicability and which offer a wide and flexible range of suitable methods in different contexts and a clear basis for selecting the best methods to fit the evaluation questions. One well-publicised and influential route advocated by economists in the US and elsewhere is to shift towards more experimental evaluation designs with a stronger focus on quantitative data. This approach has a major advantage of demonstrating and measuring impact in ways that are replicable and stand up to rigorous academic scrutiny. This has to be key for us in DFID as well. However, for many of our programmes it is not easily implemented and this paper helps us to look towards other approaches that will also pass the test of rigour.

This is clearly a difficult challenge, both theoretically and practically, and we were lucky to get an exceptionally strong team of eminent experts in evaluation to review the context, theory and practice in this important area. In my view, what the paper from Elliot Stern and his colleagues provides that is valuable and new includes, among other things:

a) An authoritative and balanced summary of the challenges and issues faced by evaluators in choosing methods for impact evaluation, making the case for understanding contributory causes, in which development interventions are seen as part of a package of factors that need to be analysed through impact evaluation.

b) A conceptual and practical framework for comparing different methods and designs that does not avoid the tough issues we confront with the actual types of programmes we fund in practice, as opposed to those which happen to be suitable for randomised control trials as favoured by researchers.

c) Guidance on which methods work best in which situations – for example, when experimental methods are the gold standard and when they are not – starting from the premise that the nature of the programme and the nature of the evaluation questions should drive the choice of methods and not the other way around.

We hope you will find the paper useful and that it will help to move forward a debate which has been central in evaluation of international development. Within DFID, we will draw on the findings in finalising our evaluation policy and in providing practical guidance to our evaluation specialists and advisers.

DFID would be interested to hear from those who would like to comment or think they will be able to use and build on this report. Please send any comments to Lina Payne (l-Payne@dfid.gov.uk). Comments will also be welcomed by Professor Elliot Stern (e.stern@lancaster.ac.uk) and his team who are continuing a programme of work in this area.

Regards

Nick York

Head of Evaluation Department ” [DFID]

MDGs 2.0: What Goals, Targets, and Timeframe?

Jonathan Karver, Charles Kenny, and Andy Sumner. Available as pdf.
Center for Global Development, Working Paper 297, 2012

Abstract
“The Millennium Development Goals (MDGs) are widely cited as the primary yardstick against which advances in international development efforts are to be judged. At the same time, the Goals will be met or missed by 2015. It is not too early to start asking ‘what next?’ This paper builds on a discussion that has already begun to address potential approaches, goals and target indicators to help inform the process of developing a second generation of MDGs or ‘MDGs 2.0.’ The paper outlines potential goal areas based on the original Millennium Declaration and the timeframe for any MDGs 2.0, and attempts to calculate some reasonable targets associated with those goal areas.”

Where do European Institutions rank on donor quality?

ODI Background Notes, June 2012. Author: Matthew Geddes

“This paper investigates how to interpret, respond to and use the evidence provided by recent donor quality indices, using the European Institutions as an example.

The debate on aid impact is longstanding and the tightening of budgets in the current financial crisis has led to a renewed focus on aid effectiveness, with the most recent iterations including three academic indices that rank the ‘quality’ of donors as well as the Multilateral Aid Review (MAR, 2011) by the UK Department for International Development (DFID).

These exercises are being used to assess donors' comparative performance and foster international norms of good practice. The MAR is also being used to guide the allocation of DFID funds and identify areas of European Institution practice that DFID seeks to reform. This paper investigates how to interpret, respond to and use the evidence they provide, focusing on the European Institutions, which are major donors themselves and, taken together, DFID's largest multilateral partner.

The paper presents scores for the European Institutions, reassesses this evidence and identifies issues that could make the evidence less robust, before  working through several examples to see how the evidence that the indices present might best be applied in light of these criticisms.

The paper concludes that the indices' conflicting results suggest that the highly complex problem of linking donor practices to aid impact is probably not one best suited to an index approach. On their own, the indices are limited in what they can be used to say robustly, and together they produce a picture which is not helpful for policy-makers, especially when used to allocate aid funding."
