Evaluating simple interventions that turn out to be not so simple

Conditional Cash Transfer (CCT) programs have been cited in the past as examples of projects that are suitable for testing via randomised controlled trials (RCTs). They are relatively simple interventions that can be delivered in a standardised manner. Or so it seemed.

Last year Lant Pritchett, Salimah Samji and Jeffrey Hammer wrote this interesting (if at times difficult to read) paper, “It’s All About MeE: Using Structured Experiential Learning (‘e’) to Crawl the Design Space” (the abstract is reproduced below). In the course of that paper they argued that CCT programs are not as simple as they might seem. Looking at three real-life examples, they identified at least 10 different characteristics of CCTs that need to be specified correctly in order for them to work as expected. Some of these involve binary choices (whether to do x or y) and some involve tuning a numerical variable. This means there were at least 2 to the power of 10, i.e. 1,024, different possible designs. They also pointed out that while changes to some of these characteristics make only a small difference to the results achieved, others, including some binary choices, can make quite major differences. In other words, overall it may well be a rugged rather than a smooth design space. The question then arises: how well are RCTs suited to exploring such spaces?
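To make the arithmetic concrete, here is a minimal sketch of that design space (my own illustration, not taken from the paper). The outcome function and all numbers are invented; it simply shows how quickly the space grows and why a rugged outcome surface is awkward for a two-arm trial.

# Illustrative sketch (not from Pritchett et al.): a CCT "design space" where each of
# 10 design characteristics is treated as a simple binary choice.
from itertools import product

n_characteristics = 10                      # e.g. who receives the cash, payment size band,
                                            # which conditions apply, how compliance is verified...
designs = list(product([0, 1], repeat=n_characteristics))
print(len(designs))                         # 2**10 = 1024 possible designs (a lower bound,
                                            # since some choices are really continuous tuning)

# A toy, invented outcome function to illustrate "ruggedness": most choices nudge the
# outcome a little, but two choices interact strongly, so flipping one of them can
# reverse the overall effect.
def outcome(design):
    small_effects = sum(d * 0.02 for d in design[2:])      # the smooth part of the landscape
    interaction = 0.3 if design[0] == design[1] else -0.3  # the rugged part: a sharp cliff
    return small_effects + interaction

best = max(designs, key=outcome)
print(best, round(outcome(best), 2))
# A single two-arm RCT compares just 2 of these 1,024 points, which is why Pritchett et al.
# argue for structured, iterative search ("crawling the design space") rather than one-off trials.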

Today the World Bank Development Blog posted an interesting confirmation of the point made in the Pritchett et al. paper, in a blog posting titled Defining Conditional Cash Transfer Programs: An Unconditional Mess. Basically they are pointing out that the design space is even more complicated than Pritchett et al. describe! They conclude:

So, if you’re a donor or a policymaker, it is important not to frame your question to be about the relative effectiveness of “conditional” vs. “unconditional” cash transfer programs: the line between these concepts is too blurry. It turns out that your question needs to be much more precise than that. It is better to define the feasible range of options available to you first (politically, ethically, etc.), and then go after evidence of relative effectiveness of design options along the continuum from a pure UCT to a heavy-handed CCT. Alas, that evidence is the subject of another post…

So stay tuned for their next installment. Of course you could quibble that even this conclusion is a bit optimistic, in that it talks about a continuum of design options, when in fact it is a multi-dimensional space with both smooth and rugged bits.

PS: Here is the abstract of the Pritchett paper:

“There is an inherent tension between implementing organizations—which have specific objectives and narrow missions and mandates—and executive organizations—which provide resources to multiple implementing organizations. Ministries of finance/planning/budgeting allocate across ministries and projects/programmes within ministries, development organizations allocate across sectors (and countries), foundations or philanthropies allocate across programmes/grantees. Implementing organizations typically try to do the best they can with the funds they have and attract more resources, while executive organizations have to decide what and who to fund. Monitoring and Evaluation (M&E) has always been an element of the accountability of implementing organizations to their funders. There has been a recent trend towards much greater rigor in evaluations to isolate causal impacts of projects and programmes and more ‘evidence base’ approaches to accountability and budget allocations. Here we extend the basic idea of rigorous impact evaluation—the use of a valid counter-factual to make judgments about causality—to emphasize that the techniques of impact evaluation can be directly useful to implementing organizations (as opposed to impact evaluation being seen by implementing organizations as only an external threat to their funding). We introduce structured experiential learning (which we add to M&E to get MeE) which allows implementing agencies to actively and rigorously search across alternative project designs using the monitoring data that provides real time performance information with direct feedback into the decision loops of project design and implementation. Our argument is that within-project variations in design can serve as their own counter-factual and this dramatically reduces the incremental cost of evaluation and increases the direct usefulness of evaluation to implementing agencies. The right combination of M, e, and E provides the right space for innovation and organizational capability building while at the same time providing accountability and an evidence base for funding agencies.” Paper available as pdf

I especially like this point about within-project variation (which I have argued for in the past): “Our argument is that within-project variations in design can serve as their own counter-factual and this dramatically reduces the incremental cost of evaluation and increases the direct usefulness of evaluation to implementing agencies.”

 

US Govt Executive Order — Making Open and Machine Readable the New Default for Government Information

(from The White House,  Office of the Press Secretary, For Immediate Release, May 09, 2013)

Executive Order — Making Open and Machine Readable the New Default for Government Information

EXECUTIVE ORDER

– – – – – – –

MAKING OPEN AND MACHINE READABLE THE NEW DEFAULT
FOR GOVERNMENT INFORMATION

By the authority vested in me as President by the Constitution and the laws of the United States of America, it is hereby ordered as follows:

Section 1. General Principles. Openness in government strengthens our democracy, promotes the delivery of efficient and effective services to the public, and contributes to economic growth. As one vital benefit of open government, making information resources easy to find, accessible, and usable can fuel entrepreneurship, innovation, and scientific discovery that improves Americans’ lives and contributes significantly to job creation.

Decades ago, the U.S. Government made both weather data and the Global Positioning System freely available. Since that time, American entrepreneurs and innovators have utilized these resources to create navigation systems, weather newscasts and warning systems, location-based applications, precision farming tools, and much more, improving Americans’ lives in countless ways and leading to economic growth and job creation. In recent years, thousands of Government data resources across fields such as health and medicine, education, energy, public safety, global development, and finance have been posted in machine-readable form for free public use on Data.gov. Entrepreneurs and innovators have continued to develop a vast range of useful new products and businesses using these public information resources, creating good jobs in the process.

To promote continued job growth, Government efficiency, and the social good that can be gained from opening Government data to the public, the default state of new and modernized Government information resources shall be open and machine readable. Government information shall be managed as an asset throughout its life cycle to promote interoperability and openness, and, wherever possible and legally permissible, to ensure that data are released to the public in ways that make the data easy to find, accessible, and usable. In making this the new default state, executive departments and agencies (agencies) shall ensure that they safeguard individual privacy, confidentiality, and national security.

Sec. 2. Open Data Policy. (a) The Director of the Office of Management and Budget (OMB), in consultation with the Chief Information Officer (CIO), Chief Technology Officer (CTO), and Administrator of the Office of Information and Regulatory Affairs (OIRA), shall issue an Open Data Policy to advance the
management of Government information as an asset, consistent with my memorandum of January 21, 2009 (Transparency and Open Government), OMB Memorandum M-10-06 (Open Government Directive), OMB and National Archives and Records Administration Memorandum M-12-18 (Managing Government Records Directive), the Office of Science and Technology Policy Memorandum of February 22, 2013 (Increasing Access to the Results of Federally Funded Scientific Research), and the CIO’s strategy entitled “Digital Government: Building a 21st Century Platform to Better Serve the American People.” The Open Data Policy shall be updated as needed.

(b) Agencies shall implement the requirements of the Open Data Policy and shall adhere to the deadlines for specific actions specified therein. When implementing the Open Data Policy, agencies shall incorporate a full analysis of privacy, confidentiality, and security risks into each stage of the information lifecycle to identify information that should not be released. These review processes should be overseen by the senior agency official for privacy. It is vital that agencies not release information if doing so would violate any law or policy, or jeopardize privacy, confidentiality, or national security.

Sec. 3. Implementation of the Open Data Policy. To facilitate effective Government-wide implementation of the Open Data Policy, I direct the following:

(a) Within 30 days of the issuance of the Open Data Policy, the CIO and CTO shall publish an open online repository of tools and best practices to assist agencies in integrating the Open Data Policy into their operations in furtherance of their missions. The CIO and CTO shall regularly update this online repository as needed to ensure it remains a resource to facilitate the adoption of open data practices.

(b) Within 90 days of the issuance of the Open Data Policy, the Administrator for Federal Procurement Policy, Controller of the Office of Federal Financial Management, CIO, and Administrator of OIRA shall work with the Chief Acquisition Officers Council, Chief Financial Officers Council, Chief Information Officers Council, and Federal Records Council to identify and initiate implementation of measures to support the integration of the Open Data Policy requirements into Federal acquisition and grant-making processes. Such efforts may include developing sample requirements language, grant and contract language, and workforce tools for agency acquisition, grant, and information management and technology professionals.

(c) Within 90 days of the date of this order, the Chief Performance Officer (CPO) shall work with the President’s Management Council to establish a Cross-Agency Priority (CAP) Goal to track implementation of the Open Data Policy. The CPO shall work with agencies to set incremental performance goals, ensuring they have metrics and milestones in place to monitor advancement toward the CAP Goal. Progress on these goals shall be analyzed and reviewed by agency leadership, pursuant to the GPRA Modernization Act of 2010 (Public Law 111-352).

(d) Within 180 days of the date of this order, agencies shall report progress on the implementation of the CAP Goal to the CPO. Thereafter, agencies shall report progress quarterly, and as appropriate.

Sec. 4. General Provisions. (a) Nothing in this order shall be construed to impair or otherwise affect:
(i) the authority granted by law to an executive department, agency, or the head thereof; or

(ii) the functions of the Director of OMB relating to budgetary, administrative, or legislative proposals.

(b) This order shall be implemented consistent with applicable law and subject to the availability of appropriations.

(c) This order is not intended to, and does not, create any right or benefit, substantive or procedural, enforceable at law or in equity by any party against the United States, its departments, agencies, or entities, its officers, employees, or agents, or any other person.

(d) Nothing in this order shall compel or authorize the disclosure of privileged information, law enforcement information, national security information, personal information, or information the disclosure of which is prohibited by law.

(e) Independent agencies are requested to adhere to this order.

BARACK OBAMA

 

Comic book Theories of Change?

Inspired by visitors’ positive responses to the imaginative use of flow charts, I have wondered how else Theories of Change could be described. The following thought came to me early this morning!

(with apologies to South Park)

See 6 Free Sites for Creating Your Own Comics, at Mashable, for links to stripgenerator and others

AN OFFER: I will give a £50 donation to Oxfam UK to the person who can come up with the best comic strip description of the Theory of Change of a real development project. Post your entry using a Comment below, with a link to where the comic is and a link to where we can find a factual description of the project it represents. Your comic strip version can be as humorous (slapstick, farce, wit, irony, sarcasm, parody, gallows, juvenile, or…) or as serious as you like. It can be as long as you like, and it does not need to be a simple sequence of panels; it could get way more complicated!

I will try to set up an opinion poll so visitors can vote for the ones they like the most. The winning entry will definitely be posted as an item here on MandE NEWS and be publicised via Twitter. The deadline: May 31st. One proviso: nothing obscene or libellous.

Impact Evaluation Toolkit: Measuring the Impact of Results Based Financing on Maternal and Child Health

Christel Vermeersch, Elisa Rothenbühler, Jennifer Renee Sturdy, for the World Bank
Version 1.0. June 2012

Download full document: English [PDF, 3.83MB] / Español [PDF, 3.47MB] / Français [PDF, 3.97MB]

View online: http://www.worldbank.org/health/impactevaluationtoolkit

“The Toolkit was developed with funding from the Health Results Innovation Trust Fund (HRITF). The objective of  the HRITF is to design, implement and evaluate sustainable results-based financing (RBF) pilot programs that improve maternal and child health outcomes for accelerating progress towards reaching MDGs 1c, 4 & 5. A key element of this program is to ensure a rigorous and well designed impact evaluation is embedded in each country’s RBF project in order to document the extent to which RBF programs are effective, operationally feasible, and under what circumstances. The evaluations are essential for generating new evidence that can inform and improve RBF, not only in the HRITF pilot countries, but also elsewhere. The HRITF finances grants for countries implementing RBF pilots, knowledge and learning activities, impact evaluations, as well as analytical work. ”

Oxfam study of MONITORING, EVALUATION AND LEARNING IN NGO ADVOCACY

Findings from Comparative Policy Advocacy MEL Review Project

by Jim Coe and Juliette Majot | February 2013. Oxfam and ODI

Executive Summary & Full text available as pdf

“For organizations committed to social change, advocacy often figures as a crucial strategic element. How to assess effectiveness in advocacy is, therefore, important. The usefulness of Monitoring, Evaluation and Learning (MEL) in advocacy are subject to much current debate. Advocacy staff, MEL professionals, senior managers, the funding community, and stakeholders of all kinds are searching for ways to improve practices – and thus their odds of success – in complex and contested advocacy environments. This study considers what a selection of leading advocacy organizations are doing in practice. We set out to identify existing practice and emergent trends in advocacy-related MEL practice, to explore current challenges and innovations. The study presents perceptions of how MEL contributes to advocacy effectiveness, and reviews the resources and structures dedicated to MEL.

This inquiry was initiated, funded and managed by Oxfam America. The Overseas Development Institute (ODI) served an advisory role to the core project team, which included Gabrielle Watson of Oxfam America, and consultants Juliette Majot and Jim Coe. The following organizations participated in the inquiry: ActionAid International | Amnesty International | Bread for the World | CARE, USA | Greenpeace International | ONE | Oxfam America | Oxfam Great Britain | Sierra Club”

Scaling Up What Works: Experimental Evidence on External Validity in Kenyan Education

Center for Global Development Working Paper 321, 3/27/13, by Tessa Bold, Mwangi Kimenyi, Germano Mwabu, Alice Ng’ang’a, and Justin Sandefur
Available as pdf

Abstract

The recent wave of randomized trials in development economics has provoked criticisms regarding external validity. We investigate two concerns—heterogeneity across beneficiaries and implementers—in a randomized trial of contract teachers in Kenyan schools. The intervention, previously shown to raise test scores in NGO-led trials in Western Kenya and parts of India, was replicated across all Kenyan provinces by an NGO and the government. Strong effects of short-term contracts produced in controlled experimental settings are lost in weak public institutions: NGO implementation produces a positive effect on test scores across diverse contexts, while government implementation yields zero effect. The data suggests that the stark contrast in success between the government and NGO arm can be traced back to implementation constraints and political economy forces put in motion as the program went to scale.

Rick Davies’ comment: This study attends to two of the concerns I raised in a recent blog post (My two particular problems with RCTs) – (a) the neglect of important internal variations in performance arising from a focus on average treatment effects, and (b) the neglect of the causal role of contextual factors (the institutional setting in this case), which happens when the context is in effect treated as an externality.
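To illustrate point (a), here is a minimal simulation (entirely invented numbers, not the study’s data) of how a pooled average treatment effect can mask exactly the kind of NGO-versus-government contrast the paper reports.

# A minimal, invented simulation: a pooled ATE can hide the fact that the effect
# depends on who implements the program (i.e. on context).
import numpy as np

rng = np.random.default_rng(0)
n = 4000
government = rng.integers(0, 2, n)          # 0 = NGO-implemented, 1 = government-implemented
treated = rng.integers(0, 2, n)             # random assignment to the contract-teacher arm

true_effect = np.where(government == 1, 0.0, 0.2)   # assumed: an effect only in the NGO arm
scores = 0.5 + true_effect * treated + rng.normal(0, 0.5, n)

def ate(mask):
    return scores[mask & (treated == 1)].mean() - scores[mask & (treated == 0)].mean()

print("Pooled ATE:      ", round(ate(np.ones(n, dtype=bool)), 3))   # ~0.10, looks modestly positive
print("ATE | NGO:       ", round(ate(government == 0), 3))          # ~0.20
print("ATE | government:", round(ate(government == 1), 3))          # ~0.00
# Reporting only the pooled number treats the implementing institution as an externality;
# stratifying by it makes context part of the causal story.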

It reinforces my view of the importance of a configurational view of causation. This kind of analysis should be within the reach of experimental studies as well as methods like QCA. For years agricultural scientists have devised and used factorial designs (albeit using fewer factors than the number of conditions found in most QCA studies).

On this subject I came across this relevant quote from R.A. Fisher:

“If the investigator confines his attention to any single factor we may infer either that he is the unfortunate victim of a doctrinaire theory as to how experimentation should proceed, or that the time, material or equipment at his disposal is too limited to allow him to give attention to more than one aspect of his problem…

…. Indeed in a wide class of cases (by using factorial designs) an experimental investigation, at the same time as it is made more comprehensive, may also be made more efficient if by more efficient we mean that more knowledge and a higher degree of precision are obtainable by the same number of observations.”

And also, from Wikipedia, another Fisher quote:

“No aphorism is more frequently repeated in connection with field trials, than that we must ask Nature few questions, or, ideally, one question, at a time. The writer is convinced that this view is wholly mistaken.”
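As a rough illustration of Fisher’s argument (my own sketch, with invented numbers), a small 2x2 factorial design estimates both main effects and their interaction from the same set of observations, something one-factor-at-a-time trials cannot do.

# Sketch of a 2x2 factorial experiment: the same observations yield estimates of the
# effect of factor A, factor B, and their interaction. All numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
n_per_cell = 50
effects = {"A": 0.4, "B": 0.2, "AB": 0.3}   # invented "true" effects of A, B, and their interaction

rows = []
for a in (0, 1):
    for b in (0, 1):
        y = (effects["A"] * a + effects["B"] * b + effects["AB"] * a * b
             + rng.normal(0, 0.5, n_per_cell))
        rows.append((a, b, y.mean()))

# Estimate effects from cell means using standard factorial contrasts
cell = {(a, b): m for a, b, m in rows}
main_A = ((cell[1, 0] + cell[1, 1]) - (cell[0, 0] + cell[0, 1])) / 2
main_B = ((cell[0, 1] + cell[1, 1]) - (cell[0, 0] + cell[1, 0])) / 2
interaction = (cell[1, 1] - cell[1, 0]) - (cell[0, 1] - cell[0, 0])

print(round(main_A, 2), round(main_B, 2), round(interaction, 2))
# One-factor-at-a-time experimentation would need separate trials for A and B, and
# could not see the interaction at all - the "configurational" part of causation.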


3ie Public Lecture: What evidence-based development has to learn from evidence-based medicine. What we have learned from 3ie’s experience in evidence-based development

Speaker: Chris Whitty, LSHTM & DFID
Speaker: Howard White, 3ie
Date and time: 15 April 2013, 5.30 – 7.00 pm
Venue: John Snow Lecture Theatre A&B, London School of Hygiene & Tropical Medicine, Keppel Street, London, UK

Evidence-based medicine has resulted in better medical practices saving hundreds of thousands of lives across the world. Can evidence-based development achieve the same? Critics argue that it cannot. Technical solutions cannot solve the political problems at the heart of development. Randomized control trials cannot unravel the complexity of development. And these technocratic approaches have resulted in a focus on what can be measured rather than what matters. From the vantage point of a medical practitioner with a key role in development research, Professor Chris Whitty will answer these critics, pointing out that many of the same objections were heard in the early days of evidence-based medicine. Health is also complex, a social issue as well as a technical one. So what are the lessons from evidence-based medicine for filling the evidence gap in development?

The last decade has seen a rapid growth in the production of impact evaluations. What do they tell us, and what do they not? Drawing on the experience of over 100 studies supported by 3ie, Professor Howard White presents some key findings about what works and what doesn’t, with examples of how evidence from impact evaluations is being used to improve lives. Better evaluations will lead to better evidence and so better policies. What are the strengths and weaknesses of impact evaluations as currently practiced, and how may they be improved?

Chris Whitty is a clinical epidemiologist and Chief Scientific Advisor and Director of the Research & Evidence Division, UK Department for International Development (DFID). He is Professor of International Health at LSHTM and, prior to DFID, he was Director of the LSHTM Malaria Centre and served on the boards of various other organisations.

Howard White is the Executive Director of 3ie, co-chair of the Campbell International Development Coordinating Group, and Adjunct Professor, Alfred Deakin Research Institute, Deakin University, Geelong. His previous experience includes leading the impact evaluation programme of the World Bank’s Independent Evaluation Group and, before that, several multi-country evaluations.

Phil Davies is Head of the London office of 3ie, where he is responsible for 3ie’s Systematic Reviews programme. Prior to 3ie he was the Executive Director of Oxford Evidentia, and has also served as a senior civil servant in the UK Cabinet Office and HM Treasury, responsible for policy evaluation and analysis.

First come, first served. Doors open at 5:15 pm.
More about 3ie: www.3ieimpact.org

The (endangered) art of monitoring in development programmes

by Murray Boardman, Overseas Programme Manager, Save the Children New Zealand,
CID Talk: 20 June 2012. Available as pdf (and being published as a full paper in the near future)

A summary of the presentation contents:
“Within development, monitoring and evaluation are as ubiquitous as salt and pepper. Often development talks about monitoring and evaluation as a united term, rather than them being separate and unique processes along a quality framework continuum. Due to various factors within development, there are some concerns that the evaluation frame is dominating, if not consuming, monitoring.
Given that monitoring is a fundamental component of development programming, any failure to adequately monitor projects will, inevitably, lead to increased costs and also reduce the effectiveness and quality of project outcomes. Evidence of such occurrences is not isolated.
The attached presentation was given to a seminar for NGOs in New Zealand in June 2012. It is largely based on a similar presentation given for a guest lecture at Massey University in October 2011. It presents various observations – some of which are challenging – on the current dynamics between monitoring and evaluation and how evaluations are dominating the quality area of development. The objective of this presentation is not to demote or vilify evaluations, rather it is to promote and enhance monitoring as an essential skill set in order to ensure programme quality is continuously improved.”

Rick Davies’ comment: A recommended read and a breath of fresh air. Are there power differentials at work here, behind the problems that Murray identifies? Who has more status and influence: those responsible for project monitoring or those responsible for evaluations?

See also Daniel Ticehurst’s paper on monitoring: Who is listening to whom, and how well and with what effect? October 16th, 2012. 34 pages

 

Sustainable development: A review of monitoring initiatives in agriculture

(from DFID website)

A new report has just been released on the Review of the Evidence on Indicators, Metrics and Monitoring Systems. Led by the World Agroforestry Centre (ICRAF) under the auspices of the CGIAR Research Program on Water, Land and Ecosystems (WLE), the review examined monitoring initiatives related to the sustainable intensification of agriculture. Designed to inform future DFID research investments, the review assessed both biophysical and socioeconomic related monitoring efforts.

With the aim of generating insights to improve such systems, the report focuses upon key questions facing stakeholders today:

  1. How to evaluate alternative research and development strategies in terms of their potential impact on productivity, environmental services and welfare goals, including trade-offs among these goals?
  2. How to cost-effectively measure and monitor actual effectiveness of interventions and general progress towards achieving sustainable development objectives?

An over-riding lesson, outlined in the report, was the surprising lack of evidence for the impact of monitoring initiatives on decision-making and management. Thus, there are important opportunities for increasing the returns on these investments by better integrating monitoring systems with development decision processes and thereby increasing impacts on development outcomes. The report outlines a set of recommendations for good practice in monitoring initiatives…

DFID welcomes the publication of this review. The complexity of the challenges which face decision makers aiming to enhance global food security is such that evidence (i.e. metrics) of what is working and what is not is essential. This review highlights an apparent disconnection between what is measured and what is required by decision-makers. It also identifies opportunities for a way forward. Progress will require global co-operation to ensure that relevant data are collected and made easily accessible.

DFID is currently working with G8 colleagues on the planning for an international conference on Open Data to be held in Washington DC from 28th to 30th April 2013. The topline goal for the initiative is to obtain commitment and action from nations and relevant stakeholders to promote policies and invest in projects that open access to publicly funded global agriculturally relevant data streams, making such data readily accessible to users in Africa and world-wide, and ultimately supporting a sustainable increase in food security in developed and developing countries. Examples of the innovative use of data which is already easily available will be presented, as well as more in-depth talks and discussion on data availability, demand for data from Africa and on technical issues. Data in this context ranges from the level of the genome through the level of yields on farm to data on global food systems.