Why evaluations fail: The importance of good monitoring (DCED, 2014)

Adam Kessler and Jim Tanburn, August 2014, Donor Committee for Enterprise Development (DCED). 9 pages. Available as pdf.

Introduction: A development programme without a strong internal monitoring system often cannot be effectively evaluated. The DCED Standard for Results Measurement is a widely used monitoring framework, and this document discusses how it relates to external evaluations. Why should evaluators be interested in monitoring systems? How can the DCED Standard support evaluations, and vice versa? Who is responsible for what, and what are the expectations of each? This document expands on previous work by the UK Department for International Development (DFID).

This document is relevant for evaluators, those commissioning evaluations, and practitioners in programmes using the DCED Standard and undergoing an evaluation. It provides a basis for dialogue with the evaluation community; the aims of that dialogue are to identify sources of evaluation expertise available to support programmes using the DCED Standard, and to promote the Standard to programmes needing to improve their monitoring system. We would welcome further discussions on the topic, and invite you to contact us at Results@Enterprise-Development.org with any questions or comments.

Contents
1 Introduction
2 Why should evaluators be interested in monitoring?
2.1 Good monitoring is essential for effective management
2.2 Good monitoring is essential for effective evaluation
2.3 Some evaluation methodologies incorporate monitoring
3 What is the DCED Standard for Results Measurement?
4 How does the DCED Standard support evaluation?
4.1 The DCED Standard promotes clear theories of change
4.2 The DCED Standard provides additional data to test the theory of change
5 How do evaluations supplement the DCED Standard?
5.1 Evaluations are independent
5.2 Evaluations have more expertise and larger budgets
5.3 Evaluations can examine broader effects
5.4 Evaluations and the DCED Standard are for different audiences
6 Division of responsibilities between evaluator and programme team
7 Key References and further reading

DCED publications on M&E, M&E audits and the effectiveness of M&E standards

This posting is overdue. The Donor Committee for Enterprise Development (DCED) has been producing a lot of material on results management this year. Here are some of the items I have seen.

Of particular interest to me is the DCED Standard for Results Measurement. According to Jim Tanburn, there are about 60-70 programmes now using the standard. Associated with this is an auditing service offered by the DCED. From what I can see, nine programmes have been audited so far. Given the scale and complexity of the standards, the question in my mind, and probably that of others, is whether their use makes a significant difference to the performance of the programmes that have implemented them. Are they cost-effective?

This would not be an easy question to answer in any rigorous fashion, I suspect. There are likely to be many case-specific accounts of where and how the standards have helped improve performance, and perhaps some of where they have not helped, or have even hindered. Some accounts are already available via the Voices from the Practitioners section of the DCED website.

The challenge would be how to aggregate judgements about impacts on a diverse range of programmes in a variety of settings. This is the sort of situation where one is looking for the “effects of a cause”, rather than “the causes of an effect”, because there is a standard intervention (adoption of the standards), but one which may have many different effects. A three-step process might be feasible, or at least worth exploring:

1. Rank programmes in terms of the degree to which they have successfully adopted the standards. This should be relatively easy, given that there is a standard auditing process.

2. Rank programmes in terms of the relative observed/reported effects of the standards. This will be much more difficult because of the apples-and-pears nature of the impacts. But I have been exploring a way of doing so here: Pair comparisons: For where there is no common outcome measure? Another difficulty, which may be surmountable, is that “all the audit reports remain confidential and DCED will not share the contents of the audit report without seeking permission from the audited programmes”.

3. Look for the strength and direction of the correlation between the two measures, and for outliers (poor adoption/big effects, good adoption/few effects) where lessons could be learned. A rough sketch of this step follows below.
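To illustrate step 3 only: the sketch below uses Spearman's rank correlation for the overall association between the two rankings, plus a simple rank-gap rule to flag outliers. The programme names, ranks and threshold are entirely hypothetical, not real audit or pair-comparison data.

```python
# Illustrative sketch only: hypothetical programme rankings, not real audit data.
from scipy.stats import spearmanr

# Rank 1 = best. 'adoption_rank' would come from audit results,
# 'effects_rank' from something like the pair-comparison exercise.
programmes = {
    "Programme A": {"adoption_rank": 1, "effects_rank": 2},
    "Programme B": {"adoption_rank": 2, "effects_rank": 1},
    "Programme C": {"adoption_rank": 3, "effects_rank": 6},  # good adoption, few effects
    "Programme D": {"adoption_rank": 4, "effects_rank": 3},
    "Programme E": {"adoption_rank": 5, "effects_rank": 4},
    "Programme F": {"adoption_rank": 6, "effects_rank": 5},
}

adoption = [p["adoption_rank"] for p in programmes.values()]
effects = [p["effects_rank"] for p in programmes.values()]

# Strength and direction of the association between the two rankings.
rho, p_value = spearmanr(adoption, effects)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.2f})")

# Flag outliers: programmes whose two ranks differ by more than a chosen threshold.
THRESHOLD = 2
for name, ranks in programmes.items():
    gap = ranks["adoption_rank"] - ranks["effects_rank"]
    if abs(gap) > THRESHOLD:
        kind = "good adoption / few effects" if gap < 0 else "poor adoption / big effects"
        print(f"{name}: rank gap {gap} ({kind})")
```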


Reframing the evidence debates: a view from the media for development sector

Abraham-Dowsing, Kavita, Anna Godfrey, and Zoe Khor. 2014. “Reframing the Evidence Debates: A View from the Media for Development Sector”. BBC Media Action. Available as pdf. This is part of BBC Media Action’s Bridging Theory and Practice series. An accompanying appendices document is available here. It includes priority research questions, and more detail on the evidence examples cited in the paper. The report was prepared with funding from the UK Department for International Development.

Introduction: “Donors, policy-makers and practitioners need evidence to inform their policy and programming choices, resource allocation and spending decisions, yet producing and making use of high-quality research and evidence is not straightforward. This is particularly the case in sectors that do not have a long history of research or evaluation, that are operating in fragile states with low research capacity and that are trying to bring about complex change. The media for development sector (see Box 1) is one such example. Nonetheless, donors, governments and private foundations working in international development have long recognised the importance of independent media and information sources in their work and the role that communication can play in bringing about change. Despite this recognition, however, in debates around evidence on the role of media and communication in achieving development outcomes, assertions of “no evidence” or “not enough evidence” are commonplace. With the evidence agenda gaining more prominence in the development sector, there is a risk for any sector that finds it difficult to have a clear, concise and cohesive narrative around its evidence of impact.

This paper is based on a series of interviews with practitioners, evaluators and donors working in the media for development sector, and looks at their understanding of what counts as evidence and their views on the existing evidence base. It argues that compelling evidence of impact does exist and is being used – although this varies by thematic area. For example, it highlights that evidence in the area of health communication is stronger and more integrated into practice compared with other thematic areas such as media and governance or  humanitarian response outcomes. The paper also contends that, alongside evidencing development outcomes (for example, media’s impact on knowledge, attitudes, efficacy, norms and behaviours), more evidence is needed to answer specific questions about how, why and in what ways media and communication affect people and societies – and how this varies by local context.

The paper argues that the lack of clear evidential standards for reporting evidence from media for development programmes, the limited efforts to date to collate and systematically review the evidence that does exist, and the lack of relevant fora in which to critique and understand evaluation findings, are significant barriers to evidence generation. The paper calls for an “evidence agenda”, which creates shared evidential standards to systematically map the existing evidence, establishes fora to discuss and share existing evidence, and uses strategic, longer-term collaborative investment in evaluation to highlight where evidence gaps need to be filled in order to build the evidence base. Without such an agenda, as a field, we risk evidence producers, assessors and funders talking at cross purposes. ”

As the paper’s conclusion states: “We actively welcome conversations with you and we expect that these will affect and change the focus of the evidence agenda. We also expect to be challenged! What we have tried to do here is articulate a clear starting point, highlighting the risk of not taking this conversation forward. We actively welcome your feedback during our consultation on the paper, which runs from August until the end of October 2014, and invite you to share the paper and appendices widely with any colleagues and networks who you think appropriate.”

Contents page
Introduction
1. What is evidence? An expert view
2. Evidence – what counts and where are there gaps?
3. Building an evidence base – points for consideration
4. The challenges of building an evidence base
5. An evidence agenda – next steps in taking this conversation forward
Conclusion
Appendix 1: Methodology and contributors
Appendix 2: Examples of compelling evidence
Appendix 3: Priority research questions for the evidence agenda
Appendix 4: A note on M&E, R & L and DME
Appendix 5: Mixed methods evaluation evidence – farmer field schools
Appendix 6: Methodological challenges

CDI conference proceedings: Improving the use of M&E processes and findings

“On the 20th and 21st of March 2014, CDI organized its annual ‘M&E on the cutting edge’ conference on the topic ‘Improving the Use of M&E Processes and Findings’.

This conference is part of our series of yearly ‘M&E on the cutting edge’ events. It was held in Wageningen, the Netherlands, and looked particularly at the conditions under which the use of M&E processes and findings can be improved. The conference report can now be accessed here in pdf format.”

Conference participants had the opportunity to learn about:

  • frameworks for understanding the utilisation of monitoring and evaluation processes and findings;
  • different types of utilisation of monitoring and evaluation processes and findings, and when and for whom these are relevant;
  • conditions that improve the utilisation of monitoring and evaluation processes and findings.

Conference presentations can be found online here: http://www.managingforimpact.org/event/cdi-conference-improving-use-me-processes-and-findings

Gender, Monitoring, Evaluation and Learning – 9 new articles in pdfs

…in Gender & Development, Volume 22, Issue 2, July 2014: Gender, Monitoring, Evaluation and Learning
“In this issue of G&D, we examine the topic of Gender, Monitoring, Evaluation and Learning (MEL) from a gender equality and women’s rights perspective, and hope to prove that a good MEL system is an activist’s best friend! This unique collection of articles captures the knowledge of a range of development practitioners and women’s rights activists, who write about a variety of organisational approaches to MEL. Contributors come from both the global South and the global North and have tried to share their experience accessibly, making what is often very complex and technical material as clear as possible to non-MEL specialists.”

Contents

The links below will take you to the article abstract on the Oxfam Policy & Practice website, from where you can download the article for free.

Editorial

Introduction to Gender, Monitoring, Evaluation and Learning
Kimberly Bowman and Caroline Sweetman

Articles

Women’s Empowerment Impact Measurement Initiative
Nidal Karim, Mary Picard, Sarah Gillingham and Leah Berkowitz

Helen Lindley

and girls in the Democratic Republic of Congo to influence policy and practice
Marie-France Guimond and Katie Robinette

Learning about women’s empowerment in the context of development projects: do the figures tell us enough?

Jane Carter, Sarah Byrne, Kai Schrader, Humayun Kabir, Zenebe Bashaw Uraguchi, Bhanu Pandit, Badri Manandhar, Merita Barileva, Norbert Pijls and Pascal Fendrich

Resources

Compiled by Liz Cooke

Resources List – Gender, Monitoring, Evaluation and Learning


Review of evaluation approaches and methods for interventions related to violence against women and girls (VAWG)

[From the R4D website] Available as pdf

Raab, M.; Stuppert, W. Review of evaluation approaches and methods for interventions related to violence against women and girls (VAWG). (2014) 123 pp.

Summary:

The purpose of this review is to generate a robust understanding of the strengths, weaknesses and appropriateness of evaluation approaches and methods in the field of development and humanitarian interventions on violence against women and girls (VAWG). It was commissioned by the Evaluation Department of the UK Department for International Development (DFID), with the goal of engaging policy makers, programme staff, evaluators, evaluation commissioners and other evaluation users in reflecting on ways to improve evaluations of VAWG programming. Better evaluations are expected to contribute to more successful programme design and implementation.

The review examines evaluations of interventions to prevent or reduce violence against women and girls within the contexts of development and humanitarian aid.

Rick Davies comment: This paper is of interest for two reasons: (a) The review process was the subject of a blog that documented its progress, from beginning to end. A limited number of comments were posted on the blog by interested observers (including myself) and these were responded to by the reviewers; (b) The review used Qualitative Comparative Analysis (QCA) as its means of understanding the relationship between attributes of evaluations in this area and their results. QCA is an interesting but demanding method even when applied on a modest scale.
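For readers unfamiliar with QCA, here is a minimal sketch of the crisp-set “truth table” step that sits at the heart of the method. The conditions, cases, outcome and consistency cut-off below are invented for illustration only; they are not the conditions or data used in the Raab and Stuppert review.

```python
# Illustrative sketch only: a crisp-set QCA truth table with invented conditions
# and cases, not the conditions or data used in the review itself.
import pandas as pd

# Each row is an evaluation; 1/0 indicate presence/absence of a condition
# and of the outcome of interest (e.g. "evaluation judged useful").
cases = pd.DataFrame(
    [
        # mixed_methods, theory_based, participatory, outcome
        (1, 1, 0, 1),
        (1, 1, 1, 1),
        (1, 0, 0, 0),
        (0, 1, 1, 1),
        (0, 0, 1, 0),
        (0, 0, 0, 0),
        (1, 1, 0, 1),
        (0, 1, 1, 0),
    ],
    columns=["mixed_methods", "theory_based", "participatory", "outcome"],
)

conditions = ["mixed_methods", "theory_based", "participatory"]

# Truth table: one row per observed configuration of conditions, with the number
# of cases and the consistency (share of cases showing the outcome).
truth_table = (
    cases.groupby(conditions)["outcome"]
    .agg(n_cases="count", consistency="mean")
    .reset_index()
    .sort_values("consistency", ascending=False)
)
print(truth_table)

# Configurations above a chosen consistency cut-off would be treated as candidate
# sufficient conditions and passed on to logical minimisation.
print(truth_table[truth_table["consistency"] >= 0.8])
```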

I will post more comments here after taking the opportunity to read the review with some care.

The authors have also invited comments from anyone else who is interested, via their email address, which is available in the report.

Postscript 2014 10 08: At the EES 2014 conference in Dublin I gave a presentation on the triangulation of QCA findings, which included some of their data and analysis. You can see the presentation here on YouTube (it has attached audio). Michaela and Wolfgang have subsequently commented on that presentation, and I have in turn responded to their comments.


Incorporating people’s values in development: weighting alternatives

Laura Rodriguez Takeuchi, ODI Project Note 04, June 2014. Available as pdf

“Key messages:

  • In the measurement of multidimensional well-being, weights aim to capture the relative importance of each component to a person’s overall well-being. The choice of weights needs to be explicit and could be used to incorporate people’s perspectives into a final metric.
  • Stated preferences approaches aim to obtain weights from individuals’ responses to hypothetical scenarios. We outline six of these approaches. Understanding their design and limitations is vital to make sense of potentially dissimilar results.
  • It is important to select and test an appropriate method for specific contexts, considering the challenges of relying on people’s answers. Two methodologies, DCE and PTO, are put forward for testing in a pilot project.”

See also: Laura Rodriguez Takeuchi’s blog posting on Better Evaluation: Week 26: Weighing people’s values in evaluation

Rick Davies comment: Although this was a very interesting and useful paper overall, I was fascinated by this part of Laura’s paper:

Reflecting on the psychology literature, Kahneman and Krueger (2006) argue that it is difficult to deduce preferences from people’s actual choices because of limited rationality:

“[People] make inconsistent choices, fail to learn from experience, exhibit reluctance to trade, base their own satisfaction on how their situation compares with the satisfaction of others and depart from the standard  model of the rational economic agent in other ways.”   (Kahneman and Krueger 2006: 3)

Rather than using these ‘real’ choices, stated preferences approaches rely on surveys to obtain weights from individuals’ responses to hypothetical scenarios.

This seems totally bizarre. What would happen if we insisted on all respondents’ survey responses being rational, and applied various other remedial measures to make them so! Would we end up with a perfectly rational set of responses that have no actual fit with how people behave in the world? How useful would that be? Perhaps this is what happens when you spend too much time in the company of economists? :-))

On another matter… Table 1 usefully lists eight different weighting methods, which are explained in the text. However, this list does not include one of the simplest methods that exist, one which is touched upon tangentially in the reference to the South African study on social perceptions of material needs (Wright, 2008). This is the use of weighted checklists, where respondents choose both the items on a checklist and the weights to be given to each item, in a series of binary (yes/no) choices. This method was used in a series of household poverty surveys in Vietnam in 1997 and 2006, using an instrument called a Basic Necessities Survey. The wider potential uses of this participatory and democratic method are discussed in a related blog on weighted checklists.
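For anyone curious about the arithmetic behind such weighted checklists, here is a minimal sketch of the general logic, with invented items, votes and households (this is not the actual Basic Necessities Survey instrument or data): each item’s weight is the proportion of respondents calling it a necessity, and a household’s score is the weighted share of the endorsed items it possesses.

```python
# Illustrative sketch only: invented items and responses, loosely in the spirit of
# a Basic Necessities Survey, not the actual Vietnam survey instrument or data.

# Each respondent makes two binary choices per item: "is this a basic necessity?"
# and "does your household have it?".
necessity_votes = {          # 1 = "yes, a basic necessity"
    "radio":        [1, 0, 1, 0, 1],
    "bicycle":      [1, 1, 1, 0, 1],
    "iron roof":    [1, 1, 1, 1, 1],
    "mosquito net": [1, 1, 1, 1, 0],
}

households_have = {          # 1 = household possesses the item
    "household 1": {"radio": 1, "bicycle": 0, "iron roof": 1, "mosquito net": 1},
    "household 2": {"radio": 0, "bicycle": 0, "iron roof": 0, "mosquito net": 1},
}

# Weight of each item = proportion of respondents calling it a necessity;
# items that a majority do not endorse are dropped.
weights = {
    item: sum(votes) / len(votes)
    for item, votes in necessity_votes.items()
    if sum(votes) / len(votes) > 0.5
}
total_weight = sum(weights.values())

# A household's score = weighted share of the endorsed necessities it possesses
# (1.0 = has them all, 0.0 = has none).
for name, owned in households_have.items():
    score = sum(weights[item] * owned[item] for item in weights) / total_weight
    print(f"{name}: {score:.2f}")
```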

Postscript: Julie Newton has pointed out this useful related website:

  • Measuring National Well-being.  “ONS [UK Office for National Statistics] is developing new measures of national well-being. The aim is to provide a fuller picture of how society is doing by supplementing existing economic, social and environmental measures. Developing better measures of well-being is a long term programme. ONS  are committed to sharing ideas and proposals widely to ensure that the measures are relevant and founded on what matters to people.” Their home page lists a number of new publications on this subject

Composite measures of local disaster impact – Lessons from Typhoon Yolanda, Philippines

by Aldo Benini and Patrice Chataigner, 2014. Available as pdf

Purpose: ”When disaster strikes, determining affected areas and populations with the greatest unmet needs is a key objective of rapid assessments. This note is concerned with the logic and scope for improvement in a particular tool, the so-called “prioritization matrix”, that has increasingly been employed in such assessments. We compare, and expand on, some variants that sprang up in the same environment of a large natural disaster. The fixed context lets us attribute differences to the creativity of the users grappling with the intrinsic nature of the tool, rather than to fleeting local circumstances. Our recommendations may thus be translated more easily to future assessments elsewhere.

The typhoon that struck the central Philippines in November 2013 – known as “Typhoon Yolanda” and also as “Typhoon Haiyan” – triggered a significant national and international relief response. Its information managers imported the practice, tried and tested in other disasters, of ranking affected communities by the degree of impact and need. Several lists, known as prioritization matrices, of ranked municipalities were produced in the first weeks of the response. Four of them, by different individuals and organizations, were shared with us. The largest in coverage ranked 497 municipalities.

The matrices are based on indicators, which they aggregate into an index that determines the ranks. Thus they come under the rubric of composite measures. They are managed in spreadsheets. We review the four for their particular emphases, the mechanics of combining indicators, and the statistical distributions of the final impact scores. Two major questions concern the use of rankings (as opposed to other transformations) and the condensation of all indicators in one combined index. We propose alternative formulations, in part borrowing from recent advances in social indicator research. We make recommendations on how to improve the process in future rapid assessments.”

Rick Davies comment: Well worth reading!
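For readers who have not met a prioritization matrix before, the sketch below illustrates one generic way such composite measures are often built: rescale each indicator, combine the rescaled values with weights, and rank by the resulting score. The municipalities, indicator names and weights are invented, and this is not the particular formulation used in any of the four matrices the paper reviews.

```python
# Illustrative sketch only: a toy "prioritization matrix" with invented
# municipalities, indicators and weights, not the actual Yolanda datasets.

# Higher indicator values = greater impact/need.
municipalities = {
    "Municipality A": {"houses_damaged_pct": 80, "pop_affected_pct": 60, "access_constraint": 3},
    "Municipality B": {"houses_damaged_pct": 45, "pop_affected_pct": 70, "access_constraint": 1},
    "Municipality C": {"houses_damaged_pct": 90, "pop_affected_pct": 40, "access_constraint": 2},
    "Municipality D": {"houses_damaged_pct": 20, "pop_affected_pct": 30, "access_constraint": 1},
}

weights = {"houses_damaged_pct": 0.5, "pop_affected_pct": 0.3, "access_constraint": 0.2}

def rescale(values):
    """Min-max rescale a list of raw indicator values to the 0-1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

names = list(municipalities)
indicators = list(weights)

# Rescale each indicator across municipalities, then combine with the weights
# into a single composite impact score per municipality.
rescaled = {
    ind: dict(zip(names, rescale([municipalities[n][ind] for n in names])))
    for ind in indicators
}
composite = {
    n: sum(weights[ind] * rescaled[ind][n] for ind in indicators) for n in names
}

# Rank municipalities by composite score (rank 1 = highest priority).
for rank, (name, score) in enumerate(sorted(composite.items(), key=lambda kv: -kv[1]), start=1):
    print(f"{rank}. {name}: {score:.2f}")
```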

How Shortcuts Cut Us Short: Cognitive Traps in Philanthropic Decision Making

Beer, Tanya, and Julia Coffman. 2014. “How Shortcuts Cut Us Short: Cognitive Traps in Philanthropic Decision Making”. Centre for Evaluation Innovation. Available as pdf

Found courtesy of “people-centered development” blog (michaela raab)

Introduction: “Anyone who tracks the popular business literature has come across at least one article or book, if not a half dozen, that applies the insights of cognitive science and behavioral economics to individual and organizational decision making. These authors apply social science research to the question of why so many strategic decisions yield disappointing results, despite extensive research and planning and the availability of data about how strategies are (or are not) performing. The diagnosis is that many of our decisions rely on mental shortcuts or “cognitive traps,” which can lead us to make uninformed or even bad decisions. Shortcuts provide time-pressured staff with simple ways of making decisions and managing complex strategies that play out in an uncertain world. These shortcuts affect how we access information, what information we pay attention to, what we learn, and whether and how we apply what we learn. Like all organizations, foundations and the people who work in them are subject to these same traps. Many foundations are attempting to make better decisions by investing in evaluation and other data collection efforts that support their strategic learning. The desire is to generate more timely and actionable data, and some foundations have even created staff positions dedicated entirely to supporting learning and the ongoing application of data for purposes of continuous improvement. While this is a useful and positive trend, decades of research have shown that despite the best of intentions, and even when actionable data is presented at the right time, people do not automatically make good and rational decisions. Instead, we are hard-wired to fall into cognitive traps that affect how we process (or ignore) information that could help us to make better judgments.”

Rick Davies comment: Recommended, along with the videosong by Mr Wray on cognitive bias, also available via Michaela’s blog.

Running Randomized Evaluations: A Practical Guide

Glennerster, Rachel, and Kudzai Takavarasha. Running Randomized Evaluations: A Practical Guide. Princeton: Princeton University Press, 2013.

Overview

This book provides a comprehensive yet accessible guide to running randomized impact evaluations of social programs. Drawing on the experience of researchers at the Abdul Latif Jameel Poverty Action Lab, which has run hundreds of such evaluations in dozens of countries throughout the world, it offers practical insights on how to use this powerful technique, especially in resource-poor environments.

This step-by-step guide explains why and when randomized evaluations are useful, in what situations they should be used, and how to prioritize different evaluation opportunities. It shows how to design and analyze studies that answer important questions while respecting the constraints of those working on and benefiting from the program being evaluated. The book gives concrete tips on issues such as improving the quality of a study despite tight budget constraints, and demonstrates how the results of randomized impact evaluations can inform policy.

With its self-contained modules, this one-of-a-kind guide is easy to navigate. It also includes invaluable references and a checklist of the common pitfalls to avoid.

  • Provides the most up-to-date guide to running randomized evaluations of social programs, especially in developing countries
  • Offers practical tips on how to complete high-quality studies in even the most challenging environments
  • Self-contained modules allow for easy reference and flexible teaching and learning
  • Comprehensive yet nontechnical

Contents pages and more (via Amazon) & Brief chapter summaries

On the first chapter: “This chapter provides an example of how a randomized evaluation can lead to large-scale change and provides a road map for an evaluation and for the rest of the book.”

Book review: The impact evaluation primer you have been waiting for? Mark Goldstein, Development Impact blog. 27/11/2013

YouTube video: Book launch talk (1:21) “On 21 Nov, 2013, author of “Running Randomized Evaluations” and Executive Director of J-PAL, Rachel Glennerster, launched the new book at the World Bank. This was followed by a panel discussion with Alix Zwane, Executive Director of Evidence Action, Mary Ann Bates, Deputy Director of J-PAL North America and David Evans, Senior Economist, Office of the Chief Economist, Africa Region, World Bank, led by the Head of DIME, Arianna Legovini.”
