Review of evaluation approaches and methods for interventions related to violence against women and girls (VAWG)

Posted on 9 July, 2014 – 3:57 PM

[From the R4D website] Available as pdf

Raab, M.; Stuppert, W. Review of evaluation approaches and methods for interventions related to violence against women and girls (VAWG). (2014) 123 pp.


The purpose of this review is to generate a robust understanding of the strengths, weaknesses and appropriateness of evaluation approaches and methods in the field of development and humanitarian interventions on violence against women and girls (VAWG). It was commissioned by the Evaluation Department of the UK Department for International Development (DFID), with the goal of engaging policy makers, programme staff, evaluators, evaluation commissioners and other evaluation users in reflecting on ways to improve evaluations of VAWG programming. Better evaluations are expected to contribute to more successful programme design and implementation.

The review examines evaluations of interventions to prevent or reduce violence against women and girls within the contexts of development and humanitarian aid.

Rick Davies comment: This paper is of interest for two reasons: (a) The review process was the subject of a blog that documented its progress, from beginning to end. A limited number of comments were posted on the blog by interested observers (including myself) and these were responded to by the reviewers; (b) The review used Qualitative Comparative Analysis (QCA) as its means of understanding the relationship between attributes of evaluations in this area and their results. QCA is an interesting but demanding method even when applied on a modest scale.

I will post more comments here after taking the opportunity to read the review with some care.

The authors have also invited comments from anyone else who is interested, via the email address given in their report.

Postscript 2014 10 08: At the EES 2014 at Dublin I made a presentation on the Triangulation of QCA findings, which included some of their data and analysis. You can see the presentation here on Youtube (it has attached audio). Michaela and Wolfgang have subsequently commented on that presentation and in turn I have responded to their comments.



Incorporating people’s values in development: weighting alternatives

Posted on 7 July, 2014 – 10:44 PM

Laura Rodriguez Takeuchi, ODI Project Note 04, June 2014. Available as pdf

“Key messages:

  • In the measurement of multidimensional well-being, weights aim to capture the relative importance of each component to a person’s overall well-being. The choice of weights needs to be explicit and could be used to incorporate people’s perspectives into a final metric.
  • Stated preferences approaches aim to obtain weights from individuals’ responses to hypothetical scenarios. We outline six of these approaches. Understanding their design and limitations is vital to make sense of potentially dissimilar results.
  • It is important to select and test an appropriate method for specific contexts, considering the challenges of relying on people’s answers. Two methodologies, DCE and PTO, are put forward for testing in a pilot project.”

See also: Laura Rodriguez Takeuchi blog posting on Better Evaluation: Week 26: Weighing people’s values in evaluation

Rick Davies comment: Although this was a very interesting and useful paper overall, I was fascinated by this part of Laura’s paper:

“Reflecting on the psychology literature, Kahneman and Krueger (2006) argue that it is difficult to deduce preferences from people’s actual choices because of limited rationality:

“[People] make inconsistent choices, fail to learn from experience, exhibit reluctance to trade, base their own satisfaction on how their situation compares with the satisfaction of others and depart from the standard model of the rational economic agent in other ways.” (Kahneman and Krueger 2006: 3)

Rather than using these ‘real’ choices, stated preferences approaches rely on surveys to obtain weights from individuals’ responses to hypothetical scenarios.”

This seems totally bizarre. What would happen if we insisted on all respondents’ survey responses being rational, and applied various other remedial measures to make them so? Would we end up with a perfectly rational set of responses that have no actual fit with how people behave in the world? How useful would that be? Perhaps this is what happens when you spend too much time in the company of economists? :-))

On another matter…. Table 1 usefully lists eight different weighting methods, which are explained in the text. However this list does not include one of the simplest methods that exists, and which is touched upon tangentially in the reference to the South African study on social perceptions of material needs (Wright, 2008). This is the use of weighted checklists, where respondents choose both items on a checklist and the weights to be given to each item, in a series of binary (yes/no) choices. This method was used in a series of household poverty surveys in Vietnam in 1997 and 2006 using an instrument called a Basic Necessities Survey. The wider potential uses of this participatory and democratic method are discussed in a related blog on weighted checklists.

Postscript: Julie Newton has pointed out this useful related website:
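To make the weighted checklist idea concrete, here is a minimal Python sketch. It assumes a simplified version of the method: each item's weight is the share of respondents who answer "yes, this is a basic necessity", and a household's score is the weighted share of items it actually has. The item names, the votes and the function names are all hypothetical, and the real Basic Necessities Survey may differ in its details.

```python
from typing import Dict, List


def item_weights(necessity_votes: Dict[str, List[bool]]) -> Dict[str, float]:
    """Weight of an item = share of respondents voting 'yes, it is a necessity'."""
    return {item: sum(votes) / len(votes) for item, votes in necessity_votes.items()}


def household_score(possessions: Dict[str, bool], weights: Dict[str, float]) -> float:
    """Weighted share of checklist items the household has (0 = none, 1 = all)."""
    total = sum(weights.values())
    held = sum(w for item, w in weights.items() if possessions.get(item, False))
    return held / total


# Hypothetical yes/no answers from four respondents (illustration only).
votes = {
    "three meals a day": [True, True, True, True],
    "bicycle": [True, True, True, False],
    "television": [True, False, False, False],
}
weights = item_weights(votes)

# A hypothetical household that has the meals and the television, but no bicycle.
score = household_score(
    {"three meals a day": True, "bicycle": False, "television": True}, weights
)
print(weights)
print(score)
```

Both the list of items and their weights come from the respondents' own binary answers, which is what makes the approach participatory.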

  • Measuring National Well-being.  “ONS [UK Office for National Statistics] is developing new measures of national well-being. The aim is to provide a fuller picture of how society is doing by supplementing existing economic, social and environmental measures. Developing better measures of well-being is a long term programme. ONS  are committed to sharing ideas and proposals widely to ensure that the measures are relevant and founded on what matters to people.” Their home page lists a number of new publications on this subject

Composite measures of local disaster impact – Lessons from Typhoon Yolanda, Philippines

Posted on 2 June, 2014 – 3:03 PM

by Aldo Benini, Patrice Chataigner, 2014. Available as pdf

Purpose: “When disaster strikes, determining affected areas and populations with the greatest unmet needs is a key objective of rapid assessments. This note is concerned with the logic and scope for improvement in a particular tool, the so-called “prioritization matrix”, that has increasingly been employed in such assessments. We compare, and expand on, some variants that sprang up in the same environment of a large natural disaster. The fixed context lets us attribute differences to the creativity of the users grappling with the intrinsic nature of the tool, rather than to fleeting local circumstances. Our recommendations may thus be translated more easily to future assessments elsewhere.

The typhoon that struck the central Philippines in November 2013 – known as “Typhoon Yolanda” and also as “Typhoon Haiyan” – triggered a significant national and international relief response. Its information managers imported the practice, tried and tested in other disasters, of ranking affected communities by the degree of impact and need. Several lists, known as prioritization matrices, of ranked municipalities were produced in the first weeks of the response. Four of them, by different individuals and organizations, were shared with us. The largest in coverage ranked 497 municipalities.

The matrices are based on indicators, which they aggregate into an index that determines the ranks. Thus they come under the rubric of composite measures. They are managed in spreadsheets. We review the four for their particular emphases, the mechanics of combining indicators, and the statistical distributions of the final impact scores. Two major questions concern the use of rankings (as opposed to other transformations) and the condensation of all indicators in one combined index. We propose alternative formulations, in part borrowing from recent advances in social indicator research. We make recommendations on how to improve the process in future rapid assessments.”
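The question about rankings versus other transformations can be illustrated with a small Python sketch. It is not taken from the paper: the three municipalities, the two indicators and the choice of min-max scaling as the alternative are all assumptions made for illustration.

```python
def ranks(values):
    """Rank values from 1 (lowest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def min_max(values):
    """Rescale values to the 0-1 range (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def combine(columns):
    """Unweighted mean of the transformed indicators, per municipality."""
    return [sum(row) / len(row) for row in zip(*columns)]


# Hypothetical indicators for municipalities A, B and C (illustration only).
houses_damaged = [0.90, 0.20, 0.10]       # share of houses damaged
population_affected = [0.80, 0.30, 0.25]  # share of population affected

rank_index = combine([ranks(houses_damaged), ranks(population_affected)])
scaled_index = combine([min_max(houses_damaged), min_max(population_affected)])
print(rank_index)
print(scaled_index)
```

In this made-up case the rank-based index spaces A, B and C evenly, while the scaled index shows B sitting much closer to C than to A. Ranking keeps the order of the municipalities but throws away the size of the gaps between them.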

Rick Davies comment: Well worth reading!


How Shortcuts Cut Us Short: Cognitive Traps in Philanthropic Decision Making

Posted on 30 May, 2014 – 11:48 AM

Beer, Tanya, and Julia Coffman. 2014. “How Shortcuts Cut Us Short: Cognitive Traps in Philanthropic Decision Making”. Centre for Evaluation Innovation. Available as pdf

Found courtesy of the “people-centered development” blog (Michaela Raab)

Introduction: “Anyone who tracks the popular business literature has come across at least one article or book, if not a half dozen, that applies the insights of cognitive science and behavioral economics to individual and organizational decision making. These authors apply social science research to the question of why so many strategic decisions yield disappointing results, despite extensive research and planning and the availability of data about how strategies are (or are not) performing. The diagnosis is that many of our decisions rely on mental shortcuts or “cognitive traps,” which can lead us to make uninformed or even bad decisions. Shortcuts provide time-pressured staff with simple ways of making decisions and managing complex strategies that play out in an uncertain world. These shortcuts affect how we access information, what information we pay attention to, what we learn, and whether and how we apply what we learn. Like all organizations, foundations and the people who work in them are subject to these same traps. Many foundations are attempting to make better decisions by investing in evaluation and other data collection efforts that support their strategic learning. The desire is to generate more timely and actionable data, and some foundations have even created staff positions dedicated entirely to supporting learning and the ongoing application of data for purposes of continuous improvement. While this is a useful and positive trend, decades of research have shown that despite the best of intentions, and even when actionable data is presented at the right time, people do not automatically make good and rational decisions. Instead, we are hard-wired to fall into cognitive traps that affect how we process (or ignore) information that could help us to make better judgments.”

Rick Davies comment: Recommended, along with the videosong by Mr Wray on cognitive bias, also available via Michaela’s blog


Making impact evaluation matter: Better evidence for effective policies and programmes

Posted on 27 May, 2014 – 9:16 PM

Asian Development Bank, Manila, 1-5 September 2014

The Asian Development Bank (ADB) and the International Initiative for Impact Evaluation (3ie) are hosting a major international impact evaluation conference Making Impact Evaluation Matter from 1-5 September 2014 in Manila. The call for proposals to present papers and conduct workshops at the conference is now open.

Making Impact Evaluation Matter will comprise pre-conference workshops for 2.5 days from 1-3 September 2014, and 2.5 days of the conference from 3-5 September. Major international figures in the field of impact evaluation are being invited to speak at the plenary sessions of the conference. There will be five to six streams of pre-conference workshops and up to eight streams of parallel sessions during the conference, allowing for over 150 presentations.

Proposals are now being invited for presentations on any aspect of impact evaluations and systematic reviews, including findings, methods and translation of evidence into policy. Researchers are welcome to submit proposals on the design (particularly innovative designs for difficult to evaluate interventions), implementation, findings and use of impact evaluations and systematic reviews. Policymakers and development programme managers are welcome to submit proposals on the use of impact evaluation and systematic review findings.

Parallel sessions at the conference will be organised around the following themes/sectors: (a) infrastructure (transport, energy, information and communication technology, urban development, and water), (b) climate change/ environment/ natural resources, (c) social development (health, education, gender equity, poverty and any other aspect of social development), (d) rural development (agriculture, food security and any other aspect of rural development), (e) financial inclusion, (f) institutionalisation of impact evaluation, and incorporating impact evaluation or systematic reviews into institutional appraisal and results frameworks, (g) impact evaluation of institutional and policy reform (including public management and governance), (h) impact evaluation methods, and (i) promotion of the use of evidence.

Workshop proposals are being invited on all aspects of designing, conducting and disseminating findings from impact evaluations and systematic reviews. The workshops can be at an introductory, intermediate or advanced level.  The duration of a workshop can vary from half a day to two full days.

All proposals must be submitted via email to : with email subject line ‘Proposal: presentation’ or ‘Proposal: workshop’. The proposal submission deadline is 3 July 2014.

Bursaries are available for participants from low- and middle-income countries. Employees of international organisations are however not eligible for bursaries (except the Asian Development Bank). A bursary will cover return economy airfare and hotel accommodation. All other expenses (ground transport, visa, meals outside the event) must be paid by the participant or their employer. Bursary applications must be made through the conference website: The deadline for bursary applications is 15 July 2014.

Non-sponsored participants are required to pay a fee of US$250 for participating in the conference or US$450 for participating in the pre-conference workshops as well as the conference. Those accepted to present a workshop will be exempted from the fee.

For more information on the submission of proposals for the conference, read the Call for Proposals.

For the latest updates on Making Impact Evaluation Matter, visit

Queries may be sent to


International Energy Policies & Programmes Evaluation Conference (IEPPEC), 9-11 September 2014

Posted on 27 May, 2014 – 9:09 PM

– the leading event for energy policy and programme evaluators

Sharing and Accelerating the Value and Use of Monitoring, Reporting and Verification Practices.

There is a wide range of regional, national and international policies and programmes designed to achieve improved energy efficiency, and therefore reductions in GHG emissions and reductions in living costs. These are top priorities for bodies such as the EU, IEA and UN in addressing the critical issues of climate change, resource conservation and living standards.

The increasing focus on this policy area has resulted in more challenging objectives and intended outcomes for interventions, along with growing investment. But are we investing correctly?

Pioneering approaches to evaluating investments and policy decisions related to energy efficiency will be at the forefront of presentations and debate at the IEPPEC, held in Berlin between the 9th and 11th of September 2014.

The conference presents an unparalleled opportunity to bring together policy and evaluation practitioners, academics and others from around the world involved in evaluation of energy and low carbon policies and programs. Attendees will be able to debate the most effective means of assuring that both commercial and community-based approaches to improving the sustainability of our energy use and making our economies more efficient are based on common metrics that can be compared across regions and regulatory jurisdictions. The focus over the three-day conference is for policy makers, program managers and evaluators to share ideas for improving the assessment of potential and actual impacts of low carbon policies and programmes, and to facilitate a deeper understanding of evaluation methods that work in practice.

The conference features:

  • Presentation of over 85 full and peer-reviewed evaluation papers by their authors
  • Four panel discussions
  • Two keynote sessions
  • A two-day poster exhibit
  • Lots of opportunity to share learning and network with other attendees

The conference is filling up fast, so to avoid disappointment, please book your place now by visiting

Additional information:

  • For the draft conference agenda, please click here
  • Refreshments, breakfasts and lunches are provided.
  • For any further information, please visit


Running Randomized Evaluations: A Practical Guide

Posted on 22 May, 2014 – 10:01 AM
Glennerster, Rachel, and Kudzai Takavarasha. Running Randomized Evaluations: A Practical Guide. Princeton: Princeton University Press, 2013.


This book provides a comprehensive yet accessible guide to running randomized impact evaluations of social programs. Drawing on the experience of researchers at the Abdul Latif Jameel Poverty Action Lab, which has run hundreds of such evaluations in dozens of countries throughout the world, it offers practical insights on how to use this powerful technique, especially in resource-poor environments.

This step-by-step guide explains why and when randomized evaluations are useful, in what situations they should be used, and how to prioritize different evaluation opportunities. It shows how to design and analyze studies that answer important questions while respecting the constraints of those working on and benefiting from the program being evaluated. The book gives concrete tips on issues such as improving the quality of a study despite tight budget constraints, and demonstrates how the results of randomized impact evaluations can inform policy.

With its self-contained modules, this one-of-a-kind guide is easy to navigate. It also includes invaluable references and a checklist of the common pitfalls to avoid.

  • Provides the most up-to-date guide to running randomized evaluations of social programs, especially in developing countries
  • Offers practical tips on how to complete high-quality studies in even the most challenging environments
  • Self-contained modules allow for easy reference and flexible teaching and learning
  • Comprehensive yet nontechnical

Contents pages and more (via Amazon) & Brief chapter summaries

The first chapter: “This chapter provides an example of how a randomized evaluation can lead to large-scale change and provides a road map for an evaluation and for the rest of the book”

Book review: The impact evaluation primer you have been waiting for? Markus Goldstein, Development Impact blog. 27/11/2013

YouTube video: Book launch talk (1:21) “On 21 Nov, 2013, author of “Running Randomized Evaluations” and Executive Director of J-PAL, Rachel Glennerster, launched the new book at the World Bank. This was followed by a panel discussion with Alix Zwane, Executive Director of Evidence Action, Mary Ann Bates, Deputy Director of J-PAL North America and David Evans, Senior Economist, Office of the Chief Economist, Africa Region, World Bank, led by the Head of DIME, Arianna Legovini.”


Working with messy data sets? Two useful and free tools

Posted on 25 April, 2014 – 5:52 PM

I have just come across two useful apps (aka software packages (aka tools)) for when you are working with someone else’s data sets and/or data sets from multiple sources and times. Or, just your own data that was in a less than perfect state when you last left it :-)

  • OpenRefine: Initially developed by Google and now open source with its own support and development community. You can explore the characteristics of a data set, clean it in quick and comprehensive moves, transform its layout and formats, as well as reconcile and match multiple data sets. There are documentation and videos to show you how to do all this. There is also a book, which you can purchase. The Wikipedia entry provides a good overview.
  • Tabula: This package allows you to extract tables of data from pdfs, a task which otherwise can be very tiresome, messy and error prone.
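To give a flavour of the kind of cleaning OpenRefine makes quick, here is a minimal Python sketch of clustering by "fingerprint" keys, where values that reduce to the same normalised key are grouped as probable duplicates. It is loosely modelled on the key collision idea in OpenRefine's clustering feature, but it is a simplified re-implementation of my own, not OpenRefine's code, and the place names are made up for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalised key: trim, lowercase, strip accents and punctuation,
    then sort and de-duplicate the whitespace-separated tokens."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accent marks
    text = re.sub(r"[^\w\s]", "", text)                    # drop punctuation
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with 2+ members."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical messy entries from a merged data set (illustration only).
names = [
    "Ha Tinh Province",
    "province, Ha Tinh",
    "HA TINH  PROVINCE.",
    "Can Loc District",
]
clusters = cluster(names)
print(clusters)
```

The three spellings of the same province end up in one cluster, ready to be merged into a single value, while the unrelated district is left alone.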

And some other packages I have yet to explore


“Quality evidence for policymaking. I’ll believe it when I see the replication”

Posted on 14 April, 2014 – 12:17 PM

3ie Replication Paper 1, by Annette N Brown, Drew B Cameron, Benjamin DK Wood, March 2014. Available as pdf

“1. Introduction:  Every so often, a well-publicised replication study comes along that, for a brief period, catalyses serious discussion about the importance of replication for social science research, particularly in economics. The most recent example is the Herndon, Ash, and Pollin replication study (2013) showing that the famous and highly influential work of Reinhart and Rogoff (2010) on the relationship between debt and growth is flawed.

McCullough and McKitrick (2009) document numerous other examples from the past few decades of replication studies that expose serious weaknesses in policy influential research across several fields. The disturbing inability of Dewald et al. (1986) to replicate many of the articles in their Journal of Money, Credit and Banking experiment is probably the most well-known example of the need for more replication research in economics. Yet, replication studies are rarely published and remain the domain of graduate student exercises and the occasional controversy.

This paper takes up the case for replication research, specifically internal replication, or the reanalysis of original data to address the original evaluation question. This focus helps to demonstrate that replication is a crucial element in the production of evidence for evidence-based policymaking, especially in low-and middle-income countries.

Following an overview of the main challenges facing this type of research, the paper then presents a typology of replication approaches for addressing the challenges. The approaches include pure replication, measurement and estimation analysis (MEA), and theory of change analysis (TCA). Although the challenges presented are not new, the discussion here is meant to highlight that the call for replication is not about catching bad or irresponsible researchers. It is about addressing very real challenges in the research and publication processes and thus about producing better evidence to inform development policymaking.”

Other quotes:

“When single evaluations are influential, and any contradictory evaluations of similar interventions can be easily discounted for contextual reasons, the minimum requirement for validating policy recommendations should be recalculating and re-estimating the measurements and findings using the original raw data to confirm the published results, or a pure replication.”

“On the bright side, there is some evidence of a correlation between public data availability and increased citation counts in the social sciences. Gleditsch (2003) finds that articles published in the Journal of Conflict Resolution that offer data in any form receive twice as many citations as comparable papers without available data (Gleditsch et al. 2003; Evanschitzky et al. 2007). ”

“Replication should be seen as part of the process for translating research findings into evidence for policy and not as a way to catch or call out researchers who, in all likelihood, have the best of intentions when conducting and submitting their research, but face understandable challenges. These challenges include the inevitability of human error, the uncontrolled nature of social science, reporting and publication bias, and the pressure to derive policy recommendations from empirical findings”

“Even in the medical sciences, the analysis of heterogeneity of outcomes, or post-trial subgroup analysis, is not accorded ‘any special epistemic status’ by the United States Food and Drug Administration rules (Deaton 2010 p.440). In the social sciences, testing for and understanding heterogeneous outcomes is crucial to policymaking. An average treatment effect demonstrated by an RCT could result from a few strongly positive outcomes and many negative outcomes, rather than from many positive outcomes, a distinction that would be important for programme design. Most RCT-based studies in development do report heterogeneous outcomes. Indeed, researchers are often required to do so by funders who want studies to have policy recommendations. As such, RCTs as practised – estimating treatment effects for groups not subject to random assignment – face the same challenges as other empirical social science studies.”

“King (2006) encourages graduate students to conduct replication studies but, in his desire to help students publish, he suggests they may leave out replication findings that support the original article and instead look for findings that contribute by changing people’s minds about something. About sensitivity analysis, King (2006 p.121) advises, ‘If it turns out that all those other changes don’t change any substantive conclusions, then leave them out or report them’” Aaarrrggghhh!
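The point about averages hiding very different patterns of outcomes can be shown with a few made-up numbers. The Python sketch below uses hypothetical individual-level effects for two programmes of ten participants each; a real trial never observes individual effects directly, so this illustrates the arithmetic only and is not an analysis method.

```python
from statistics import mean

# Hypothetical individual treatment effects (illustration only).
effects_a = [0.5] * 10              # programme A: everyone gains a little
effects_b = [6.5] * 2 + [-1.0] * 8  # programme B: two large gains, eight losses

average_a = mean(effects_a)
average_b = mean(effects_b)
share_harmed_a = sum(e < 0 for e in effects_a) / len(effects_a)
share_harmed_b = sum(e < 0 for e in effects_b) / len(effects_b)

print(average_a, share_harmed_a)
print(average_b, share_harmed_b)
```

Both programmes have the same average effect, yet under these assumed numbers programme B leaves eight in ten participants worse off. Only some form of subgroup or distributional analysis would reveal the difference.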

Rick Davies Comment: This paper is well worth reading!



Posted on 12 April, 2014 – 3:50 PM

BY DEVRA C. MOEHLER, BBC Media Action RESEARCH REPORT // ISSUE 03 // MARCH 2014 // GOVERNANCE. Available as pdf

Foreword by BBC Media Action

“This report summarises how experimental design has been used to assess the effectiveness of governance interventions and to understand the effects of the media on political opinion and behaviour. It provides an analysis of the benefits and drawbacks of experimental approaches and also highlights how field experiments can challenge the assumptions made by media support organisations about the role of the media in different countries.

The report highlights that – despite interest in the use of RCTs to assess governance outcomes – only a small number of field experiments have been conducted in the area of media, governance and democracy.

The results of these experiments are not widely known among donors or implementers. This report aims to address that gap. It shows that media initiatives have led to governance outcomes including improved accountability. However, they have also at times had unexpected adverse effects.

The studies conducted to date have been confined to a small number of countries and the research questions posed were linked to specific intervention and governance outcomes. As a result, there is a limit to what policymakers and practitioners can infer. While this report highlights an opportunity for more experimental research, it also identifies that the complexity of media development can hinder the efficacy of experimental evaluation. It cautions that low-level interventions (eg those aimed at individuals as opposed to working at a national or organisational level) best lend themselves to experimentation. This could create incentives for researchers to undertake experimental research that answers questions focused on individual change rather than wider organisational and systemic change. For example, it would be relatively easy to assess whether a training course does or does not work. Researchers can randomise the journalists that were trained and assess the uptake and implementation of skills. However, it would be much harder to assess how capacity-building efforts affect a media house, its editorial values, content, audiences and media/state relations.

Designing such experiments will be challenging. The intention of this report is to start a conversation both within our own organisation and externally. As researchers we should be prepared to discover that experimentation may not be feasible or relevant for evaluation. In order to strengthen the evidence base, practitioners, researchers and donors need to agree which research questions can and should be answered using experimental research, and, in the absence of experimental research, to agree what constitutes good evidence.

BBC Media Action welcomes feedback on this report and all publications published under our Bridging Theory and Practice Research Dissemination Series.”

Introduction
Chapter 1: Background on DG field experiments
Chapter 2: Background on media development assistance and evaluation
Chapter 3: Current experiments and quasi-experimental studies on media in developing countries
Field experiments
Quasi experiments
Chapter 4: Challenges of conducting field experiments on media development
Level of intervention
Complexity of intervention
Research planning under ambiguity
Chapter 5: Challenges to learning from field experiments on media development
Chapter 6: Solutions and opportunities
Research in media scarce environments
Test assumptions about media effects
To investigate influences on media
References
