Walking the talk: the need for a trial registry for development interventions

By Ole Dahl Rasmussen (University of Southern Denmark and DanChurchAid), Nikolaj Malchow-Møller (University of Southern Denmark) and Thomas Barnebeck Andersen (University of Southern Denmark). April 2011. Available as pdf. Found courtesy of @ithorpe

Abstract: Recent advances in the use of randomized control trials to evaluate the effect of development interventions promise to enhance our knowledge of what works and why. A core argument supporting randomised studies is the claim that they have high internal validity. We argue that this claim is weak as long as a trial registry of development interventions is not in place. Without a trial registry, the possibilities for data mining, created by analyses of multiple outcomes and subgroups, undermine the internal validity. Drawing on experience from evidence-based medicine and recent examples from microfinance, we argue that a trial registry would also enhance external validity and foster innovative research.

RD Comment: Well worth reading. The proposal and its supporting argument are relevant not only to thinking about RCTs, but to all forms of impact evaluation. In fact, one could argue for similar registries not only where new interventions are being tested, but also where interventions are being replicated or scaled up (where there also needs to be some accountability for, and analysis of, the results). The problem being addressed, perhaps not made clearly enough in the abstract, is the pervasive bias towards publicising and publishing positive results, and the failure to acknowledge and use negative results. One quote is illustrative: “A recent review of evidence on microcredit found that all except one of the evaluations carried out by donor agencies and large NGOs showed positive and significant effects, suggesting that bias exists” (Kovsted et al., 2009).

Related to this issue of failure to identify and use negative results, see this blog posting on “Do we need a Minimum Level of Failure (MLF)?”
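The data-mining point in the abstract is essentially the multiple-comparisons problem: test enough outcomes and subgroups against a 5% threshold and some will clear it by chance alone, even when the intervention has no effect. Below is a minimal simulation sketch of that mechanism; the sample size, number of outcomes and threshold are illustrative assumptions, not figures from the paper.

```python
# Illustrative simulation (not from the paper): how often a trial with zero true
# effect produces at least one nominally "significant" result when many outcomes
# are tested. Sample sizes, outcome counts and threshold are assumptions.
import random
import statistics

def null_trial_significant_count(n_per_arm=200, n_outcomes=20, z_crit=1.96):
    """Simulate one trial with no real effect and count outcomes whose
    treatment-control difference exceeds roughly the 5% significance threshold."""
    count = 0
    for _ in range(n_outcomes):
        treat = [random.gauss(0, 1) for _ in range(n_per_arm)]
        control = [random.gauss(0, 1) for _ in range(n_per_arm)]
        diff = statistics.mean(treat) - statistics.mean(control)
        se = (statistics.variance(treat) / n_per_arm +
              statistics.variance(control) / n_per_arm) ** 0.5
        if abs(diff / se) > z_crit:
            count += 1
    return count

random.seed(1)
n_trials = 200
with_false_positive = sum(null_trial_significant_count() > 0 for _ in range(n_trials))
print(f"Share of null trials with at least one 'significant' outcome: "
      f"{with_false_positive / n_trials:.2f}")
# With 20 independent outcomes this lands near 1 - 0.95**20 ≈ 0.64, which is why
# pre-specifying primary outcomes in a registry matters.
```

Pre-registering the primary outcomes and subgroup analyses, which is what a trial registry would enforce, removes the option of reporting only whichever comparison happened to cross the threshold.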

Quantitative and Qualitative Methods in Impact Evaluation and Measuring Results

Governance and Social Development Resource Centre. Issues Paper by Sabine Garbarino and Jeremy Holland, March 2009

1 Introduction
There has been a renewed interest in impact evaluation in recent years amongst development agencies and donors. Additional attention was drawn to the issue recently by a Center for Global Development (CGD) report calling for more rigorous impact evaluations, where ‘rigorous’ was taken to mean studies which tackle the selection bias aspect of the attribution problem (CGD, 2006). This argument was not universally well received in the development community; among other reasons there was the mistaken belief that supporters of rigorous impact evaluations were pushing for an approach solely based on randomised control trials (RCTs). While ‘randomisers’ have appeared to gain the upper hand in a lot of the debates—particularly in the United States—the CGD report in fact recognises a range of approaches, and the entity set up as a result of its efforts, 3ie, is moving even more strongly towards mixed methods (White, n.d.). The Department for International Development (DFID) in its draft policy statements similarly stresses the opportunities arising from a synthesis of quantitative and qualitative approaches in impact evaluation. Other work underway on ‘measuring results’ and ‘using numbers’ recognises the need to find standard indicators which capture non-material impacts and which are sensitive to social difference. This work also stresses the importance of supplementing standard indicators with narrative that can capture those dimensions of poverty that are harder to measure. This paper contributes to the ongoing debate on ‘more and better’ impact evaluations by highlighting experience on combining qualitative and quantitative methods for impact evaluation to ensure that we:

1. measure the different impacts of donor interventions on different groups of people, and

2. measure the different dimensions of poverty, particularly those that are not readily quantified but which poor people themselves identify as important, such as dignity, respect, security and power.

A third framing question, on the use of the research process itself as a way of increasing the accountability and empowerment of the poor, was added during discussions with DFID staff.

This paper does not intend to provide a detailed account of different approaches to impact evaluation nor an overview of proposed solutions to specific impact evaluation challenges. Instead it defines and reviews the case for combining qualitative and quantitative approaches to impact evaluation. An important principle that emerges in this discussion is that of equity, or what McGee (2003, 135) calls ‘equality of difference’. By promoting various forms of mixing we are moving methodological discussion away from a norm in development research in which qualitative research plays ‘second fiddle’ to conventional empiricist investigation. This means, for example, that contextual studies should not be used simply to confirm or ‘window dress’ the findings of non-contextual surveys. Instead they should play a more rigorous role of observing and evaluating impacts, even replacing, when appropriate, large-scale and lengthy surveys that can ‘overgenerate’ information in an untimely fashion for policy audiences.

The remainder of the paper is structured as follows. Section 2 briefly sets the scene by summarising the policy context. Section 3 clarifies the terminology surrounding qualitative and quantitative approaches, including participatory research. Section 4 reviews options for combining and sequencing qualitative and quantitative methods and data and looks at recent methodological innovations in measuring and analysing qualitative impacts. Section 5 addresses the operational issues to consider when combining methods in impact evaluation. Section 6 briefly concludes.

“Intelligence is about creating and adjusting stories”

…says Gregory Treverton, in his Prospect article “What should we expect of our spies?”, June 2011

RD comment: How do you assess the performance of intelligence agencies, in the way they collect and make sense of the world around them? How do you explain their failure to anticipate some of the biggest developments of the last thirty years, such as the collapse of the Soviet Union and the contagion effects of the more recent Arab Spring, or their mistaken confidence that weapons of mass destruction (WMD) would be found in Iraq?

The American intelligence agencies described by Treverton struggle to make sense of vast masses of information, much of which is incomplete and ambiguous. Storylines that have some degree of fit with the surrounding political context emerge and become dominant. “Questions not asked or stories not imagined by policy are not likely to be developed by intelligence”. Referring to the end of the Soviet Union, Treverton identifies two possible counter-measures: “What we could have expected of intelligence was not better prediction but earlier and better monitoring of internal shortcomings. We could also have expected competing stories to challenge the prevailing one. Very late, in 1990, an NIE, ‘The deepening crisis in the USSR’, did just that, laying out four different scenarios, or stories, for the coming year.”

Discussing the WMD story, he remarks “the most significant part of the WMD story was what intelligence and policy shared: a deeply held mindset that Saddam must have WMD…In the end if most people believe one thing, arguing for another is hard. There is little pressure to rethink the issue and the few dissenters in intelligence are lost in the wilderness. What should have been expected from intelligence in this case was a section of the assessments asking what was the best case that could be made that Iraq did not have WMD.”

Both sets of suggestions seem to have some relevance to the production of evaluations. Should alternative interpretations be more visible? Should evaluation reports contain their own best counter-arguments (as a free-standing section, not simply as straw men to be dutifully propped up and then knocked down)?

There are also other echoes in Treverton’s paper of the practice and problems of monitoring and evaluating aid interventions, notably the pressing demand for immediate information at the expense of a long-term perspective: “We used to do analysis, now we do reporting”, says one American analyst. Some aid agency staff have reported similar problems: impact evaluations would be good, but in reality they are busy meeting the demand for information about more immediate aspects of performance.

Interesting conclusions as well: “At the NIC, I came to think that, for all the technology, strategic analysis was best done in person. I came to think that our real products weren’t those papers, the NIEs. Rather they were the NIOs, the National Intelligence Officers—the experts, not papers. We all think we can absorb information more efficiently by reading, but my advice to my policy colleagues was to give intelligence officers some face time… In 20 minutes, though, the intelligence officers can sharpen the question, and the policy official can calibrate the expertise of the analyst. In that conversation, intelligence analysts can offer advice; they don’t need to be as tightly restricted as they are on paper by the “thou shalt not traffic in policy” edict. Expectations can be calibrated on both sides of the conversation. And the result might even be better policy.”

Evidence of the effectiveness of evidence?

Heart + Mind? Or Just Heart? Experiments in Aid Effectiveness (And a Contest!), by Dean Karlan, 05/27/2011. Found courtesy of @poverty_action

RD comment: There is a killer assumption behind many of the efforts being made to measure aid effectiveness – that evidence of the effectiveness of specific aid interventions will make a difference; that is, it will be used to develop better policies and practices. But, as far as I know, much less effort is being invested in testing this assumption, to find out when and where evidence works this way, and when it does not. This is worrying, because anyone looking into how policies are actually made knows that it is often not a pretty picture.

That is why, contrary to my normal policy, I am publicising a blog posting. The posting, by Dean Karlan, describes an actual experiment that looks at the effect of providing evidence about the effectiveness of an aid intervention (a specific form of micro-finance assistance) on the willingness of individual donors to make donations to the aid agency delivering that intervention. This relatively simple experiment is now underway.

Equally interesting is the fact that the author has launched, albeit on a very modest scale, a prediction market on the likely results of this experiment. Visitors to the blog are asked to make their predictions of the results of the experiment. When the results are available, Dean will identify and reward the most successful “bidder” (with two free copies of his new book More Than Good Intentions). Apart from the fun element involved, the use of a prediction market will enable Dean to identify to what extent his experiment has generated new knowledge [i.e. the results differ a lot from the average prediction], versus confirmed existing common knowledge [i.e. the results match the average prediction]. That sort of thing does not happen very often.
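That “new versus common knowledge” comparison boils down to a small calculation: pool the predictions, take the average, and see how far the measured effect falls from it. The sketch below is only an illustration; the prediction values, observed effect and tolerance are invented, and nothing in it comes from Dean Karlan’s blog or the actual contest.

```python
# Toy sketch with invented numbers: compare an experiment's measured effect
# with the average of the crowd's predictions to judge whether the result
# mainly confirms prior expectations or departs from them.

def classify_result(predictions, observed_effect, tolerance=0.05):
    """Return the mean prediction, the gap to the observed effect, and a
    rough verdict based on an arbitrary tolerance."""
    mean_prediction = sum(predictions) / len(predictions)
    gap = abs(observed_effect - mean_prediction)
    verdict = ("confirms existing common knowledge" if gap <= tolerance
               else "suggests new knowledge")
    return mean_prediction, gap, verdict

# Hypothetical predicted lifts in the donation rate (as proportions)
predictions = [0.02, 0.05, 0.00, 0.10, 0.03]
observed_effect = 0.12  # hypothetical experimental result

mean_p, gap, verdict = classify_result(predictions, observed_effect)
print(f"mean prediction = {mean_p:.3f}, gap = {gap:.3f} -> {verdict}")
```

In a real prediction market the spread of predictions would matter as well as their average, but the basic contrast is the same.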

So, I encourage you to visit Dean’s blog and participate. You can do this by making your predictions using the Comment facility at the end of the blog post (where you can also read the predictions and comments already made by others).

Good Enough Guide to Impact Measurement – Rapid Onset Natural Disasters

[from the Emergency Capacity Building Project website]

Published on 6 April 2011

The Department for International Development (DFID/UKAID) awarded a grant to the ECB Project to develop a new Good Enough Guide to Impact Measurement. Led by Dr. Vivien Walden from Oxfam, a team of ECB specialists from CRS, Save the Children, and World Vision will work together with the University of East Anglia (UEA) in the UK.

The guide and its supporting capacity-building materials will include an impact measurement methodology for rapid onset natural disasters. The methodology will be field-tested by the editorial team in Pakistan and one other country from September 2011 onwards.

The team welcomes suggestions and input on developing methodologies for impact measurement. Contact us with your ideas at info@ecbproject.org

A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies

Eric Roetman, International Child Support, Email: eric.roetman@ics.nl

3ie Working Paper 11, March 2011. Found courtesy of @txtpablo

Abstract
“Development agencies are under great pressure to show results and evaluate the impact of projects and programmes. This paper highlights the practical and ethical dilemmas of conducting impact evaluations for NGOs (Non Governmental Organizations). Specifically the paper presents the case of the development organization, International Child Support (ICS). For almost a decade, all of ICS’ projects in West Kenya were evaluated through rigorous, statistically sound, impact evaluations. However, as a result of logistical and ethical dilemmas ICS decided to put less emphasis on these evaluations. This particular case shows that rigorous impact evaluations are more than an additional step in the project cycle; impact evaluations influence every step of the programme and project design. These programmatic changes, which are needed to make rigorous impact evaluations possible, may go against the strategy and principles of many development agencies. Therefore, impact evaluations not only require additional resources but also present organizations with a dilemma if they are willing to change their approach and programmes.”

[RD comment: I think this abstract is somewhat misleading. My reading of the story in this paper is that ICS’s management made some questionable decisions, not that there was something intrinsically questionable about rigorous impact evaluations per se. In the first half of the story, ICS management allowed researchers, and their methodological needs, to drive ICS programming decisions, rather than to serve and inform those decisions. In the second half, the evidence from some studies of the efficacy of particular forms of participatory development seems to have been overridden by the sheer strength of ICS’s beliefs in the primacy of participatory approaches. Of course, this would not be the first time that evidence has been sidelined when an organisation’s core values and beliefs are threatened.]

Theory-Based Stakeholder Evaluation

Morten Balle Hansen and Evert Vedung. American Journal of Evaluation 31(3): 295-313, 2010. Available as pdf

Abstract
“This article introduces a new approach to program theory evaluation called theory-based stakeholder evaluation or the TSE model for short. Most theory-based approaches are program theory driven and some are stakeholder oriented as well. Practically, all of the latter fuse the program perceptions of the various stakeholder groups into one unitary program theory. The TSE model keeps the program theories of the diverse stakeholder groups apart from each other and from the program theory embedded in the institutionalized intervention itself. This represents, the authors argue, an important clarification and extension of the standard theory-based evaluation. The TSE model is elaborated to enhance theory-based evaluation of interventions characterized by conflicts and competing program theories. The authors argue that especially in evaluations of complex and complicated multilevel and multisite interventions, the presence of competing theories is likely and the TSE model may prove useful.”

Theory of Change: A thinking and action approach to navigate in the complexity of social change processes

Iñigo Retolaza Eguren, HIVOS/DD/UNDP, May 2011. Available as pdf.

“This guide has been jointly published by Hivos and UNDP, and is aimed at the rich constellation of actors linked to processes of social development and change: bilateral donors, community leaders, political and social leaders, NGO representatives, community-based organizations, social movements, public decision makers, and other actors related to social change processes.

The Theory of Change approach, applied to social change processes, represents a thinking-action alternative to other, more rigid planning approaches and logics. Living in complex and conflict-ridden times, we need more flexible instruments that allow us to plan and monitor our actions in uncertain, emergent, and complex contexts, following a flexible rather than a rigid logic. This thinking-action approach is also applied to institutional coaching processes and to the design of social development and change programs.

In general terms, the Guide synthesizes the core methodological contents and steps developed in a Theory of Change design workshop. The first part of the Guide describes some theoretical elements to consider when designing a Theory of Change applied to social change processes. The second part describes the basic methodological steps involved in any Theory of Change design. To reinforce this practical part, a workshop route is included, illustrating the dynamics of a workshop of this kind.

The approach and contents of the guide emerge from the learning synthesis of the author, Iñigo Retolaza, as a facilitator of Theory of Change design processes involving social change actors from several Latin American countries. His two main bodies of experience and knowledge are: (i) the learning space offered by Hivos, where he facilitated several Theory of Change workshops with Hivos partner organisations in South and Central America, and (ii) his professional relationship with the Democratic Dialogue Regional Project of UNDP, taking a research-action approach to dialogic processes applied to various areas of the socio-political field: national dialogues on the making and adjustment of public policy and legislative proposals, facilitation of national and regional dialogue spaces on several issues, and capacity building on dialogue for social and political leaders from several countries in the region.”


Capturing Change in Women’s Realities: A Critical Overview of Current M&E Frameworks and Approaches

by Srilatha Batliwala and Alexandra Pittman. Association for Women’s Rights in Development (AWID), Dec 2010. Available as pdf. Found courtesy of @guijti

“The two part document begins with a broad overview of common challenges with monitoring and evaluation (M&E) and identifies feminist practices for engaging in M&E to strengthen organizational learning and more readily capture the complex changes that women’s empowerment and gender equality work seek. The document concludes with an overview and in-depth analysis of some of the most widely used and recognized M&E frameworks, approaches, and tools.”

[RD Comment: A bit of text that interested me… “Some women’s rights activists and their allies consequently propose that we need to develop a ‘theory of constraints’ to accompany our ‘theory of change’ in any given context, in order to create tools for tracking the way that power structures are responding to the challenges posed by women’s rights interventions”. And before then, also on page 12: “most tools do not allow for tracking negative change, reversals, backlash, unexpected change, and other processes that push back or shift the direction of a positive change trajectory. How do we create tools that can capture this ‘two steps forward, one step back’ phenomenon that many activists and organizations acknowledge as a reality and in which large amounts of learning lay hidden? In women’s rights work, this is vital because as soon as advances seriously challenge patriarchal or other social power structures, there are often significant reactions and setbacks. These are not, ironically, always indicative of failure or lack of effectiveness, but exactly the opposite—this is evidence that the process was working and was creating resistance from the status quo as a result.”

This useful proposal could apply to other contexts where change is expected to be difficult.]

GTZ/BMZ Evaluation and Systems Conference papers

(via Bob Williams on EvalSys)

Systemic Approaches in Evaluation

Documentation of the Conference on 25-26 January 2011

“Development programs promote complex reforms and change processes. Such processes are often characterized by uncertainty and unpredictability, posing a big challenge to the evaluation of development projects. In order to understand which projects work, why, and under which conditions, evaluations also need to embrace the interaction of various influencing factors and the multi-dimensionality of societal change. However, present evaluation approaches often presuppose predictability and linearity of event chains.

In order to fill this gap, systemic approaches to the evaluation of development programs are increasingly being discussed. A key concept is interdependency rather than linear cause-effect relations. Systemic approaches in evaluation focus on interrelations and the interaction between various stakeholders with different motivations, interests, perceptions and perspectives.

On 25 and 26 January 2011, the Evaluation and Audit Division of the Federal Ministry for Economic Cooperation and Development (BMZ) and the Evaluation Unit of GIZ offered a forum to discuss systemic approaches to evaluation at an international conference.
More than 200 participants from academia, consulting firms and NGOs discussed, among other things, the following questions:

  • What are systemic approaches in evaluation?
  • For which kind of evaluations are systemic approaches (not) useful? Can they be used to enhance accountability, for example?
  • Are rigorous impact studies and systemic evaluations antipodes or can we combine elements of both approaches?
  • Which concrete methods and tools can be used in systemic evaluation?

On this website you will find the documentation of all sessions, speeches and discussion rounds. The main conclusions of the conference were summarized in the final panel discussion.”

