Monitoring and Evaluation NEWS

Rapid Review of Embedding Evaluation in UK Department for International Development

February 2014 Executive Summary ….Final Report

“Purpose of the rapid review: Since 2009/10, there has been a drive within the Department for International Development (DFID) to strengthen the evidence base upon which policy and programme decisions are made. Evaluation plays a central role in this and DFID has introduced a step change to embed evaluation more firmly within its programmes. The primary purpose of this rapid review is to inform DFID and the international development evaluation community of the progress made and the challenges and opportunities encountered in embedding evaluation across the organisation.”

Selected quotes:…

“There has been a strong drive to recruit, accredit and train staff in evaluation in DFID since 2011. There have been 25 advisers working in a solely or shared evaluation role, a further 12 advisers in roles with an evaluation component, 150 staff accredited in evaluation and 700 people receiving basic training. …While the scaling up of capacity has been rapid, the depth of this capacity is less than required. The number of embedded advisory posts created is significantly fewer than envisaged at the outset, with eight of 25 advisers working 50% or less on evaluation. ”

“The embedding evaluation approach has contributed to a significant, but uneven, increase in the quantity of evaluations commissioned by DFID. These have increased from around 12 per year, prior to 2011, to an estimated 40 completed evaluations in 2013/14”

“The focus of evaluation has changed to become almost exclusively programme oriented. There are very few thematic or country level evaluations planned whereas previously these types of evaluations accounted for the majority of DFID’s evaluation portfolio. This presents a challenge to DFID as it seeks to synthesise the learning from individual projects and programmes into broader lessons for policy and programme planning and design.”

“The embedding evaluation approach has been accompanied by a significant increase in the number of evaluations which has, in turn, led to an increase in the total amount spent on evaluation. However, the average total cost per evaluation has changed little since 2010. ”

“Externally procured evaluation costs appear to be in line with those of other donors. However, forecasts of future spending on evaluation indicate a likely increase in the median amount that DFID pays directly for evaluations. For non-impact evaluations the median budget is £200,000 and for IEs the median budget is £500,000. This represents a significant under-estimation of evaluation costs.”

“Evaluation accounts for a median of 1.9% of programme value, which is in line with expectations. The amount DFID spends on IEs is higher at 2.6% of programme value but this is consistent with the figures of other donors such as the Millennium Challenge Corporation and the World Bank”.

“There has been considerable enthusiasm shown by programme managers for conducting IEs, which now comprise 28% of planned evaluations.”

Do you have a Data Management Plan?

Sam Held discusses Data Management Plans in his 14 February 2014 AEA blog posting on Federal (US) Data Sharing Policies

“A recent trend in the STEM fields is the call to share or access research data, especially data collected with federal funding. The result is requirements from the federal agencies for data management plans in grants, but the different agencies have different requirements. NSF requires a plan for every grant, but NIH only requires plans for grants over $500,000.

The common theme in all policies is “data should be widely and freely available as possible while safeguarding the privacy of participants, and protecting confidential and proprietary data” (NIH’s Statement on Sharing Data 2/26/2003). The call for a data sharing plan forces the PIs, evaluators, and those involved with the proposals to consider what data will be collected, how will it be stored and preserved, and what will be the procedures for sharing or distributing the data within privacy or legal requirements (i.e., HIPAA or IRB requirements). To me – the most important feature here is data formatting. What format will the data be in now and still be accessible or usable in the future or to those who cannot afford expensive software?”

He then points to DMPTool – a University of California online system for developing Data Management Plans. This site includes more than 20 different templates for the plans, provided by different funding bodies.

DMPTool – a website from the University of California system for developing Data Management Plans. The best component of this site is their collection of funder requirements, including those for NIH, NSF, NEH, and some private foundations. This site includes templates for the plans. – See more at: http://aea365.org/blog/stem-tig-week-sam-held-on-federal-data-sharing-policies/#sthash.U5QbE7zj.dpuf

Reflections on research processes in a development NGO: FIVDB’s survey in 2013 of the change in household conditions and of the effect of livelihood trainings

Received from Aldo Benini:

“Development NGOs are under increasing pressure to demonstrate impact. The methodological rigor of impact studies can challenge those with small research staffs and/or insufficient capacity to engage with outside researchers. “Reflections on research processes in a development NGO: Friends In Village Development Bangladesh’s (FIVDB) survey in 2013 of the change in household conditions and of the effect of livelihood trainings” (2013, with several others) grapples with some related dilemmas. On one side, it is a detailed and careful account of how a qualitative methodology known as “Community-based Change Ranking” and data from previous baseline surveys were combined to derive an estimate of the livelihood training effect distinct from highly diverse changes in household conditions. In the process, over 9,000 specific verbal change statements were condensed into a succinct household typology. On the other side, the report discusses challenges that regularly arise from the study design to the dissemination of findings. The choice of an intuitive impact metric (as opposed to one that may seem the best in the eyes of the analyst) and the communication of uncertainty in the findings are particularly critical.”

Produced by Aldo Benini, Wasima Samad Chowdhury, Arif Azad Khan, Rakshit Bhattacharjee, Friends In Village Development Bangladesh (FIVDB), 12 November 2013

PS: See also...

“Personal skills and social action” (2013, together with several others) is a sociological history of the 35-year effort, by Friends In Village Development Bangladesh (FIVDB), to create and amplify adult literacy training when major donors and leading NGOs had opted out of this sector. It is written in Amartya Sen’s perspective that

“Illiteracy and innumeracy are forms of insecurity in themselves. Not to be able to read or write or count or communicate is itself a terrible deprivation. And if a person is thus reduced by illiteracy and innumeracy, we can not only see that the person is insecure to whom something terrible could happen, but more immediately, that to him or her, something terrible has actually happened”.

The study leads the reader from theories of literacy and human development through adult literacy in Bangladesh and the expert role of FIVDB to the learners’ experience and a concept of communicative competency that opens doors of opportunity. Apart from organizational history, the empirical research relied on biographic interviews with former learners and trainers, proportional piling to self-evaluate relevance and ability, analysis of test scores as well as village development budget simulations conducted with 33 Community Learning Center committees. A beautifully illustrated printed version is available from FIVDB.

Meta-evaluation of USAID’s Evaluations: 2009-2012

(from USAID Learning Lab website)

Author(s):Molly Hageboeck, Micah Frumkin, and Stephanie Monschein

Organization(s):Management Systems International

Institution(s):United States Agency for International Development (USAID)

Date Published:November 25, 2013

Report available as a pdf (a big file). See also video and PP presentations (worth reading!)

Context and Purpose

This evaluation of evaluations, or meta-evaluation, was undertaken to assess the quality of USAID’s evaluation reports. The study builds on USAID’s practice of periodically examining evaluation quality to identify opportunities for improvement. It covers USAID evaluations completed between January 2009 and December 2012. During this four-year period, USAID launched an ambitious effort called USAID Forward, which aims to integrate all aspects of the Agency’s programming approach, including program and project evaluations, into a modern, evidence-based system for realizing development results. A key element of this initiative is USAID’s Evaluation Policy, released in January 2011.

Meta-Evaluation Questions

The meta-evaluation on which this volume reports systematically examined 340 randomly selected evaluations and gathered qualitative data from USAID staff and evaluators to address three questions:

1. To what degree have quality aspects of USAID’s evaluation reports, and underlying practices, changed over time?

2. At this point in time, on which evaluation quality aspects or factors do USAID’s evaluation reports excel and where are they falling short?

3. What can be determined about the overall quality of USAID evaluation reports and where do the greatest opportunities for improvement lie?

Meta-Evaluation Methodology and Study Limitations

The framework for this study recognizes that undertaking an evaluation involves a partnership between the client for an evaluation (USAID) and the evaluation team. Each party plays an important role in ensuring overall quality. Information on basic characteristics and quality aspects of 340 randomly selected USAID evaluation reports was a primary source for this study. Quality aspects of these evaluations were assessed using a 37-element checklist. Conclusions reached by the meta-evaluation also drew from results of four small-group interviews with staff from USAID’s technical and regional bureaus in Washington, 15 organizations that carry out evaluations for USAID, and a survey of 25 team leaders of recent USAID evaluations. MSI used chi-square and t–tests to analyze rating data. Qualitative data were analyzed using content analyses. No specific study limitation unduly hampered MSI’s ability to obtain or analyze data needed to address the three meta-evaluation questions. Nonetheless, the study would have benefited from reliable data on the cost and duration of evaluations, survey or conference call interviews with USAID Mission staff, and the consistent inclusion of the names of evaluation team leaders in evaluation reports.”

Rick Davies comment: Where is the dataset? 340 evaluations were scored on a 37 point checklist. Ten of the 37 checklist items used to creat an overall “score” This data could be analysed in N different ways by many more people, it it was made readily available. Responses please, from anyone..

LineUp: Visual Analysis of Multi-Attribute Rankings

Gratzl, S., A. Lex, N. Gehlenborg, H. Pfister, and M. Streit. 2013. “LineUp: Visual Analysis of Multi-Attribute Rankings.” IEEE Transactions on Visualization and Computer Graphics 19 (12): 2277–86. doi:10.1109/TVCG.2013.173.

“Abstract—Rankings are a popular and universal approach to structuring otherwise unorganized collections of items by computing a rank for each item based on the value of one or more of its attributes. This allows us, for example, to prioritize tasks or to evaluate the performance of products relative to each other. While the visualization of a ranking itself is straightforward, its interpretation is not, because the rank of an item represents only a summary of a potentially complicated relationship between its attributes and those of the other items. It is also common that alternative rankings exist which need to be compared and analyzed to gain insight into how multiple heterogeneous attributes affect the rankings. Advanced visual exploration tools are needed to make this process ef?cient. In this paper we present a comprehensive analysis of requirements for the visualization of multi-attribute rankings. Based on these considerations, we propose LineUp – a novel and scalable visualization technique that uses bar charts. This interactive technique supports the ranking of items based on multiple heterogeneous attributes with different scales and semantics. It enables users to interactively combine attributes and ?exibly re?ne parameters to explore the effect of changes in the attribute combination. This process can be employed to derive actionable insights as to which attributes of an item need to be modi?ed in order for its rank to change. Additionally, through integration of slope graphs, LineUp can also be used to compare multiple alternative rankings on the same set of items, for example, over time or across different attribute combinations. We evaluate the effectiveness of the proposed multi-attribute visualization technique in a qualitative study. The study shows that users are able to successfully solve complex ranking tasks in a short period of time.”

“In this paper we propose a new technique that addresses the limitations of existing methods and is motivated by a comprehensive analysis of requirements of multi-attribute rankings considering various domains, which is the ?rst contribution of this paper. Based on this analysis, we present our second contribution, the design and implementation of LineUp, a visual analysis technique for creating, re?ning, and exploring rankings based on complex combinations of attributes. We demonstrate the application of LineUp in two use cases in which we explore and analyze university rankings and nutrition data. We evaluate LineUp in a qualitative study that demonstrates the utility of our approach. The evaluation shows that users are able to solve complex ranking tasks in a short period of time.”

Rick Davies comment: I have been a long time advocate of the usefullness of ranking measures in evaluation, because they can combine subjective judgements with numerical values. This tool is focused on ways of visualising and manipulating existing data rather than elicitation of the ranking data (a seperate and important issue of its own). It includes lot of options for weighting different attributes to produce overall ranking scores

Free open source software, instructions, example data sets, introductory videos and more available here

Qualitative Comparative Analysis (QCA) An application to compare national REDD+ policy processes

Sehring, Jenniver, Kaisa Korhonen-Kurki, and Maria Brockhaus. 2013. “Qualitative Comparative Analysis (QCA) An Application to Compare National REDD+ Policy Processes”. CIFOR. http://www.cifor.org/publications/pdf_files/WPapers/WP121Sehring.pdf.

“This working paper gives an overview of Qualitative Comparative Analysis (QCA), a method that enables systematic cross-case comparison of an intermediate number of case studies. It presents an overview of QCA and detailed descriptions of different versions of the method. Based on the experience applying QCA to CIFOR’s Global Comparative Study on REDD+, the paper shows how QCA can help produce parsimonious and stringent research results from a multitude of in-depth case studies developed by numerous researchers.QCA can be used as a structuring tool that allows researchers to share understanding and produce coherent data, as well as a tool for making inferences usable for policy advice.

REDD+ is still a young policy domain, and it is a very dynamic one. Currently, the benefits of QCA result mainly from the fact that it helps researchers to organize the evidence generated. However, with further and more differentiated case knowledge, and more countries achieving desired outcomes, QCA has the potential to deliver robust analysis that allows the provision of information, guidance and recommendations to ensure carbon-effective, cost-efficient and equitable REDD+ policy design and implementation.”

Rick Davies comment: I like this paper because it provides a good how-to-do-it overview of different forms of QCA, illustrated in a step-by-step fashion with one practical case example. It may not be quite enough to enable one to do a QCA from the very start, but it provides a very good starting point

The Science of Evaluation: A Realist Manifesto

Pawson, Ray. 2013. The Science of Evaluation: A Realist Manifesto. UK: Sage Publications. http://www.uk.sagepub.com

Chapter 1 is available as a pdf. Hopefully other chapters will also become available this way, because this 240 page book is expensive.

Contents
Preface: The Armchair Methodologist and the Jobbing Researcher
PART ONE: PRECURSORS AND PRINCIPLES
Precursors: From the Library of Ray Pawson
First Principles: A Realist Diagnostic Workshop
PART TWO: THE CHALLENGE OF COMPLEXITY – DROWNING OR WAVING?
A Complexity Checklist
Contested Complexity
Informed Guesswork: The Realist Response to Complexity
PART THREE: TOWARDS EVALUATION SCIENCE
Invisible Mechanisms I: The Long Road to Behavioural Change
Invisible Mechanisms II: Clinical Interventions as Social Interventions
Synthesis as Science: The Bumpy Road to Legislative Change
Conclusion: A Mutually Monitoring, Disputatious Community of Truth Seekers

Reviews

Lawrence Buhagiar at LSE Review of Books
Alliance for Useful Evidence
Adam Fletcher at Decipher
Seetzen, Heidi. 2013. “The Science of Evaluation: A Realist Manifesto.” International Journal of Social Research Methodology 16 (6): 547–48. doi:10.1080/13645579.2013.823286.
Astbury, Brad. 2013. “Some Reflections on Pawson’s Science of Evaluation: A Realist Manifesto.” Evaluation 19 (4): 383–401. doi:10.1177/1356389013505039.

j

Twelve reasons why climate change adaptation M&E is challenging

Bours, Dennis, Colleen McGinn, and Patrick Pringle. 2014. “Guidance Note 1: Twelve Reasons Why Climate Change Adaptation M&E Is Challenging.” SeaChange & UKCIP Available as a pdf

“Introduction: Climate change adaptation (CCA) refers to how people and systems adjust to the actual or expected effects of climate change. It is often presented as a cyclical process developed in response to climate change impacts or their social, political, and economic consequences. There has been a recent upsurge of interest in CCA among international development agencies resulting in stand-alone adaptation programs as well as efforts to mainstream CCA into existing development strategies. The scaling up of adaptation efforts and the iterative nature of the adaptation process means that Monitoring and Evaluation (M&E) will play a critical role in informing and improving adaptation polices and activities. Although many CCA programmes may look similar to other development interventions, they do have specific and distinct characteristics that set them apart. These stem from the complex nature of adaptation itself. CCA is a dynamic process that cuts across scales and sectors of intervention, and extends long past any normal project cycle. It is also inherently uncertain: we cannot be entirely sure about the course of climate change consequences, as these will be shaped by societal decisions taken in the future. How then should we define, measure, and assess the achievements of an adaptation programme? The complexities inherent in climate adaptation programming call for a nuanced approach to M&E research. This is not, however, always being realised in practice. CCA poses a range of thorny challenges for evaluators. In this Guidance Note, we identify twelve challenges that make M&E of CCA programmes difficult, and highlight strategies to address each. While most are not unique to CCA, together they present a distinctive package of dilemmas that need to be addressed.”

See also: Bours, Dennis, Colleen McGinn, and Patrick Pringle. 2013. Monitoring and evaluation for climate change adaptation: A synthesis of tools, frameworks and approaches, UKCIP & SeaChange, pdf version (3.4 MB)

See also: Dennis Bours, Colleen McGinn, Patrick Pringle, 2014, “Guidance Note 2: Selecting indicators for climate change adaptation programming” SEA Change CoP, UKCIP

” This second Guidance Note follows on from that discussion with a narrower question: how does one go about choosing appropriate indicators? We begin with a brief review of approaches to CCA programme design, monitoring, and evaluation (DME). We then go on to discuss how to identify appropriate indicators. We demonstrate that CCA does not necessarily call for a separate set of indicators; rather, the key is to select a medley that appropriately frames progress towards adaptation and resilience. To this end, we highlight the importance of process indicators, and conclude with remarks about how to use indicators thoughtfully and well”

Monitoring and evaluating civil society partnerships

A GSDRC Help Desk response

Request: Please identify approaches and methods used by civil society organisations (international NGOs and others) to monitor and evaluate the quality of their relationships with partner (including southern) NGOs. Please also provide a short comparative analysis.

Helpdesk response

Key findings: This report lists and describes tools used by NGOs to monitor the quality of their relationships with partner organisations. It begins with a brief analysis of the types of tools and their approaches, then describes each tool. This paper focuses on tools which monitor the partnership relationship itself, rather than the impact or outcomes of the partnership. While there is substantial general literature on partnerships, there is less literature on this particular aspect.

Within the development literature, ‘partnership’ is most often used to refer to international or high-income country NGOs partnering with low-income country NGOs, which may be grassroots or small-scale. Much of a ‘north-south’ partnership arrangement centres around funding, meaning accountability arrangements are often reporting and audit requirements (Brehm, 2001). As a result, much of the literature and analysis is heavily biased towards funding and financial accountability. There is a commonly noted power imbalance in the literature, with northern partners controlling the relationship and requiring southern partners to report to them on use of funds. Most partnerships are weak on ensuring Northern accountability to Southern organisations (Brehm, 2001). Most monitoring tools are aimed at bilateral partnerships.

The tools explored in the report are those which evaluate the nature of the partnership, rather than the broader issue of partnership impact. The ‘quality’ of relationships is best described by BOND, in which the highest quality of partnership is described as joint working, adequate time and resources allocated specifically to partnership working, and improved overall effectiveness. Most of the tools use qualitative, perception-based methods including interviewing staff from both partner organisations and discussing relevant findings. There are not many specific tools available, as most organisations rely on generic internal feedback and consultation sessions, rather than comprehensive monitoring and evaluation of relationships. Resultantly, this report only presents six tools, as these were the most referred to by experts.

Full response: http://www.gsdrc.org/docs/open/HDQ1024.pdf

DCED Global Seminar on Results Measurement 24-26 March 2014, Bangkok

Full text available here: http://www.enterprise-development.org/page/seminar2014

“Following popular demand, the DCED is organising the second Global Seminar on results measurement in the field of private sector development (PSD), 24-26 March 2014 in Bangkok, Thailand. The Seminar is being organised in cooperation with the ILO and with financial support from the Swiss State Secretariat for Economic Affairs (SECO). It will have a similar format to the DCED Global Seminar in 2012, which was attended by 100 participants from 54 different organisations, field programmes and governments.

Since 2012, programmes and agencies have been adopting the DCED Standard for results measurement in increasing numbers; recently, several have published the reports of their DCED audit. This Seminar will explore what is currently known, and what we need to know; specifically, the 2014 Seminar is likely to be structured as follows:

An introduction to the DCED, its Results Measurement Working Group, the DCED Standard for results measurement and the Standard audit system
Insights from 10 programmes experienced with the Standard, based in Bangladesh, Cambodia, Fiji, Georgia, Kenya, Nepal, Nigeria and elsewhere (further details to come)
Perspectives from development agencies on results measurement
Cross cutting issues, such as the interface between the Standard and evaluation, measuring systemic change, and using results in decision-making
A review of the next steps in learning, guidance and experience around the Standard
Further opportunities for participants to meet each other, learn about each others’ programmes and make contacts for later follow-up

You are invited to join the Seminar as a participant. Download the registration form here, and send to Admin@Enterprise-Development.org. There is a fee of $600 for those accepted for participation, and all participants must pay their own travel, accommodation and insurance costs. Early registration is advised.”

Rapid Review of Embedding Evaluation in UK Department for International Development

Like this:

Do you have a Data Management Plan?

Like this:

Reflections on research processes in a development NGO: FIVDB’s survey in 2013 of the change in household conditions and of the effect of livelihood trainings

Like this:

Meta-evaluation of USAID’s Evaluations: 2009-2012

Like this:

LineUp: Visual Analysis of Multi-Attribute Rankings

Like this:

Qualitative Comparative Analysis (QCA) An application to compare national REDD+ policy processes

Like this:

The Science of Evaluation: A Realist Manifesto

Like this:

Twelve reasons why climate change adaptation M&E is challenging

Like this:

Monitoring and evaluating civil society partnerships

Like this:

DCED Global Seminar on Results Measurement 24-26 March 2014, Bangkok

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this: