Collaborative Evaluations Step by Step

by Liliana Rodriguez-Campos and Rigoberto Rincones-Gomez, Stanford Business Books, 2013 (2nd edition; the first edition was published in 2005)

Book website here and available on Amazon, but neither site shows the contents pages or excerpts.

Book website says “Collaborative Evaluations is a highly comprehensive and easy-to-follow book for those evaluators who want to engage and succeed in collaborative evaluations. The author presents the Model for Collaborative Evaluations (MCE) with its six major components: (1) identify the situation, (2) clarify the expectations, (3) establish a shared commitment, (4) ensure open communication, (5) encourage best practices, and (6) follow specific guidelines. In clear and simple language, the author outlines key concepts and methods to help master the mechanics of collaborative evaluations.

Each section deals with fundamental factors inside each of the six collaborative evaluation components. In addition, each section provides practical tips for “real-life” applications and step-by-step suggestions or guidelines on how to apply this information. The MCE has emerged from a wide range of collaboration efforts that the author has conducted in the private sector, nonprofit organizations, and institutions of higher education. The author shares her experience and insights regarding this subject in a precise and easy-to-understand fashion, so that the reader can use the information learned from this book immediately.”

Related blog posting: Josey Landrieu on Collaborative Evaluation, on the AEA365 | A Tip-a-Day by and for Evaluators website

What counts as good evidence?

by Sandra Nutley, Alison Powell and Huw Davies, Research Unit for Research Utilisation (RURU), School of Management, University of St Andrews, www.ruru.ac.uk, November 2012

Available as pdf. This is a paper for discussion. The authors would welcome comments, which should be emailed to smn@st-andrews.ac.uk or Jonathan.Breckon@nesta.org.uk

In brief

Making better use of evidence is essential if public services are to deliver more for less. Central to this challenge is the need for a clearer understanding about standards of evidence that can be applied to the research informing social policy. This paper reviews the extent to which it is possible to reach a workable consensus on ways of identifying and labelling evidence. It does this by exploring the efforts made to date and the debates that have ensued. Throughout, the focus is on evidence that is underpinned by research, rather than other sources of evidence such as expert opinion or stakeholder views.

After setting the scene, the review and arguments are presented in five main sections:

We begin by exploring practice recommendations: many bodies provide practice recommendations, but concerns remain as to what kinds of research evidence can or should underpin such labelling schemas.

This leads us to examine hierarchies of evidence: study design has long been used as a key marker for evidence quality, but such ‘hierarchies of evidence’ raise many issues and have remained contested. Extending the hierarchies so that they also consider the quality of study conduct or the use of underpinning theory has enhanced their usefulness but has also exposed new fault-lines of debate.

More broadly, in beyond hierarchies, we recognise that hierarchies of evidence have seen most use in addressing the evidence for what works. As a consequence, several agencies and authors have developed more complex matrix approaches for identifying evidence quality in ways that are more closely linked to the wider range of policy or practice questions being addressed.

Strong evidence, or just good enough? A further pragmatic twist comes with the recognition that evaluative evidence is always under development. Thus it may be more helpful to think of an ‘evidence journey’ from promising early findings to substantive bodies of knowledge.

Finally, we turn to the uses and impacts of standards of evidence and endorsing practices. In this section we raise many questions as to the use, uptake and impacts of evidence labelling schemes, but are able to provide few definitive answers as the research here is very patchy.

We conclude that there is no simple answer to the question of what counts as good evidence. It depends on what we want to know, for what purposes, and in what contexts we envisage that evidence being used. Thus while there is a need to debate standards of evidence, we should be realistic about the extent to which such standard-setting will shape complex, politicised decision-making by policy makers, service managers and local practitioners.

 

The impact of statistics classes ;-)

(found via Duncan Green at Oxfam)
 

What she should then have said: “Well, let’s look to see if there is any plausible causal mechanism underneath this correlation. Can you remember where you were when you first changed your mind? Can you remember what the discussion was about at that time?”

See also many other similar comics at the xkcd website.

Duggan & Bush on Evaluation in Settings Affected by Violent Conflict: What Difference Does Context Make?

From AEA365 | A Tip-a-Day by and for Evaluators. Posted: 08 Feb 2013 12:51 AM PST

“We are Colleen Duggan, Senior Evaluation Specialist, International Development Research Centre (Canada) and Kenneth Bush, Director of Research, International Conflict Research (Northern Ireland).  For the past three years, we have been collaborating on a joint exploratory research project called Evaluation in Extremis:  The Politics and Impact of Research in Violently Divided Societies, bringing together researchers, evaluators, advocates and evaluation commissioners from the global North and South. We looked at the most vexing challenges and promising avenues for improving evaluation practice in conflict-affected environments.

CHALLENGES Conflict Context Affects Evaluation – and vice versa.  Evaluation actors working in settings affected by militarized or non-militarized violence suffer from the typical challenges confronting development evaluation.  But, conflict context shapes how, where and when evaluations can be undertaken – imposing methodological, political, logistical, and ethical challenges. Equally, evaluation (its conduct, findings, and utilization) may affect the conflict context – directly, indirectly, positively or negatively.

Lessons Learned:

Extreme conditions amplify the risks to evaluation actors.  Contextual volatility and political hyper-sensitivity must be explicitly integrated into the planning, design, conduct, dissemination, and utilization of evaluation.

  1. Some challenges may be anticipated and prepared for, others may not. By recognizing the most likely dangers/opportunities at each stage in the evaluation process we are better prepared to circumvent “avoidable risks or harm” and to prepare for unavoidable negative contingencies.
  2. Deal with politico-ethics dilemmas. Being able to recognize when ethics dilemmas (questions of good, bad, right and wrong) collide with political dilemmas (questions of power and control) is an important analytical skill for both evaluators and their clients.  Speaking openly about how politics and ethics – and not only methodological and technical considerations – influence all facets of evaluation in these settings reinforces local social capital and improves evaluation transparency.
  3. The space for advocacy and policymaking can open or close quickly, requiring readiness to use findings posthaste. Evaluators need to be nimble, responsive, and innovative in their evaluation use strategies.

Rad Resources:

  • 2013 INCORE Summer School Course on Evaluation in Conflict Prone Settings, University of Ulster, Derry/Londonderry (Northern Ireland). A 5-day skills-building course for early to mid-level professionals facing evaluation challenges in conflict prone settings or involved in commissioning, managing, or conducting evaluations in a programming or policy-making capacity.
  • Kenneth Bush and Colleen Duggan (2013) Evaluation in Extremis: The Politics and Impact of Research in Violently Divided Societies (SAGE: Delhi, forthcoming)

“Big Data for Development: Opportunities & Challenges”

Published by Global Pulse, 29 May 2012

Abstract: “Innovations in technology and greater affordability of digital devices have presided over  today’s Age of Big Data, an umbrella term for the explosion in the quantity and diversity of high frequency digital data. These data hold the potential—as yet largely untapped— to allow decision makers to track development progress, improve social protection, and understand where existing policies and programmes require adjustment.  Turning Big Data—call logs, mobile-banking transactions, online user-generated content such as blog posts and Tweets, online searches, satellite images, etc.—into actionable information requires using computational techniques to unveil trends and patterns within and between these extremely large socioeconomic datasets. New insights gleaned from such data mining should complement official statistics, survey data, and information generated by Early Warning Systems, adding depth and nuances on human behaviours  and experiences—and doing so in real time, thereby narrowing both information and  time gaps. With the promise come questions about the analytical value and thus policy relevance of  this data—including concerns over the relevance of the data in developing country contexts, its representativeness, its reliability—as well as the overarching privacy issues of utilising personal data. This paper does not offer a grand theory of technology-driven social change in the Big Data era. Rather it aims to delineate the main concerns and challenges raised by “Big Data for Development” as concretely and openly as possible, and to suggest ways to address at least a few aspects of each.”

“It is important to recognise that Big Data and real-time analytics are no modern panacea for age-old development challenges.  That said, the diffusion of data science to the realm of international development nevertheless constitutes a genuine opportunity to bring powerful new tools to the fight against poverty, hunger and disease.”

“The paper is structured to foster dialogue around some of the following issues:

  • What types of new, digital data sources are potentially useful to the field of international development?
  • What kind of analytical tools, methodologies for analyzing Big Data have already been tried and tested by academia and the private sector, which could have utility for the public sector?
  • What challenges are posed by the potential of using digital data sources (Big Data) in development work?
  • What are some specific applications of Big Data in the field of global development?
  • How can we chart a way forward?”

Click here to download the PDF.

Read about Global Pulse. “Global Pulse is an innovation initiative launched by the Executive Office of the United Nations Secretary-General, in response to the need for more timely information to track and monitor the impacts of global and local socio-economic crises. The Global Pulse initiative is exploring how new, digital data sources and real-time analytics technologies can help policymakers understand human well-being and emerging vulnerabilities in real-time, in order to better protect populations from shocks.”

See also: World Bank Project Performance Ratings. “IEG independently validates all completion reports that the World Bank prepares for its projects (known as Implementation Completion Reports, or ICRs). For a subset of completed projects (target coverage is 25%), IEG performs a more in-depth project evaluation that includes extensive primary research and field work. The corresponding ICR Reviews and Project Performance Assessment Reports (PPARs) codify IEG’s assessments using Likert-scale project performance indicators. The World Bank Project Performance Ratings database is the collection of more than 8000 project assessments covering about 6000 completed projects, since the unit was originally established in 1967. It is the longest-running development project performance data collection of its kind.” (1981-2010)

Rick Davies comment: There is a great opportunity here for a data mining analysis to find the decision rules that best predict successful projects [Caveat: GIVEN THE FIELDS AVAILABLE IN THIS DATA SET].
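A minimal sketch of what such an exploratory analysis could look like, assuming the ratings had been exported to a flat file. The file name, field names and outcome coding below are hypothetical, not the actual fields of the IEG database:

```python
# Hedged sketch only: induce simple decision rules from project ratings data
# with a shallow decision tree. File name, column names and outcome coding
# are hypothetical; the real IEG dataset will have different fields.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.read_csv("wb_project_ratings.csv")  # hypothetical export, one row per evaluated project

predictors = ["region", "sector", "approval_year", "project_size_usd_m"]  # hypothetical fields
X = pd.get_dummies(df[predictors], drop_first=True)  # one-hot encode the categorical fields
y = df["outcome_satisfactory"]  # hypothetical binary outcome: 1 = satisfactory or better

# Keep the tree shallow so the output reads as a handful of interpretable rules
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50, random_state=0)
tree.fit(X, y)

print(export_text(tree, feature_names=list(X.columns)))  # rules as nested if/else conditions
```

A tool such as RapidMiner (mentioned further down this page) offers comparable decision-tree and rule-induction operators without any coding.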

See also: Good countries or good projects? Macro and micro correlates of World Bank project performance. Authors: Denizer, Cevdet; Kaufmann, Daniel; Kraay, Aart. 2011/05/01, Policy Research Working Paper no. WPS 5646. Summary: “The authors use data from more than 6,000 World Bank projects evaluated between 1983 and 2009 to investigate macro and micro correlates of project outcomes. They find that country-level “macro” measures of the quality of policies and institutions are very strongly correlated with project outcomes, confirming the importance of country-level performance for the effective use of aid resources. However, a striking feature of the data is that the success of individual development projects varies much more within countries than it does between countries. The authors assemble a large set of project-level “micro” correlates of project outcomes in an effort to explain some of this within-country variation. They find that measures of project size, the extent of project supervision, and evaluation lags are all significantly correlated with project outcomes, as are early-warning indicators that flag problematic projects during the implementation stage. They also find that measures of World Bank project task manager quality matter significantly for the ultimate outcome of projects. They discuss the implications of these findings for donor policies aimed at aid effectiveness.”

See also: A Few Useful Things to Know about Machine Learning. Pedro Domingos, Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, U.S.A. pedrod@cs.washington.edu

PROCESS TRACING: Oxfam’s Draft Protocol

Undated, but possibly 2012. Available as pdf

Background: “Oxfam GB has adopted a Global Performance Framework. Among other things, this framework involves the random selection of samples of closing or sufficiently mature projects under six outcome areas each year and rigorously evaluating their performance. These are referred to as Effectiveness Reviews. Effectiveness Reviews carried out under the Citizen Voice and Policy Influencing thematic areas are to be informed by a research protocol based on process tracing, a qualitative research approach used by case study researchers to investigate causal inference.”

Oxfam is seeking feedback on this draft. Please send your comments to PPAT@oxfam.org.uk

See also the related blog posting by Oxfam on the “AEA365 | A Tip-a-Day by and for Evaluators” website.

Rick Davies comment: While the draft protocol already includes six references on process tracing, I would recommend two more which I think are especially useful and recent (see also the illustrative sketch of the test logic after the abstract below):

  • Mahoney, James (2012). “The Logic of Process Tracing Tests in the Social Sciences.” Sociological Methods & Research XX(X): 1–28. doi:10.1177/0049124112437709. http://smr.sagepub.com/content/early/2012/02/29/0049124112437709.full.pdf
    • Abstract: This article discusses process tracing as a methodology for testing hypotheses in the social sciences. With process tracing tests, the analyst combines preexisting generalizations with specific observations from within a single case to make causal inferences about that case. Process tracing tests can be used to help establish that (1) an initial event or process took place, (2) a subsequent outcome also occurred, and (3) the former was a cause of the latter. The article focuses on the logic of different process tracing tests, including hoop tests, smoking gun tests, and straw in the wind tests. New criteria for judging the strength of these tests are developed using ideas concerning the relative importance of necessary and sufficient conditions. Similarities and differences between process tracing and the deductive nomological model of explanation are explored.
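By way of illustration (and not as a rendering of Mahoney’s new criteria), the familiar typology behind these tests can be expressed with two flags: whether passing a test is necessary for the hypothesis to survive, and whether passing is sufficient to confirm it. The doubly decisive test is included here because it completes the standard four-part typology in this literature:

```python
# Illustrative sketch of the standard process-tracing test typology:
# each test is characterised by whether passing it is necessary and/or
# sufficient for accepting the hypothesis under examination.
TESTS = {
    # name: (passing is necessary, passing is sufficient)
    "straw-in-the-wind": (False, False),
    "hoop":              (True,  False),
    "smoking-gun":       (False, True),
    "doubly-decisive":   (True,  True),
}

def implication(test_name: str, passed: bool) -> str:
    """What passing or failing a given test implies for the hypothesis."""
    necessary, sufficient = TESTS[test_name]
    if passed:
        return "confirmed" if sufficient else "somewhat strengthened"
    return "eliminated" if necessary else "somewhat weakened"

for name in TESTS:
    print(f"{name:18s} pass -> {implication(name, True):22s} fail -> {implication(name, False)}")
```

So, for example, failing a hoop test eliminates the hypothesis while passing it only keeps it in play, whereas a smoking-gun test works the other way around.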

 

Do we need more attention to monitoring relative to evaluation?

This post title was prompted by my reading of Daniel Ticehurst’s paper (below), and some of my reading of literature on complexity theory and on data mining.

First, Daniel’s paper: Who is listening to whom, and how well and with what effect?   Daniel Ticehurst, October 16th, 2012. 34 pages

Abstract:

“I am a so called Monitoring and Evaluation (M&E) specialist although, as this paper hopefully reveals, my passion is monitoring. Hence I dislike the collective term ‘M&E’. I see them as very different things. I also dislike the setting up of Monitoring and especially Evaluation units on development aid programmes: the skills and processes necessary for good monitoring should be an integral part of management; and evaluation should be seen as a different function. I often find that ‘M&E’ experts, driven by donor insistence on their presence backed up by so-called evaluation departments with, interestingly, no equivalent structure, function or capacity for monitoring, over-complicate the already challenging task of managing development programmes. The work of a monitoring specialist, to avoid contradicting myself, is to help instil an understanding of the scope of what a good monitoring process looks like. Based on this, it is to support those responsible for managing programmes to work together in following this process through so as to drive better, not just comment on, performance.”

“I have spent most of my 20 years in development aid working on long term assignments mainly in various countries in Africa and exclusively on ‘M&E’ across the agriculture and private sector development sectors hoping to become a decent consultant. Of course, just because I have done nothing else but ‘M&E.’ does not mean I excel at both. However, it has meant that I have had opportunities to make mistakes and learn from them and the work of others. I make reference to the work of others throughout this paper from which I have learnt and continue to learn a great deal.”

“The purpose of this paper is to stimulate debate on what makes for good monitoring. It draws on my reading of history and perceptions of current practice, in the development aid and a bit in the corporate sectors. I dwell on the history deliberately as it throws up some good practice, thus relevant lessons and, with these in mind, pass some comment on current practice and thinking. This is particularly instructive regarding the resurgence of the aid industry’s focus on results and recent claims about how there is scant experience in involving intended beneficiaries and establishing feedback loops, in the agricultural sector anyway. The main audience I have in mind are not those associated with managing or carrying out evaluations. Rather, this paper seeks to highlight particular actions I hope will be useful to managers responsible for monitoring (be they directors in Ministries, managers in consulting companies, NGOs or civil servants in donor agencies who oversee programme implementation) and will improve a neglected area.”

 Rick Davies comment: Complexity theory writers seem to give considerable emphasis to the idea of constant change and substantial unpredictability in complex adaptive systems (e.g. most human societies). Yet surprisingly enough we find more writings on complexity and evaluation than we do on complexity and monitoring. For a very crude bit of evidence, compare Google searches for “monitoring and complexity -evaluation” and “evaluation and complexity -monitoring”. There are literally twice as many search results for the second search string. This imbalance is strange because monitoring typically happens more frequently, and looks at smaller units of time, than evaluation. You would think it would be better suited to complex projects and settings. Is this because we have not had in the past the necessary analytic tools to make best use of monitoring data? Is it also because the audiences for any use of the data have been quite small, limited perhaps to the implementing agency, their donor(s) and, at best, the intended beneficiaries? The latter should no longer be the case, given the global movement for greater transparency in the operations of aid programs, aided by continually widening internet access. In addition to the wide range of statistical tools suitable for hypothesis testing (generally under-utilised, even in their simplest forms, e.g. chi-square tests), there is now a range of data mining tools useful for more inductive pattern-finding purposes. (Dare I say it, but…) these are already in widespread use by big businesses to understand and predict their customers’ behaviour (e.g. their purchasing decisions). The analytic tools are there, and available in free open-source forms (e.g. RapidMiner).
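As a small example of the kind of simple hypothesis testing referred to above, a chi-square test on a cross-tabulation of monitoring data takes only a few lines. The districts and counts below are invented purely for illustration:

```python
# Hedged illustration: chi-square test of independence on monitoring data.
# Question: does the share of households reporting an improvement differ by district?
# All counts below are invented for illustration.
from scipy.stats import chi2_contingency

observed = [
    # [reported improvement, reported no improvement]
    [120,  80],   # District A
    [ 90, 110],   # District B
    [140,  60],   # District C
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p-value = {p_value:.4f}")
# A p-value below a chosen threshold (conventionally 0.05) would suggest that
# district and reported outcome are not independent in this invented dataset.
```

The cross-tabulation itself can be produced directly from routine monitoring records; the point is simply that the test adds almost no extra burden to data that is already being collected.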

Quality in policy impact evaluation: understanding the effects of policy from other influences

Authors:  Siobhan Campbell, Gemma Harper

Published by HM Treasury, Dept of Energy and Climate Change, Dept of Environment, Food and Rural Affairs, December 2012

Quality in policy impact evaluation (QPIE) is a supplement to the Magenta Book (see below) and provides a guide to the quality of impact evaluation designs. It has been developed to help policy makers and analysts understand and make choices about the main impact evaluation designs, by setting out their pros and cons and how well each design can allow for any measured change to be attributed to the policy intervention being investigated.

Contents

Executive summary
Chapter 1 Introduction
Chapter 2 Quality in policy impact evaluation
Chapter 3 Strong research designs in the measurement of attribution
Chapter 4 Weaker/riskier research designs in the measurement of attribution
Annex A Acknowledgements
Annex B References

===================================================================

The Magenta Book

27 April 2011

The Magenta Book is HM Treasury guidance on evaluation for Central Government, but will also be useful for all policy makers, including in local government, charities and the voluntary sectors. It sets out the key issues to consider when designing and managing evaluations, and the presentation and interpretation of evaluation results. It describes why thinking about evaluation before and during the policy design phase can help to improve the quality of evaluation results without needing to hinder the policy process.

The book is divided into two parts.

Part A is designed for policy makers. It sets out what evaluation is, and what the benefits of good evaluation are. It explains in simple terms the requirements for good evaluation, and some straightforward steps that policy makers can take to make a good evaluation of their intervention more feasible.

Part B is more technical, and is aimed at analysts and interested policy makers. It discusses in more detail the key steps to follow when planning and undertaking an evaluation and how to answer evaluation research questions using different evaluation research designs. It also discusses approaches to the interpretation and assimilation of evaluation evidence.

The Magenta Book will be supported by a wide range of forthcoming supplementary guidance containing more detailed guidance on particular issues, such as statistical analysis and sampling.

The Magenta Book is also available for download in PDF format.

 

Dealing with complexity through “actor-focused” Planning, Monitoring & Evaluation (PME)

From results-based management  towards results-based learning
Jan Van Ongevalle (HIVA), Huib Huyse (HIVA), Cristien Temmink (PSO), Eugenia Boutylkova (PSO), Anneke Maarse (Double Loop)
November 2012. Available as pdf

This document is the final output of the PSO Thematic Learning Programme (TLP) on Planning, Monitoring and Evaluation (PME) of Complex Processes of Social Change, facilitated and funded by PSO, Netherlands and supported by HIVA (Belgium).

1. Introduction

This paper reports the results of a collaborative action-research process (2010-2012) in which 10 development organisations (nine Dutch and one Belgian), together with their Southern partners, explored if and how a variety of Planning, Monitoring and Evaluation (PME) approaches and methods helped them deal with processes of complex change. These approaches include Outcome Mapping (OM), Most Significant Change (MSC), SenseMaker, client-satisfaction instruments, personal-goal exercises, outcome studies, and scorecards.

The study has been supported by PSO, an  association of Dutch development organisations that supports capacity-development  processes. The Research Institute for Work and Society (HIVA) at the University of Leuven (KU Leuven) provided methodological support.

The collaborative action-research took place on two interconnected levels. At the first level, individual organisations engaged in their own action-research processes in order to address their organisation-specific PME challenges. At a collective level, we wanted to draw lessons from across the individual cases. The overall aim was to find out if and how the various PME approaches piloted in the cases had helped the organisations and their partners to deal with complex change processes. We tried to answer this question by exploring how the PME approaches assisted the pilot cases in dealing with the following four implications of PME in complexity: 1) dealing with multiple relations and perspectives; 2) learning about the results of the programme; 3) strengthening adaptive capacity; and 4) satisfying different accountability needs. These four questions constitute the main analytic framework of the action research.

A PME approach in this paper refers to the PME methods, tools and concepts and the way they are implemented within a specific context of a programme or organisation. A PME approach also encompasses the underlying values, principles and agenda that come with its methods, tools and concepts. A PME system refers to the way that PME approaches and PME related activities are practically organised, interlinked and implemented within a specific context of a programme or organisation.

Part of the uniqueness of this paper stems from the fact that it is based on the “real life” experiences of the ten pilot cases, where the participants took charge of their own individual action-research processes with the aim of strengthening their PME practice. The results presented in this article are based on an analysis across the 10 cases, and are the product of close collaboration with representatives of the different cases through various rounds of revision. A group of external advisors also gave input into the cross-case analysis. Extracts from the different cases are given throughout the results chapter to illustrate the arguments made. More detailed information about each case can be found in the individual case reports, which are available at: https://partos.nl/content/planning-monitoring-and-evaluation-complex-processes-social-change

Pan Africa-Asia Results-Based M&E Forum, Bangkok, Nov 2012 – Presentations now available

The 2012 Pan Africa-Asia Results-Based M&E Forum

Bangkok November 26-28

Sixteen presentations over three days are listed and available online here.

Monday 26 November, 2012

Dr John Mayne, Independent Public Sector Performance Adviser. “Making Causal Claims” (9.15 – 10.15 am)

Jennifer Mullowney, Senior RBM&E Specialist, CIDA. “How to Apply Results-Based Management in Fragile States and Situations: Challenges, Constraints, and Way Forward” (10.15 – 10.45 am)

Shakeel Mahmood, Coordinator Strategic Planning & M&E, ICDDR. “Strategies for Capacity Building for Health Research in Bangladesh: Role of Core Funding and a Common Monitoring and Evaluation Framework” (11.30 – 12 noon)

Troy Stremler, CEO, Newdea Inc. “Social Sector Trends” (1.40 – 2.40 pm)

Dr Carroll Patterson, Co-founder, SoCha. “From M&E to Social Change: Implementation Imperatives” (2.40 – 3.10 pm)

Susan Davis, Executive Director, Improve International, and Marla Smith-Nilson, Executive Director, Water 1st International. “A Novel Way to Promote Accountability in WASH: Results from First Water & Sanitation Accountability Forum & Plans for the Future” (3.55 – 4.25 pm)

 

 Tuesday 27 November, 2012

Sanjay Saxena, Director, TSCPL, M&E/MIS System Consultant. “Challenges in Implementing M&E Systems for Reform Programs” (9.15 – 10.15 am)

Chung Lai, Senior M&E Officer, International Relief and Development. “Using Data Flow Diagrams in Data Management Processes (demonstration)” (10.15 – 10.45 am)

Korapin Tohtubtiang, International Livestock Research Institute, Thailand. “Lessons Learned from Outcome Mapping in an IDRC Eco-Health Project” (11.30 – 12 noon)

Dr Paul Duignan of  DoView, International Outcomes and Evaluation Specialist. “Anyone Else Think the Way We Do Our M&E Work is Too Cumbersome and Painful? Using DoView Visual Strategic Planning & Success Tracking M&E Software – Simplifying, Streamlining and Speeding up Planning, Monitoring and Evaluation” (1.40 – 2.40 pm)

Ahmed Ali, M&E Specialist, FATA Secretariat, Multi-Donor Trust Fund & the World Bank. “The Sub-national M&E Systems of the Government of Khyber Pakhtunkhwa and FATA – the Case Study of M&E Multi-donor Trust Fund Projects” (2.40 – 3.10 pm)

Global Health Access Program (GHAP) Backpack Health Worker Teams, Thailand. “Cross-border M&E of Health Programs Targeting Internally Displaced Persons (IDPs) in Conflict-affected Regions of Eastern Burma” (3.55 – 4.25 pm)

 

Wednesday 28 November, 2012

Dr V. Rengarajan (Independent M&E & Micro-financing Consultant, Author, & Researcher). “What is Needed is an Integrated Approach to M&E” (9.15 – 10.15 am)

Dr Lesley Williams,  Independent M&E & Capacity-building Consultant, Learn MandE. “Value for Money (VfM): an Introduction.” (10.15 – 10.45 am)

Eugenia Boutylkova (Program Officer, PSO, Holland) and Jan Van Ongevalle (Research Manager, HIVA/KULeuven, Belgium). “Thematic Learning Program (TLP): Dealing with Complexity through Planning, Monitoring & Evaluation (PME)” (11.30 – 12 noon)

Catharina Maria. “Does the Absence of Conflict Indicate a Successful Peace-building Project?” (1.40 – 2.40 pm)

 
