The precarious nature of knowledge – a lesson that we have not yet learned?

Is medical science built on shaky foundations? by Elizabeth Iorns, New Scientist, 15 September 2012.

The following text is relevant to the debate about the usefulness of randomised controlled trials (RCTs) in assessing the impact of development aid initiatives. RCTs are an essential part of medical science research, but they are by no means the only research methods used. The article continues…

“More than half of biomedical findings cannot be reproduced – we urgently need a way to ensure that discoveries are properly checked

REPRODUCIBILITY is the cornerstone of science. What we hold as definitive scientific fact has been tested over and over again. Even when a fact has been tested in this way, it may still be superseded by new knowledge. Newtonian mechanics became a special case of Einstein’s general relativity; molecular biology’s mantra “one gene, one protein” became a special case of DNA transcription and translation.

One goal of scientific publication is to share results in enough detail to allow other research teams to reproduce them and build on them. However, many recent reports have raised the alarm that a shocking amount of the published literature in fields ranging from cancer biology to psychology is not reproducible.

Pharmaceuticals company Bayer, for example, recently revealed that it fails to replicate about two-thirds of published studies identifying possible drug targets (Nature Reviews Drug Discovery, vol 10, p 712).

Bayer’s rival Amgen reported an even higher rate of failure – over the past decade its oncology and haematology researchers could not replicate 47 of 53 highly promising results they examined (Nature, vol 483, p 531). Because drug companies scour the scientific literature for promising leads, this is a good way to estimate how much biomedical research cannot be replicated. The answer: the majority” (read the rest of the article here)

See also: Should Deworming Policies in the Developing World be Reconsidered? The sceptical findings of a systematic review of the impact of deworming initiatives in schools. Deworming is one of the interventions found effective via RCTs, and it has been widely publicised as an example of how RCTs can really find out what works. The quote below is from Paul Garner’s comments on the systematic review. The same web page also has rejoinders to Garner’s comments, which are also worth reading.

“The Cochrane review on community programmes to deworm children of intestinal helminths has just been updated. We want people to read it, particularly those with an influence on policy, because it is important to understand the evidence, but the message is pretty clear. For the community studies where you treat all school children (which is what WHO advocates) there were some older studies which show an effect on weight gain after a single dose of deworming medicine; but for the most part, the effects on weight, haemoglobin, cognition, school attendance, and school performance are either absent, small, or not statistically significant. We also found some surprises: a trial published in the British Medical Journal reported that deworming led to better weight gain in a trial of more than 27,000 children, but in fact the statistical test was wrong and in reality the trial did not detect a difference. We found a trial that examined school performance in 2659 children in Vietnam that did not demonstrate a difference on cognition or weight that has never been published even though it was completed in 2006. We also note that a trial of 1 million children from India, which measured mortality and data collection completed in 2004, has never been published. This challenges the principles of scientific integrity. However, I heard within the last week that the authors do intend to get the results into the public domain – which is where it belongs.

We want to see powerful interventions that help people out of poverty, but they need to work, otherwise we are wasting everyone’s time and money. Deworming schoolchildren to rid them of intestinal helminths seems a good idea in theory, but the evidence for it just doesn’t stack up. We want policy makers to look at the evidence and the message and consider if deworming is as good as it is cracked up to be.”

Taylor-Robinson et al. “Deworming drugs for soil-transmitted intestinal worms in children: effects on nutritional indicators, haemoglobin and school performance” Cochrane Database of Systematic Reviews 2012.

See also: Truth decay: The half-life of facts, by Samuel Arbesman, New Scientist, 19 September 2012

IN DENTAL school, my grandfather was taught the number of chromosomes in a human cell. But there was a problem. Biologists had visualised the nuclei of human cells in 1912 and counted 48 chromosomes, and it was duly entered into the textbooks studied by my grandfather. In 1953, the prominent cell biologist Leo Sachs even said that “the diploid chromosome number of 48 in man can now be considered as an established fact”.

Then in 1956, Joe Hin Tjio and Albert Levan tried a new technique for looking at cells. They counted over and over until they were certain they could not be wrong. When they announced their result, other researchers remarked that they had counted the same, but figured they must have made a mistake. Tjio and Levan had counted only 46 chromosomes, and they were right.

Science has always been about getting closer to the truth, …

See also the book by the same author, “The Half-Life of Facts: Why Everything We Know Has an Expiration Date”, on Amazon. Published October 2012.

See also: Why Most Biomedical Findings Echoed by Newspapers Turn Out to be False: The Case of Attention Deficit Hyperactivity Disorder, by François Gonon, Jan-Pieter Konsman, David Cohen and Thomas Boraud, PLOS ONE, 2012

Summary: Newspapers are biased toward reporting early studies that may later be refuted: 7 of the top 10 ADHD studies covered by the media were later attenuated or refuted without much attention.

Newspaper coverage of biomedical research leans heavily toward reports of initial findings, which are frequently attenuated or refuted by later studies, leading to disproportionate media coverage of potentially misleading early results, according to a report published Sep. 12 in the open access journal PLOS ONE.

The researchers, led by François Gonon of the University of Bordeaux, used ADHD (attention deficit hyperactivity disorder) as a test case and identified 47 scientific research papers published during the 1990s on the topic that were covered by 347 newspaper articles. Of the top 10 articles covered by the media, they found that 7 were initial studies. All 7 were either refuted or strongly attenuated by later research, but these later studies received much less media attention than the earlier papers. Only one out of the 57 newspaper articles reporting on these subsequent studies mentioned that the corresponding initial finding had been attenuated. The authors write that, if this phenomenon is generalizable to other health topics, it likely causes a great deal of distortion in health science communication.

See also: “The drugs don’t work – a modern medical scandal. The doctors prescribing them don’t know that. Nor do their patients. The manufacturers know full well, but they’re not telling” by Ben Goldacre, the Guardian Weekend, 22 September 2012, pp 21-29.

Excerpt: “In 2010, researchers from Harvard and Toronto found all the trials looking at five major classes of drug – antidepressants, ulcer drugs and so on – then measured two key features: were they positive, and were they funded by industry? They found more than 500 trials in total: 85% of the industry-funded studies were positive, but only 50% of the government-funded trials were. In 2007, researchers looked at every published trial that set out to explore the benefits of a statin. These cholesterol-lowering drugs reduce your risk of having a heart attack and are prescribed in very large quantities. This study found 192 trials in total, either comparing one statin against another, or comparing a statin against a different kind of treatment. They found that industry-funded trials were 20 times more likely to give results favouring the test drug.

These are frightening results, but they come from individual studies. So let’s consider systematic reviews into this area. In 2003, two were published. They took all the studies ever published that looked at whether industry funding is associated with pro-industry results, and both found that industry-funded trials were, overall, about four times more likely to report positive results. A further review in 2007 looked at the new studies in the intervening four years: it found 20 more pieces of work, and all but two showed that industry-sponsored trials were more likely to report flattering results.

It turns out that this pattern persists even when you move away from published academic papers and look instead at trial reports from academic conferences. James Fries and Eswar Krishnan, at the Stanford University School of Medicine in California, studied all the research abstracts presented at the 2001 American College of Rheumatology meetings which reported any kind of trial and acknowledged industry sponsorship, in order to find out what proportion had results that favoured the sponsor’s drug.”

The results section is a single, simple and – I like to imagine – fairly passive-aggressive sentence: “The results from every randomised controlled trial (45 out of 45) favoured the drug of the sponsor.”

Read more in Ben Goldacre’s new book “Bad Pharma: How drug companies mislead doctors and harm patients”, published in September 2012.

See also: Reflections on bias and complexity, 29 May 2012, by Ben Ramalingam, which discusses a paper in Nature (May 2012) by Daniel Sarewitz, titled “Beware the creeping cracks of bias: Evidence is mounting that research is riddled with systematic errors. Left unchecked, this could erode public trust…”

A move to more systematic and transparent approaches in qualitative evidence synthesis

An update on a review of published papers.
By Karin Hannes and Kirsten Macaitis, Qualitative Research 2012, 12: 402, originally published online 11 May 2012

Abstract

In 2007, the journal Qualitative Research published a review on qualitative evidence syntheses conducted between 1988 and 2004. It reported on the lack of explicit detail regarding methods for searching, appraisal and synthesis, and a lack of emerging consensus on these issues. We present an update of this review for the period 2005–8. Not only has the amount of published qualitative evidence syntheses doubled, but authors have also become more transparent about their searching and critical appraisal procedures. Nevertheless, for the synthesis component of the qualitative reviews, a black box remains between what people claim to use as a synthesis approach and what is actually done in practice. A detailed evaluation of how well authors master their chosen approach could provide important information for developers of particular methods, who seem to succeed in playing the game according to the rules. Clear methodological instructions need to be developed to assist others in applying these synthesis methods.

Review of the use of ‘Theory of Change’ in International Development

By Isabel Vogel. Funded by DFID, 2012

Review of the use of ‘Theory of Change’ in international development (full report)
Review of the use of ‘Theory of Change’ in international development (summary)
Appendix 3: Examples of Theories of Change

1. Executive Summary
‘Theory of change’ is an outcomes-based approach which applies critical thinking to the design, implementation and evaluation of initiatives and programmes intended to support change in their contexts. It is being increasingly used in international development by a wide range of governmental, bilateral and multi-lateral development agencies, civil society organisations, international non-governmental organisations and research programmes intended to support development outcomes. The UK’s Department for International Development (DFID) commissioned this review of how theory of change is being used in order to learn from this growing area of practice. DFID has been working formally with theory of change in its programming since 2010. The purpose was to identify areas of consensus, debate and innovation in order to inform a more consistent approach within DFID.

Working with Assumptions in International Development Program Evaluation

By Nkwake, Apollo M., with a Foreword by Michael Bamberger. 2013, XXI, 184 p., 14 illus., 7 in color. Published by Springer and available on Amazon

Publisher description

“Provides tools for understanding effective development programming and quality program evaluations. Contains workshop materials for graduate students and in-service training for development evaluators. The author brings together more than 12 years of experience in the evaluation of international development programs.

Regardless of geography or goal, development programs and policies are fueled by a complex network of implicit ideas. Stakeholders may hold assumptions about purposes, outcomes, methodology, and the value of project evaluation and evaluators—which may or may not be shared by the evaluators. Even when all participants share goals, failure to recognize and articulate assumptions can impede clarity and derail progress.

Working with Assumptions in International Development Program Evaluation probes their crucial role in planning, and their contributions in driving, global projects involving long-term change. Drawing on his extensive experience in the field, the author offers elegant logic and instructive examples to relate assumptions to the complexities of program design and implementation, particularly in weighing their outcomes. The book emphasizes clarity of purpose, respect among collaborators, and collaboration among team members who might rarely or never meet otherwise. Importantly, the book is a theoretical and practical volume that:

· Introduces the multiple layers of assumptions on which global interventions are based.

· Explores various approaches to the evaluation of complex interventions, with their underlying assumptions.

· Identifies ten basic types of assumptions and their implications for program development and evaluation.

· Provides examples of assumptions influencing design, implementation, and evaluation of development projects.

· Offers guidelines in identifying, explicating, and evaluating assumptions.

A first-of-its-kind resource, Working with Assumptions in International Development Program Evaluation opens out the processes of planning, implementation, and assessment for professionals in global development, including practitioners, development economists, global development program designers, and nonprofit personnel.”

Rick Davies comment: Looks potentially useful, but VERY expensive at £85.50. Few individuals will buy it, but organisations might do so. Ideally the author would make a cheaper paperback version available. And Amazon should provide a “Look inside this book” option, to help people decide if spending £85.50 would be worthwhile. PS: I think the publishers, and maybe the author, would fail the marshmallow test

Rick Davies postscript: The Foreword, Preface and Contents pages of the book are available as a pdf, here on the Springer website.

See also:


Understanding ‘Theory of Change’ in International Development: A Review of Existing Knowledge

Danielle Stein and Craig Valters, July 2012. Available as pdf.

This publication is an output from a collaboration between The Asia Foundation and the [LSE] Justice and Security Research Programme.

Summary

This is a review of the concepts and common debates within ‘Theory of Change’ (ToC) material, resulting from a search and detailed analysis of available donor, agency and expert guidance documents. The review was undertaken as part of a Justice and Security Research Programme (JSRP) – The Asia Foundation (TAF) collaborative project, and focuses on the field of international development. The project will explore the use of Theories of Change (ToCs) in international development programming, with field research commencing in August 2012. While this document will specifically underpin the research of this collaboration, we also hope it will be of interest to a wider audience of those attempting to come to grips with ToC and its associated literature.

From the literature, we find that there is no consensus on how to define ToC, although it is commonly understood as an articulation of how and why a given intervention will lead to specific change. We identify four main purposes of ToC – strategic planning, description, monitoring and evaluation and learning – although these inevitably overlap. For this reason, we have adopted the term ‘ToC approaches’ to identify the range of applications associated with this term. Additionally, we identify some confusion in the terminology associated with ToC. Of particular note is the lack of clarity surrounding the use of the terms ‘assumption’ and ‘evidence’. Finally, we have also drawn out information on what authors feel makes for ToC ‘best practice’ in terms of both content and process, alongside an exploration of the remaining gaps where more clarity is needed.

A number of ‘key issues’ are highlighted throughout this review. These points are an attempt to frame the literature reviewed analytically, as informed by the specific focus of the JSRP-TAF collaboration. These issues are varied and include the confusion surrounding ToC definitions and use, the need to ‘sell’ a ToC to a funder, how one can know which ‘level’ a ToC should operate on, the relationship between ToC and evidence-based policy, and the potential for accuracy, honesty and transparency in the use of ToC approaches.

This paper does not aim to give definitive answers on ToC; indeed there are many remaining important issues that lie beyond the scope of this review. However, in highlighting a number of key issues surrounding current understandings of ToC approaches, this review hopes to pave the way for more constructive and critical discussion of both the concept and practical application of ToCs.

What Causes What? & Hypothesis Testing: Truth and Evidence

Two very useful chapters in Denise Cummins (2012) “Good Thinking”, Cambridge University Press

Cummins is a professor of psychology and philosophy, both of which she brings to bear in this great book. Read an interview with the author here

Contents include:

1. Introduction
2. Rational choice: choosing what is most likely to give you what you want
3. Game theory: when you’re not the only one choosing
4. Moral decision-making: how we tell right from wrong
5. The game of logic
6. What causes what?
7. Hypothesis testing: truth and evidence
8. Problem solving: another way of getting what you want
9. Analogy: this is like that.

UK centre of excellence for evaluation of international development

Prior Information Notice

DFID is planning to establish a Centre of Excellence to assist with our commitment to use high quality evaluation to maximise the impact of UK-funded international development. DFID would like to consult with a wide range of research networks and experts in the field, and invite ideas and suggestions to help develop our ideas further before formally issuing invitations to tender to the market for this opportunity. There are two main channels for interested parties to contribute to this process:

1. Comments and views on the draft scope can be fed in through the DFID supplier portal by registering for this opportunity at https://supplierportal.dfid.gov.uk/selfservice/ and accessing the documentation.

2. DFID will hold bilateral discussions and/or information sharing sessions with interested parties depending on demand.

Please ensure all comments are fed in through the DFID portal by 31st August 2012. Once the consultation process is complete and the scope of the Centre of Excellence fully defined, DFID plans to run a competitive tender for this work. The target date for establishment of the Centre is mid-2013.

RD Comment: Why is this consultation process not more open? Why do participants have to register as potential suppliers, when many who might be interested in reading and commenting on the proposal would not necessarily want to become suppliers?

DFID How To Note: Reviewing and Scoring Projects

November 2011. Available as pdf.

“Introduction: This guidance is to help DFID staff, project partners and other stakeholders use the scoring system and complete the latest templates when undertaking an Annual Review (AR) or Project Completion Review (PCR – formerly known as Project Completion Report) for projects due for review from January 2012. This guidance applies to all funding types; however, separate templates are available for core contributions to multilateral organisations. The guidance does not attempt to cover in detail how to organise the review process, although some help is provided.

Contents:
Principal changes from previous templates – 2
Introduction – 2
What is changing? – 3
What does it involve? – 4
Using the logframe as a monitoring tool – 5
If you don’t have a logframe – 6
Assessing the evidence base – 6
The Scoring System – 6
Updating ARIES – 7
Transparency and Publishing ARs and PCRs – 7
Projects below £1m approved prior to the new Business Case format – 8
Multilateral Core Contributions – 9
Filling in the templates / Guidance on the template contents – 9
Completing the AR/PCR and information onto ARIES – 19
Annex A: Sample Terms of Reference”

RD Comment: To my surprise, although this How To Note gives advice on how to assign weights to each output, it does not explain how these weights interact with output scores to generate a weighted achievement score for each output. Doing so would help explain why the weightings are being requested; at present they are requested but their purpose is not explained.

The achievement scoring system is a definite improvement on the previous system. The focus is now on actual achievement to date rather than expected achievement by the end of the project, and the scale is evenly balanced, with the top and bottom of the scale representing over- and under-achievement respectively.
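By way of illustration, here is one plausible way the requested weightings could combine with output scores into an overall weighted achievement score. The How To Note does not spell this out, so the weighted-average formula and the figures below are assumptions rather than DFID's documented method.

```python
# Hypothetical sketch: combining output scores and weights into an overall
# achievement score via a weighted average. The formula and figures are
# illustrative assumptions, not taken from the DFID How To Note.

outputs = [
    {"name": "Output 1", "weight": 40, "score": 2},  # illustrative weight (%) and numeric score
    {"name": "Output 2", "weight": 35, "score": 3},
    {"name": "Output 3", "weight": 25, "score": 4},
]

total_weight = sum(o["weight"] for o in outputs)
overall = sum(o["weight"] * o["score"] for o in outputs) / total_weight
print(round(overall, 2))  # 2.85: each output's score counts in proportion to its weight
```

Making the intended calculation explicit in the guidance, whatever form it actually takes, would answer the question of why the weightings are requested.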

Models of Causality and Causal Inference

by Barbara Befani. An annex to Broadening the Range of Designs and Methods for Impact Evaluations, a report of a study commissioned by the Department for International Development, April 2012, by Elliot Stern (Team Leader), Nicoletta Stame, John Mayne, Kim Forss, Rick Davies and Barbara Befani

Introduction

The notion of causality has given rise to disputes among philosophers which still continue today. At the same time, attributing causation is an everyday activity of the utmost importance for humans and other species, one that most of us carry out successfully outside the corridors of academic departments. How do we do that? And what are the philosophers arguing about? This chapter will attempt to provide some answers, by reviewing some of the notions of causality in the philosophy of science and “embedding” them into everyday activity. It will also attempt to connect these with impact evaluation practices, without embracing one causation approach in particular, but stressing the strengths and weaknesses of each and outlining how they relate to one another. It will be stressed how everyday life, social science and, in particular, impact evaluation all have something to learn from these approaches, each illuminating single, separate, specific aspects of the relationship between cause and effect.

The paper is divided into three parts: the first addresses notions of causality that focus on the simultaneous presence of a single cause and the effect; alternative causes are rejected depending on whether they are observed together with the effect. The basic causal unit is the single cause, and alternatives are rejected in the form of single causes. This model includes multiple causality in the form of single independent contributions to the effect. In the second part, notions of causality are addressed that focus on the simultaneous presence of multiple causes that are linked to the effect as a “block” or whole: the block can be either necessary or sufficient (or neither) for the effect, and single causes within the block can be necessary for a block to be sufficient (INUS causes). The third group discusses models of causality where simultaneous presence is not enough: in order to be defined as such, causes need to be shown to actively manipulate or generate the effect, and the focus is on how the effect is produced, how the change comes about. The basic unit here – rather than a single cause or a package – is the causal chain: fine-grained information is required on the process leading from an initial condition to the final effect.

The second type of causality is something in between the first and third: it is used when there is no fine-grained knowledge of how the effect is manipulated by the cause, yet the presence or absence of a number of conditions can still be spotted along the causal process, which is thus more detailed than the bare “beginning–end” linear representation characteristic of the successionist model.

 

RD Comment: I strongly recommend this paper.

For more on necessary and/or sufficient conditions, see this blog posting, which shows how different combinations of causal conditions can be visually represented and recognised using Decision Trees.
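To make the necessary/sufficient distinction concrete, here is a minimal sketch (the conditions, cases and helper functions are hypothetical, not taken from the Befani annex) that checks whether a combination of conditions is sufficient and/or necessary for an outcome across a small set of observed cases.

```python
# Minimal sketch with hypothetical data: testing whether a combination of
# conditions is sufficient and/or necessary for an outcome across cases.

cases = [
    {"conditions": {"funding", "local_support"},             "outcome": True},
    {"conditions": {"funding"},                              "outcome": False},
    {"conditions": {"local_support"},                        "outcome": False},
    {"conditions": {"funding", "local_support", "training"}, "outcome": True},
]

def is_sufficient(combo, cases):
    # Sufficient: whenever every condition in the combination is present, the outcome occurs.
    relevant = [c for c in cases if combo <= c["conditions"]]
    return bool(relevant) and all(c["outcome"] for c in relevant)

def is_necessary(combo, cases):
    # Necessary: the outcome never occurs unless every condition in the combination is present.
    return all(combo <= c["conditions"] for c in cases if c["outcome"])

combo = {"funding", "local_support"}
print(is_sufficient(combo, cases))        # True: all cases containing the combo show the outcome
print(is_necessary(combo, cases))         # True: no positive case lacks the combo
print(is_sufficient({"funding"}, cases))  # False: funding alone also appears without the outcome
```

A decision tree fitted to case data of this kind can recover such combinations automatically, which is the point made in the blog posting linked above.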

 

DFID’s Approach to Impact Evaluation – Part I

[From Development Impact: News, views, methods, and insights from the world of impact evaluation. Click here https://blogs.worldbank.org/impactevaluations/node/838 to view the full story.]
As part of a new series looking at how institutions are approaching impact evaluation, DI virtually sat down with Nick York, Head of Evaluation, and Gail Marzetti, Deputy Head, Research and Evidence Division.
Development Impact (DI): There has been an increasing interest in impact evaluation (defined as experimental/quasi-experimental analysis of program effects) in DFID. Going forward, what do you see as impact evaluation’s role in how DFID evaluates what it does? How do you see the use of impact evaluation relative to other methods?  
Nick York: The UK has been at the forefront among European countries in promoting the use of impact evaluation in international development and it is now a very significant part of what we do – driven by the need to make sure our decisions and those of our partners are based on rigorous evidence. We are building prospective evaluation into many of our larger and more innovative operational programmes – we have quite a number of impact evaluations underway or planned, commissioned from our country and operational teams. We also support international initiatives including 3ie, where the UK was a founder member and a major funder, the Strategic Impact Evaluation Fund with the World Bank on human development interventions, and NONIE, the network which brings together developing country experts on evaluation to share experiences on impact evaluation with professionals in the UN, bilateral and multilateral donors.
DI: Given the cost of impact evaluation, how do you choose which projects are (impact) evaluated?
NY: We focus on those which are most innovative – where the evidence base is considered to be weak and needs to be improved – and those which are large or particularly risky. Personally, I think the costs of impact evaluation are relatively low compared to the benefits they can generate, or compared to the costs of running programmes using interventions which are untested or don’t work. I also believe that rigorous impact evaluations generate an output – high quality evidence – which is a public good, so although the costs to the commissioning organisation can be high they represent excellent value for money for the international community. This is why 3ie, which shares those costs among several organisations, is a powerful concept.