How to Measure Anything: Finding the Value of Intangibles in Business [and elsewhere]

Posted on 20 November, 2017 – 7:24 AM

3rd Edition by Douglas W. Hubbard (Author)

pdf copy of 2nd edition available here

Building up from simple concepts to illustrate the hands-on yet intuitively easy application of advanced statistical techniques, How to Measure Anything reveals the power of measurement in our understanding of business and the world at large. This insightful and engaging book shows you how to measure those things in your business that until now you may have considered “immeasurable,” including technology ROI, organizational flexibility, customer satisfaction, and technology risk.

Offering examples that will get you to attempt measurements, even when it seems impossible, this book provides you with the substantive steps for measuring anything, especially uncertainty and risk. Don’t wait: read this book and find out:

  • The three reasons why things may seem immeasurable but are not
  • Inspirational examples of where seemingly impossible measurements were resolved with surprisingly simple methods
  • How computing the value of information will show that you probably have been measuring all the wrong things
  • How not to measure risk
  • Methods for measuring “soft” things like happiness, satisfaction, quality, and more

Amazon.com Review: Now updated with new research and even more intuitive explanations, this is a demystifying explanation of how managers can inform themselves to make less risky, more profitable business decisions. This insightful and eloquent book will show you how to measure those things in your own business that, until now, you may have considered “immeasurable,” including customer satisfaction, organizational flexibility, technology risk, and technology ROI.

  • Adds even more intuitive explanations of powerful measurement methods and shows how they can be applied to areas such as risk management and customer satisfaction
  • Continues to boldly assert that any perception of “immeasurability” is based on certain popular misconceptions about measurement and measurement methods
  • Shows the common reasoning for calling something immeasurable, and sets out to correct those ideas
  • Offers practical methods for measuring a variety of “intangibles”
  • Adds recent research, especially with regard to methods that seem like measurement but are in fact a kind of “placebo effect” for management – and explains how to tell effective methods from management mythology
  • Written by recognized expert Douglas Hubbard, creator of Applied Information Economics

How to Measure Anything, Second Edition illustrates how the author has used his approach across various industries and how any problem, no matter how difficult, ill-defined, or uncertain, can lend itself to measurement using proven methods.

 

 


PRISM: TOOLKIT FOR EVALUATING THE OUTCOMES AND IMPACTS OF SMALL/MEDIUM-SIZED CONSERVATION PROJECTS

Posted on 18 November, 2017 – 8:38 AM

WHAT IS PRISM?

PRISM is a toolkit that aims to support small/medium-sized conservation projects to effectively evaluate the outcomes and impacts of their work.

The toolkit has been developed by a collaboration of several conservation NGOs with additional input from scientists and practitioners from across the conservation sector.

The toolkit is divided into four main sections:

Introduction and Key Concepts: Provides a basic overview of the theory behind evaluation relevant to small/medium-sized conservation projects

Designing and Implementing the Evaluation: Guides users through a simple, step-by-step process for evaluating project outcomes and impacts, including identifying what you need to evaluate, collecting evaluation data, analysing/interpreting results, and deciding what to do next.

Modules: Provides users with additional guidance and directs users towards methods for evaluating outcomes/impacts resulting from five different kinds of conservation action:

  • Awareness and Attitudes
  • Capacity Development
  • Livelihoods and Governance
  • Policy
  • Species and Habitat Management

Method factsheets: Outlines over 60 practical, easy-to-use methods and supplementary guidance factsheets for collecting, analysing and interpreting evaluation data

Toolkit Website: https://conservationevaluation.org/
PDF copy of manual – download request form: https://conservationevaluation.org/download/


Recent readings: Replication of findings (not), argument for/against “mixed methods”, use of algorithms (public accountability, cost/benefits, meta data)

Posted on 12 September, 2017 – 10:14 AM

Recently noted papers of interest on my Twitter feed:

  • Go Forth and Replicate: On Creating Incentives for Repeat Studies. Scientists have few direct incentives to replicate other researchers’ work, including precious little funding to do replications. Can that change? 09.11.2017 / BY Michael Schulson
    • “A survey of 1,500 scientists, conducted by the journal Nature last year, suggested that researchers often weren’t telling their colleagues — let alone publishing the results — when other researchers’ findings failed to replicate.”… “Each year, the [US] federal government spends more than $30 billion on basic scientific research. Universities and private foundations spend around $20 billion more, according to one estimate. Virtually none of that money is earmarked for research replication”…”In reality, major scientific communities have been beset these last several years over inadequate replication, with some studies heralded as groundbreaking exerting their influence in the scientific literature — sometimes for years, and with thousands of citations — before anyone bothers to reproduce the experiments and discover that they don’t hold water. In fields ranging from cancer biology to social psychology, there’s mounting evidence that replication does not happen nearly enough. The term “replication crisis” is now well on its way to becoming a household phrase.”
  • WHEN GOVERNMENT RULES BY SOFTWARE, CITIZENS ARE LEFT IN THE DARK. TOM SIMONITE, WIRED, BUSINESS, 08.17.17, 07:00 AM
    • “Most governments the professors queried didn’t appear to have the expertise to properly consider or answer questions about the predictive algorithms they use”…”Researchers believe predictive algorithms are growing more prevalent – and more complex. “I think that probably makes things harder,” says Goodman.”…”Danielle Citron, a law professor at the University of Maryland, says that pressure from state attorneys general, court cases, and even legislation will be necessary to change how local governments think about, and use, such algorithms. “Part of it has to come from law,” she says. “Ethics and best practices never gets us over the line because the incentives just aren’t there.”
  • The evolution of machine learning. Posted Aug 8, 2017 by Catherine Dong (@catzdong) TechCrunch
    • “Machine learning engineering happens in three stages — data processing, model building and deployment and monitoring. In the middle we have the meat of the pipeline, the model, which is the machine learning algorithm that learns to predict given input data. The first stage involves cleaning and formatting vast amounts of data to be fed into the model. The last stage involves careful deployment and monitoring of the model. We found that most of the engineering time in AI is not actually spent on building machine learning models — it’s spent preparing and monitoring those models. Despite the focus on deep learning at the big tech company AI research labs, most applications of machine learning at these same companies do not rely on neural networks and instead use traditional machine learning models. The most common models include linear/logistic regression, random forests and boosted decision trees.”
  • The Most Crucial Design Job Of The Future. What is a data ethnographer, and why is it poised to become so important? 2017.7.24 BY CAROLINE SINDERS. Co-Design
    • Why we need meta data (data about the data we are using). “I advocate we need data ethnography, a term I define as the study of the data that feeds technology, looking at it from a cultural perspective as well as a data science perspective”…”Data is a reflection of society, and it is not neutral; it is as complex as the people who make it.”
  • The Mystery of Mixing Methods. Despite significant progress on mixed methods approaches, their application continues to be (partly) shrouded in mystery, and the concept itself can be subject to misuse. March 28, 2017 By Jos Vaessen. IEG
    • “The lack of an explicit (and comprehensive) understanding of the principles underlying mixed methods inquiry has led to some confusion and even misuses of the concept in the international evaluation community.”
    • Three types of misuse
    • Five valid reasons for using mixed methods: (Triangulation, Complementarity, Development, Initiation, Expansion)
  • To err is algorithm: Algorithmic fallibility and economic organisation. Wednesday, 10 May 2017. NESTA
    • We should not stop using algorithms simply because they make errors. Without them, many popular and useful services would be unviable. However, we need to recognise that algorithms are fallible and that their failures have costs. This points at an important trade-off between more (algorithm-enabled) beneficial decisions and more (algorithm-caused) costly errors. Where lies the balance? Economics is the science of trade-offs, so why not think about this topic like economists? This is what I have done ahead of this blog, creating three simple economics vignettes that look at key aspects of algorithmic decision-making. These are the key questions:
      Risk: When should we leave decisions to algorithms, and how accurate do those algorithms need to be?
      Supervision: How do we combine human and machine intelligence to achieve desired outcomes?
      Scale: What factors enable and constrain our ability to ramp up algorithmic decision-making?
  • A taxonomy of algorithmic accountability. Cory Doctorow / 6:20 am Wed May 31, 2017 Boing Boing
    • “Eminent computer scientist Ed Felten has posted a short, extremely useful taxonomy of four ways that an algorithm can fail to be accountable to the people whose lives it affects: it can be protected by claims of confidentiality (“how it works is a trade secret”); by complexity (“you wouldn’t understand how it works”); unreasonableness (“we consider factors supported by data, even when there’s no obvious correlation”); and injustice (“it seems impossible to explain how the algorithm is consistent with law or ethics”)”

Why have evaluators been slow to adopt big data analytics?

Posted on 9 September, 2017 – 11:45 AM

This is a question posed by Michael Bamberger in his blog posting on the MERL Tech website, titled Building bridges between evaluators and big data analysts. There he puts forward eight reasons (four main ones and four subsidiary points), none of which I disagree with. But I have my own perspective on the same question, and I posted the following points as a comment underneath his blog posting.

My take on “Why have evaluators been slow to adopt big data analytics?”

1. “Big data? I am having enough trouble finding any useful data! How to analyse big data is ‘a problem we would like to have’.” This is what I suspect many evaluators are thinking.

2. “Data mining is BAD” – because data mining is seen by evaluators as something ad hoc and non-transparent, whereas the best data mining practices are systematic and transparent.

3. “Correlation does not mean causation” – many evaluators have not updated this formulation to the more useful “Association is a necessary but insufficient basis for a strong causal claim”.

4. Evaluators focus on explanatory models and give little attention to the uses of predictive models, yet both are useful in the real world, separately and in combination. Some predictive models can become explanatory models, through follow-up within-case investigations.

5. Lack of appreciation of the limits of manual hypothesis formulation and testing (useful as it can be) as a means of accumulating knowledge. In a project with four outputs and four outcomes there can be 16 different individual causal links between outputs and outcomes, but 2 to the power of 16 possible combinations of these causal links. That’s a lot of theories to choose from (65,536 – see the first sketch at the end of this list). In this context, search algorithms can be very useful.

6. Lack of knowledge and confidence in the use of machine learning software. There is still work to be done to make this software more user-friendly. RapidMiner, BigML, and EvalC3 are heading in the right direction.

7. Most evaluators probably don’t know that you can use the above software on small data sets. These tools don’t only work with large data sets. Yesterday I was using EvalC3 with a data set describing just 25 cases.

8. The difficulty of understanding some machine learning findings. Decision tree models (one kind of machine learning) are eminently readable, as the second sketch below illustrates, but few can explain the internal logic of specific prediction models generated by artificial neural networks (another kind, often used for classifying images). Lack of explainability presents a major problem for public accountability, and accountability for the behaviour and use of algorithms is shaping up to be a BIG issue, as highlighted in this week’s Economist Leader article on advances in facial recognition software: What machines can tell from your face
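
To make the arithmetic in point 5 concrete, here is a minimal sketch in plain Python (the output and outcome labels are invented for illustration) that counts the possible output-to-outcome links and the combinations they can form:

    from itertools import product

    outputs = ["Output A", "Output B", "Output C", "Output D"]
    outcomes = ["Outcome 1", "Outcome 2", "Outcome 3", "Outcome 4"]

    # Each output-outcome pair is one possible causal link: 4 x 4 = 16 links
    possible_links = list(product(outputs, outcomes))
    print(len(possible_links))       # 16

    # A candidate theory of change is any subset of those links,
    # i.e. each link is either present or absent: 2**16 combinations
    print(2 ** len(possible_links))  # 65536

Add a few more outputs and outcomes and the number of candidate combinations quickly grows far beyond what manual hypothesis testing can cover, which is where search algorithms earn their keep.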
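
As a rough illustration of points 6 to 8, the sketch below fits a decision tree to a small synthetic data set of 25 cases and prints the resulting rules. It uses scikit-learn rather than EvalC3 or the other tools named above (EvalC3 is an Excel-based tool, so this is only an analogy, not a reproduction of its workflow), and the case attributes and the outcome rule are invented:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    n_cases = 25

    # Three binary attributes per case (e.g. presence/absence of project activities)
    X = rng.integers(0, 2, size=(n_cases, 3))
    # Toy rule: the outcome is achieved when the first two attributes are both present
    y = (X[:, 0] & X[:, 1]).astype(int)

    model = DecisionTreeClassifier(max_depth=3, random_state=0)
    model.fit(X, y)

    # The fitted model can be read directly as a set of if/then rules
    print(export_text(model, feature_names=["attr_1", "attr_2", "attr_3"]))

The printed tree reads as a set of if/then statements, which is what makes this family of models so much easier to inspect and explain than a neural network.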

Update: 2017 09 19: See Michael Bamberger’s response to my comments above in the Comment section below. They are copied from his original response posted here http://merltech.org/building-bridges-between-evaluators-and-big-data-analysts/

 

 


Order and Diversity: Representing and Assisting Organisational Learning in Non-Government Aid Organisations.

Posted on 23 July, 2017 – 10:38 AM

No, history did not begin three years ago ;-)

“It was twenty years ago today…” well, almost. Here is a link to my 1998 PhD thesis with the above title. It was based on field work I carried out in Bangladesh between 1992 and 1995. Chapter 8 describes the first implementation of what later became the Most Significant Change impact monitoring technique. But there is a lot more of value in this thesis as well, including an analysis of the organisational learning literature up to that date, an analysis of the Bangladesh NGO sector in the early 1990s, and a summary of thinking about evolutionary epistemology. Unlike all too many PhDs, this one was useful, even for the immediate subjects of my field work. CCDB was still using the impact monitoring process I helped them set up (i.e. MSC) when I visited them again in the early 2000s, albeit with some modifications to suit its expanded use.

Abstract: The aim of this thesis is to develop a coherent theory of organisational learning which can generate practical means of assisting organisational learning. The thesis develops and applies this theory to one class of organisations known as non-government organisations (NGOs), and more specifically to those NGOs who receive funds from high income countries but who work for the benefit of the poor in low income countries. Of central concern are the processes whereby these NGOs learn from the rural and urban poor with whom they work.
The basis of the theory of organisational learning used in this thesis is modern evolutionary theory, and more particularly, evolutionary epistemology. It is argued that this theory provides a means of both representing and assisting organisational learning. Firstly, it provides a simple definition of learning that can be operationalised at multiple scales of analysis: that of individuals, organisations, and populations of organisations. Differences in the forms of organisational learning that do take place can be represented using a number of observable attributes of learning which are derived from an interpretation of evolutionary theory. The same evolutionary theory can also provide useful explanations of processes thus defined and represented. Secondly, an analysis of organisational learning using these observable attributes and background theory also suggests two ways in which organisational learning can be assisted. One is the use of specific methods within NGOs: a type of participatory monitoring. The second is the use of particular interventions by their donors: demands for particular types of information which are indicative of how and where the NGO is learning.

In addition to these practical implications, it is argued that a specific concern with organisational learning can be related to a wider problematic which should be of concern to Development Studies: one which is described as “the management of diversity”. Individual theories, organisations, and larger social structures may not survive in the face of diversity and change. In surviving they may constrain and/or enable other agents, with feedback effects into the scale and forms of diversity possible. The management of diversity can be analysed descriptively and prescriptively, at multiple scales of aggregation.

 


Twitter posts tagged as #evaluation

Posted on 13 July, 2017 – 8:58 AM

This post should feature a continually updated feed of all tweets tagged as #evaluation



REAL-WORLD CHALLENGES TO RANDOMIZATION AND THEIR SOLUTIONS

Posted on 8 May, 2017 – 7:37 PM

Kenya Heard, Elisabeth O’Toole, Rohit Naimpally, Lindsey Bressler. J-PAL North America, April 2017. pdf copy here

INTRODUCTION
Randomized evaluations, also called randomized controlled trials (RCTs), have received increasing attention from practitioners, policymakers, and researchers due to their high credibility in estimating the causal impacts of programs and policies. In a randomized evaluation, a random selection of individuals from a sample pool is offered a program or service, while the remainder of the pool does not receive an offer to participate in the program or service. Random assignment ensures that, with a large enough sample size, the two groups (treatment and control) are similar on average before the start of the program. Since members of the groups do not differ systematically at the outset of the experiment, any difference that subsequently arises between the groups can be attributed to the intervention rather than to other factors.
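
A minimal sketch of the random assignment logic described above, using simulated data (the pool size, outcome scale and the +0.5 treatment effect are all invented for illustration):

    import numpy as np

    rng = np.random.default_rng(42)
    pool_size = 1000

    # Randomly assign half of the sample pool to be offered the program
    treatment = np.zeros(pool_size, dtype=bool)
    treatment[rng.choice(pool_size, size=pool_size // 2, replace=False)] = True

    # Simulated outcomes: a common baseline plus a hypothetical +0.5 effect for the treated
    outcome = rng.normal(loc=10.0, scale=2.0, size=pool_size) + 0.5 * treatment

    # Because assignment is random, the groups are similar at baseline on average,
    # so the simple difference in mean outcomes estimates the causal impact
    impact_estimate = outcome[treatment].mean() - outcome[~treatment].mean()
    print(round(impact_estimate, 2))  # close to the simulated 0.5 effect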

Researchers, practitioners, and policymakers face many real-world challenges while designing and implementing randomized evaluations. Fortunately, several of these challenges can be addressed by designing a randomized evaluation that accommodates existing programs and addresses implementation challenges.

Program design challenges: Certain features of a program may present challenges to using a randomized evaluation design. This document showcases four of these program features and demonstrates how to alter the design of an evaluation to accommodate them.
• Resources exist to extend the program to everyone in the study area
• Program has strict eligibility criteria
• Program is an entitlement
• Sample size is small

Implementation challenges: There are a few challenges that may threaten a randomized evaluation when a program or policy is being implemented. This document features two implementation challenges and demonstrates how to design a randomized evaluation that mitigates threats and eliminates difficulties in the implementation phase of an evaluation.
• It is difficult for service providers to adhere to random assignment due to logistical or political reasons
• The control group finds out about the treatment, benefits from the treatment, or is harmed by the treatment

 

TABLE OF CONTENTS
INTRODUCTION
PROGRAM DESIGN CHALLENGES
Challenge #1: Resources exist to extend the program to everyone in the study area
Challenge #2: Program has strict eligibility criteria
Challenge #3: Program is an entitlement
Challenge #4: Sample size is small
IMPLEMENTATION CHALLENGES
Challenge #5: It is difficult for service providers to adhere to random assignment due to logistical or political reasons
Challenge #6: Control group finds out about the treatment, benefits from the treatment, or is harmed by the treatment
SUMMARY TABLE
GLOSSARY
REFERENCES


Riddle me this: How many interviews (or focus groups) are enough?

Posted on 8 May, 2017 – 7:37 PM

Emily Namey, R&E Search for Evidence http://researchforevidence.fhi360.org/author/enamey

“The first two posts in this series describe commonly used research sampling strategies and provide some guidance on how to choose from this range of sampling methods. Here we delve further into the sampling world and address sample sizes for qualitative research and evaluation projects. Specifically, we address the often-asked question: How many in-depth interviews/focus groups do I need to conduct for my study?

Within the qualitative literature (and community of practice), the concept of “saturation” – the point when incoming data produce little or no new information – is the well-accepted standard by which sample sizes for qualitative inquiry are determined (Guest et al. 2006; Guest and MacQueen 2008). There’s just one small problem with this: saturation, by definition, can be determined only during or after data analysis. And most of us need to justify our sample sizes (to funders, ethics committees, etc.) before collecting data!

Until relatively recently, researchers and evaluators had to rely on rules of thumb or their personal experiences to estimate how many qualitative data collection events they needed for a study; empirical data to support these sample sizes were virtually non-existent. This began to change a little over a decade ago. Morgan and colleagues (2002) decided to plot (and publish!) the number of new concepts identified in successive interviews across four datasets. They found that nearly no new concepts were found after 20 interviews. Extrapolating from their data, we see that the first five to six in-depth interviews produced the majority of new data, and approximately 80% to 92% of concepts were identified within the first 10 interviews.

Emily’s blog continues here http://researchforevidence.fhi360.org/riddle-me-this-how-many-interviews-or-focus-groups-are-enough
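
As a back-of-the-envelope illustration of how a saturation curve of the kind Morgan and colleagues plotted can be tallied, the sketch below counts how many previously unseen concepts each successive interview adds (the coded concepts are invented for the example):

    # Each interview is represented by the set of concepts coded in it
    interviews = [
        {"cost", "access", "trust"},          # interview 1
        {"cost", "staff attitude"},           # interview 2
        {"access", "distance", "trust"},      # interview 3
        {"cost", "distance"},                 # interview 4
        {"staff attitude", "opening hours"},  # interview 5
    ]

    seen = set()
    for i, concepts in enumerate(interviews, start=1):
        new = concepts - seen
        seen |= concepts
        print(f"Interview {i}: {len(new)} new concept(s), {len(seen)} in total")
    # Saturation is approached when the count of new concepts falls to (near) zero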


How to find the right answer when the “wisdom of the crowd” fails?

Posted on 9 April, 2017 – 6:39 PM

Dizikes, P. (2017). Better wisdom from crowds. MIT News Office. Retrieved from http://news.mit.edu/2017/algorithm-better-wisdom-crowds-0125 (pdf copy)

Ross, E. (n.d.). How to find the right answer when the “wisdom of the crowd” fails. Nature News. https://doi.org/10.1038/nature.2017.21370

Prelec, D., Seung, H. S., & McCoy, J. (2017). A solution to the single-question crowd wisdom problem. Nature, 541(7638), 532–535. https://doi.org/10.1038/nature21054

Dizikes: The wisdom of crowds is not always perfect, but two scholars at MIT’s Sloan Neuroeconomics Lab, along with a colleague at Princeton University, have found a way to make it better. Their method, explained in a newly published paper, uses a technique the researchers call the “surprisingly popular” algorithm to better extract correct answers from large groups of people. As such, it could refine “wisdom of crowds” surveys, which are used in political and economic forecasting, as well as many other collective activities, from pricing artworks to grading scientific research proposals.

The new method is simple. For a given question, people are asked two things: What they think the right answer is, and what they think popular opinion will be. The variation between the two aggregate responses indicates the correct answer. [Ross: In most cases, the answers that exceeded expectations were the correct ones. Example: If Answer A was given by 70% but 80% expected it to be given and Answer B was given by 30% but only 20% expected it to be given then Answer B would be the “surprisingly popular” answer].

“In situations where there is enough information in the crowd to determine the correct answer to a question, that answer will be the one [that] most outperforms expectations,” says paper co-author Drazen Prelec, a professor at the MIT Sloan School of Management as well as the Department of Economics and the Department of Brain and Cognitive Sciences.

The paper is built on both theoretical and empirical work. The researchers first derived their result mathematically, then assessed how it works in practice, through surveys spanning a range of subjects, including U.S. state capitals, general knowledge, medical diagnoses by dermatologists, and art auction estimates.

Across all these areas, the researchers found that the “surprisingly popular” algorithm reduced errors by 21.3 percent compared to simple majority votes, and by 24.2 percent compared to basic confidence-weighted votes (where people express how confident they are in their answers). And it reduced errors by 22.2 percent compared to another kind of confidence-weighted votes, those taking the answers with the highest average confidence levels.

But “… Prelec and Steyvers both caution that this algorithm won’t solve all of life’s hard problems. It only works on factual topics: people will have to figure out the answers to political and philosophical questions the old-fashioned way”

Rick Davies comment: This method could be useful in an evaluation context, especially where participatory methods are needed or potentially useful.
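
A minimal sketch of the “surprisingly popular” selection rule quoted above, using the Answer A/Answer B figures from the bracketed example (this illustrates only the selection step, not the authors’ full derivation):

    # Shares of respondents choosing each answer, and the average predicted shares
    actual = {"A": 0.70, "B": 0.30}
    predicted = {"A": 0.80, "B": 0.20}

    # The "surprisingly popular" answer is the one whose actual support
    # most exceeds the support respondents predicted it would receive
    surprise = {answer: actual[answer] - predicted[answer] for answer in actual}
    winner = max(surprise, key=surprise.get)
    print(winner)  # 'B': chosen less often than A, but more often than predicted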


Fact Checking websites serving as public evidence-monitoring services: Some sources

Posted on 2 March, 2017 – 7:42 AM

These services seem to be getting more attention lately, so I thought it would be worthwhile compiling a list of some of the kinds of fact checking websites that exist, and how they work.

Fact checkers have the potential to influence policies at all stages of the policy development and implementation process, not by promoting particular policy positions based on evidence, but by policing the boundaries of what should be considered acceptable as factual evidence. They are responsive rather than proactive.

International

American websites

  • Politifact – PolitiFact is a fact-checking website that rates the accuracy of claims by elected officials and others who speak up in American politics.
  • Fact Check – They monitor the factual accuracy of what is said by major U.S. political players in the form of TV ads, debates, speeches, interviews and news releases.
  • Media Bias/Fact Check – claims to be “the most comprehensive media bias resource on the internet”, but content is mainly American

Australia

United Kingdom

Discussions of the role of fact checkers

A related item, just seen…

  • This site is “taking the edge off rant mode” by making readers pass a factual knowledge quiz before commenting. “If everyone can agree that this is what the article says, then they have a much better basis for commenting on it.”

Update 20/03/2017: Read Tim Harford’s blog posting on The Problem With Facts (pdf copy here), and the communication value of eliciting curiosity.
