Participatory aggregation of qualitative information (PAQI)
[or, Network visualisations of qualitative data]
This page has six sections:
- A summary of the core idea: Combining the use of card/pile sorting and network visualisation software
- How to build and explore collective categorisations of qualitative data, sourced from multiple participants
- How to build and explore collective theories-of-change, from the views of multiple participants
- Bringing categories and causal links together
- (new) Associative links, connecting stories and ideas
- References to related work
1. A summary of the core idea
Problem: How do you aggregate large amounts of qualitative data in a way that neither destroys the interesting details nor prematurely imposes your own interpretations on the data? We often do both, for example by counting the frequency of references to things or events of specific interest to ourselves as researchers/evaluators.
Assumption: If we are able to develop better representations of complex bodies of information then this will provide us with more informed choices about how to respond to the content of that information.
The core idea: A combination of two methods can help us aggregate and analyse qualitative information in a way that is participatory, transparent, and systematic.
The two methods are:
1. Pile sorting / card sorting: A simple participatory method of eliciting people’s tacit knowledge, especially the way they categorise people, objects, events etc
2. Social Network Analysis (software): A systematic means of aggregating, visualising and then exploring relationships between people, objects, events
Linking concept: When people categorise people, objects, events, etc., they create relationships between those entities. Two or more entities in the same category can be seen as related to each other by that joint membership. And when people categorise objects they also add information to them, in the form of category labels or descriptions (what Dave Snowden calls self-indexing).
2. Building and exploring collective categorisations of qualitative data
A. Pile sorting
Pile or card sorting is a very simple exercise, where participants are asked to sort a set of objects into groups, on the basis of their similarity (i.e. the attributes that they share), as seen by the participant. Having done so, participants are then asked to explain what the objects in each group have in common, and a label is developed for that group, on the basis of that description.
The particular kind of sorting proposed here is “open sorting” by multiple participants, who are given a common set of objects to sort into categories of their choice. Open sorting means participants are allowed to sort the set of objects into any number of categories, as they see fit.
This process can be made more participatory if the objects themselves are generated by the participants, prior to their briefing on the sorting exercise. Participants representing different stakeholders are asked to brainstorm a set of ideas, each of which is written on a filing card, or Post-It note. These cards could describe their views on:
1. Possible objectives for a project (if the focus is on planning),
or
2. Impacts of the project that have been noticed so far (if the focus is on evaluation)
With small groups sorting could be done by individuals. With larger groups, it may be more appropriate to have sub-groups (representing different interests) do their own joint sorting exercise.
Sorting exercises can be done in workshop settings, or online, using services such as WebSort.net (my preference) or OptimalSort. Online sorting can be efficient in terms of use of time, but opportunities are lost to discuss with the participant their experience of the sorting exercise and their rationale for the completed sorting.
PS1: I have set up a separate post on references and resources on card sorting.
PS2: How is pile sorting different from tagging? (a) Tags are usually only one or two words long, whereas descriptions given in pile sort exercises can be whole sentences or longer, so the qualitative data is richer; (b) The same tag may be applied to various items at different points in time, and as a result its meaning may vary each time. Descriptions given during pile sorts are given to a set of objects at the same time, so there is likely to be more consistency of meaning.
B. Network analysis of card sorting results
Once you have results from a set of card sorting exercises there are three kinds of network visualisations that can be produced, showing three kinds of relationships:
- Between the sorted items
- Example: A network diagram showing similarities between 24 districts in Indonesia, as separately pile sorted by 5 staff members of a project working in all those districts.
- Items that have been categorised in the same way by different respondents are shown with strong (thick) links.
- Groups of items with similar characteristics are visible as cliques or clusters of items. For example, Alore, Sumba Barat and Sumba Timur
- PS: It would be useful to ask the pile sort participants to look at these aggregated results and identify any other common features of the members of each of the clusters
- Items that were categorised differently by different respondents have weak links and are more likely to be on the periphery of the network.
- Between the categories used to describe them,
- Example: A network diagram showing the similarities in the categories used by the 5 staff members, to classify the 24 districts.
- Categories that have many of the same items as members are shown to be strongly linked. For example, in the Indonesian project example, the A4 category label was “These are remote areas” and the C9 label was “Islands, you need boats to get there. Small populations, different coping mechanisms”. Frequently shared categories tell us about common concerns.
- Categories with few shared items as members are shown as having weak or non-existent links. For example, those on the top left of the network diagram. These may be of greater value, because they are telling us something that other categories don’t.
- Between the participants who sorted them
- Example: A network diagram showing the connections between these 5 participants, arising from similarities in the way they categorised the items
- Participants who have categorised many of the items in the same groups are shown as having strong links. PS: In the example above, there seem to be more similarities within genders than across genders. There are two clusters, one of men and one of women.
The network diagrams referred to above have been produced using UCINET & NetDraw. I have set up a separate web page on the details of the data processing steps that need to be followed to generate each of these visualisations with this widely used software package.
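The three kinds of links described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UCINET procedure: the participants, category labels and items are made up, and the weighting rules (count of participants who co-sorted a pair of items, count of items shared by two categories, count of item pairs that two participants both put together) are one plausible reading of "strong" and "weak" links.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical open-sort results: participant -> {category label: set of items}
sorts = {
    "P1": {"remote": {"A", "B", "C"}, "urban": {"D", "E"}},
    "P2": {"islands": {"A", "B"}, "mainland": {"C", "D", "E"}},
    "P3": {"hard to reach": {"A", "B", "C"}, "well served": {"D", "E"}},
}


def item_links(sorts):
    """For each pair of items, count the participants who put them in the same pile."""
    weights = defaultdict(int)
    for piles in sorts.values():
        for members in piles.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
    return dict(weights)


def category_links(sorts):
    """For each pair of categories from different participants, count shared items."""
    cats = [(person, label, members)
            for person, piles in sorts.items()
            for label, members in piles.items()]
    weights = {}
    for (p1, l1, m1), (p2, l2, m2) in combinations(cats, 2):
        shared = len(m1 & m2)
        if p1 != p2 and shared:
            weights[((p1, l1), (p2, l2))] = shared
    return weights


def participant_links(sorts):
    """For each pair of participants, count the item pairs both of them put together."""
    pairs = {
        person: {pair
                 for members in piles.values()
                 for pair in combinations(sorted(members), 2)}
        for person, piles in sorts.items()
    }
    return {(p, q): len(pairs[p] & pairs[q])
            for p, q in combinations(sorted(pairs), 2)}


for name, links in [("items", item_links(sorts)),
                    ("categories", category_links(sorts)),
                    ("participants", participant_links(sorts))]:
    print(name, links)
```

Each dictionary is a weighted edge list: the keys are the two linked nodes and the value is the link strength, which is the kind of information a network drawing package needs to set line thickness.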
3. Building and exploring collective theories-of-change (ToC)
How do you get many different stakeholders to develop some form of collective theory-of-change, through a process that is systematic and transparent, with minimal biasing influence by the facilitator, and without assuming that everyone will agree with each other? The method described below would be most useful where there is no central and authoritative version of what the theory of change is, for example among a group of independent organisations, or in a network that has no central secretariat.
This method has some additional advantages. Because it is based on aggregating individual decisions it means that the ownership of individual parts of the collective theory-of-change can be identified. This would help highlight where there are different constituencies of support for different parts of an aggregate ToC. It could also help when there is a need to identify interest in, or responsibility for, the implementation, monitoring or evaluation of specific parts of the ToC.
It is also conceivable that the method could be used retrospectively, to reconstruct history, in the form of a theory of change about what has already happened (versus what will happen). This may be especially useful in advocacy activities, which can be very reactive and not as amenable to planning.
The method is as follows:
- Participants brainstorm a set of desired outcomes (events in the short and long term) and record them on Post-Its or some other similar medium that can be easily moved around.
- Alt: In the case of advocacy activities it may be better to brainstorm a list of people, or organisations, who will be the subjects and objects of influencing activities
- Individual participants are then asked to sort items into groups of any size. Within each group they identify the one event seen as the influencer of the others, then list the others in consecutive order of the extent to which they are influenced, from most to least. For example, one person’s sort results may look like this:
- Group 1: A, D, F, C (means A is believed to influence D, F and C, and to have most influence on D, less on F and least on C)
- Group 2: D, A, G (means D is believed to influence A and G, and to have most influence on A and less on G)
- Group 3: F, C, A (means F is believed to influence C and A, and to have most influence on C and less on A)
- Participants can use as many of the items generated in the brainstorm as they want to, but they don’t need to use all of them. The same item can be used in more than one grouping (thus recognising that it has multiple causes)
- The pile sort data is then entered into a file that is readable by network visualisation software
- There are two software packages that can be used: UCINET and yED.
- yED is probably the simplest so I will describe it here
- Here is an imagined set of cause sort data
- These are then entered into an Excel file available here. This is in an edgelist format, where each row shows one causal relationship, along with the attributes of that relationship. Here the relationship has a rank importance, some qualitative description and the source person who identified the relationship.
- The Excel file is then opened by yED and this kind of network diagram can be produced. Nodes show card ID numbers, links show the rank given to them, and node size reflects the number of incoming links. I have not yet worked out how to automate the coloring of links according to who proposed them.
- Using UCINET
- The same pile sort results can be recorded in Ucinet’s DL Rankedlist format. (See example). Each row lists an item and the others it affects.
- The DL format is then converted to a UCINET file that can be read by NetDraw, using these commands: Data>Inputs text file>DL>Input text file in DL format
- This new file is then opened by NetDraw. See this example network diagram, using the example data above (PS: This was randomly generated, not the result of a real participatory process).
- Here each participant’s causal links are shown in a different color. Grey links are ones which two or more participants agreed on. Big nodes are those with many incoming links, i.e. where the impact of all the causal influences should be most evident.
- It is possible to filter links on the basis of which participant proposed them, and what rank was given to them
- Qualitative information can be imported into the file, to provide commentary on each item and link by creating node and link attribute.txt files (to be detailed)
- A final participatory stage could be added to this process. For example, participants could be asked to look at the aggregate network structure and then discuss and come to agreement on the relative importance of the causal links from any given node to others. The resulting ranking would help filter out the weaker causal links and create a simpler model that would be easier to work with and communicate to others.
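As a rough sketch of the data entry step above, the ranked groups can be turned into edgelist rows with a short Python script. The first participant's groups are the ones given in the example; the second participant ("P2") and the column names are hypothetical, added only so that agreement between participants can be shown.

```python
from collections import Counter

# P1's groups come from the example above; P2 is a made-up second participant.
# In each group the first item is the influencer, the rest are ranked effects.
cause_sorts = {
    "P1": [["A", "D", "F", "C"], ["D", "A", "G"], ["F", "C", "A"]],
    "P2": [["A", "D", "G"], ["G", "C"]],
}


def to_edges(cause_sorts):
    """Return one (source, target, rank, person) row per causal link."""
    edges = []
    for person, groups in cause_sorts.items():
        for cause, *effects in groups:
            for rank, effect in enumerate(effects, start=1):
                edges.append((cause, effect, rank, person))
    return edges


edges = to_edges(cause_sorts)

# Count how many different participants proposed each link.
proposals = {(person, source, target) for source, target, _, person in edges}
support = Counter((source, target) for _, source, target in proposals)

# Links proposed by two or more participants (the grey links in the diagram).
shared = sorted(link for link, n in support.items() if n >= 2)

# Number of distinct incoming links per node (used for node size).
in_degree = Counter(target for _, target in support)

for row in edges:
    print(row)
print("shared links:", shared)
print("in-degree:", dict(in_degree))
```

Each printed row corresponds to one line of the edgelist: source, target, rank and the person who proposed the link. A free-text description column could be added in the same way.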
PS: Expected linkages between events in a Theory of Change can also be identified using a matrix in an Excel spreadsheet, projected on to the wall of a workshop. I have found this useful when trying to identify the details of how various project Outputs were expected to influence various project Outcomes (aka Purpose level changes in a Logical Framework)
- Outputs are listed in rows and Outcomes in columns. Cells detail the expected relationship between the row Output and the column Outcome. 100 percentage points are allocated down each column, according to participants’ views of how much each row Output is expected to influence that column Outcome. The values given in all the cells of a given row are added up to provide an indication of the relative importance of that row Output. This is one means of generating a set of Output weightings, as required by DFID in its Annual Review reports (along with achievement scores for each Output).
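The arithmetic of this matrix is simple enough to show with made-up numbers. In the sketch below the Output and Outcome names and the points are hypothetical, and the final step of rescaling the row totals to percentages is my own assumption about how the totals might be turned into weightings.

```python
outputs = ["Output 1", "Output 2", "Output 3"]
outcomes = ["Outcome A", "Outcome B"]

# Hypothetical allocations: 100 points shared out down each Outcome column.
points = {
    "Outcome A": {"Output 1": 50, "Output 2": 30, "Output 3": 20},
    "Outcome B": {"Output 1": 10, "Output 2": 60, "Output 3": 30},
}

# Check that every column really does add up to 100.
for outcome in outcomes:
    assert sum(points[outcome].values()) == 100, outcome

# Add up each row to indicate the relative importance of that Output.
row_totals = {o: sum(points[c][o] for c in outcomes) for o in outputs}

# Assumption: express the row totals as percentage weightings.
grand_total = sum(row_totals.values())
weights = {o: round(100 * total / grand_total, 1) for o, total in row_totals.items()}

print(row_totals)
print(weights)
```

With these numbers Output 2 receives the largest weighting, because it is expected to contribute most across the two Outcomes taken together.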
4. Bringing categories and causal links together
This is a more speculative section. I recently posted the following comments on the MSC email list, about the imaginative use of MSC by Claudia Fontes:
“I have been reading a draft report of Claudia Fontes’ work with DOEN Foundation in the Netherlands, which I have mentioned here before. In one visited country the initial set of MSC stories were sorted by the participants into four (I think) domains, based on the similarities that the stories shared with each other. (Note: Not on the basis of their fit with different officially defined goals/objectives). Based on participants’ comments, Claudia then constructed a relatively simple interpretation of how each of these kinds of events (as in each grouping) was perceived to influence one or more of the others.
In effect she was constructing a larger scale story, that brought all the individual stories into a larger and more coherent picture. I thought this had a lot of merit. It is a way of developing a macro theory of change, from the bottom up (i.e. individual stories).
I could also see potential for how participatory exercises could be developed to enable the participants themselves to directly identify which kind of events (i.e. sets of stories) was expected to influence which other kinds of events (sets of stories)…”
It connects to a phrase in Dave Snowden’s text on narrative research: “At the heart of this project is a view of meta-narrative as an emergent property or strange attractor arising from social interaction which is discoverable and actionable…”
The risk with any meta-narrative is that it becomes exclusive and dominant. If the causal links between sets of stories were identified through the process outlined in section 3 above (i.e. building up an aggregate view from many individual views), minority interpretations would still be visible and open to investigation.
5. Associative links
In this August 2010 posting (Meta-narratives, evaluation and complexity) I raised the idea of a network of stories.
“Stories beget stories. The telling of one can prompt the telling of another. If stories can be seen as linked in this way, then as the number of stories recounted grows we could end up with a network of stories. Some stories in that network may be told more often than others, because they are connected to many others, in the minds of the storytellers. These stories might be what complexity science people call “attractors”. Although storytellers may start off telling various different stories, there is a likelihood many of them will end up telling this particular story, because of its connectedness, its position in the network. If these stories are negative, in the sense of provoking antipathy towards others in the same community, then this type of structure may be of concern. Ideally the attractors, the highly connected stories in the network, would be positive stories, encouraging peace and cooperation with others. This network structure of stories could be explored by an evaluator asking questions like “What other stories does this story most remind you of?” or, “Which of these stories does that story most remind you of?” Or versions thereof. When comparing changes over time the evaluator’s focus would then be on the changing contents of the strongly connected versus weakly connected stories.” …and the overall structure of the network
The answers to the questions “What other stories does this story most remind you of?” or “Which of these stories does that story most remind you of?” would provide data on network linkages of the kind already discussed above. We can list the answers in the form of two lists in Excel that can later be imported by network software. Firstly, we can list all associations as links in From and To columns in one worksheet. Additional columns in this worksheet can describe the attributes of each link, such as the participants’ explanation of how they saw the connection between the two stories. Secondly, we can list all the stories as nodes in one column in a second worksheet. Additional columns in this worksheet can describe the attributes of each of the stories, as coded by the interviewer, or even by the respondents themselves. This formatted network data can then be imported by Ucinet, Visualyser or yED, and possibly others.
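A minimal sketch of those two lists, using Python's csv module in place of Excel worksheets. The story IDs, explanations, the "Tone" attribute and the "TimesRecalled" column are all hypothetical; the count of how often each story is named as the one another story "reminds you of" is included as one simple way of spotting candidate attractor stories.

```python
import csv
import io
from collections import Counter

# Hypothetical answers: (story asked about, story it most reminded the
# respondent of, the respondent's explanation of the connection).
answers = [
    ("S1", "S3", "Both are about losing land"),
    ("S2", "S3", "Same village, same year"),
    ("S3", "S4", "The dispute led to the meeting"),
    ("S4", "S3", "The meeting recalled the dispute"),
]

# Hypothetical story attribute, as coded by the interviewer.
stories = {"S1": "negative", "S2": "negative", "S3": "negative", "S4": "positive"}


def to_csv(header, rows):
    """Return the header and rows as CSV text, one worksheet per call."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# First list: one row per associative link, with its attributes.
links_sheet = to_csv(["From", "To", "Explanation"], answers)

# How many times each story was named as the one being recalled.
times_recalled = Counter(to for _, to, _ in answers)

# Second list: one row per story, with its attributes.
nodes_sheet = to_csv(
    ["Story", "Tone", "TimesRecalled"],
    [(story, tone, times_recalled[story]) for story, tone in stories.items()],
)

most_recalled = times_recalled.most_common(1)[0][0]

print(links_sheet)
print(nodes_sheet)
print("Most recalled story:", most_recalled)
```

In this made-up data the most recalled story is a negatively coded one, which is the kind of structure the quoted passage suggests may be of concern.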
6. References
My earlier explanations of this type of analysis can be found here:
- Reflections on Dave Snowden’s presentations on sense-making and complexity (2009),
- A re-analysis of results of an objective identification exercise, for a bushfire prevention program(2007)
“I See How You Think: Using Influence Diagrams to Support Dialogue” (2009) Newell B, Proust K. ANU Centre for Dialogue. The authors describe how they facilitate individuals to develop their own causal networks, then through discussion, create a new shared causal network. As they note, skilled facilitation is very important in this type of highly participatory process.
Assessing Local Knowledge Use in Agroforestry Management with Cognitive Maps (2009) Marney E. Isaac, Evans Dawoe, Krystyna Sieciechowicz, in Environmental Management, Volume 43, Number 6. The authors interviewed 12 individual cocoa farmers to establish their views of the causal steps, and linkages between them, that connected up an initial step of “clearing land” to a final end point of “productive cocoa” or “less productive cocoa”. They allowed for emergent steps that may not be common for all interviewed farmers. However, key words were identified to represent common steps. Data was visualised by Decision Explorer software, but analysed using common SNA measures (e.g. number of nodes, links, density, degree centrality). Comparisons of farmers’ cognitive maps were made on these variables. They did not however generate an aggregate cognitive map, using the steps named and used by two or more respondents. They usefully differentiated between “ordinary variables” (nodes) and “transmitter variables”, with the former having bi-directional links and the latter having outward links only. The latter were seen as factors out of the control of the farmer, whereas the former were seen as being within control. The status of some of the nodes was used as management indicators to guide decision making. They also note that “Amid complex decision making processes the similarity of the cognitive maps suggests a high likelihood of generalizing individual farmer management techniques. This similarity may be strategically beneficial for regional shifts in agrarian policy toward sustainable practices at the landscape scale”
Visualizing Proximity Data (2007) Rich DeJordy, Stephen P. Borgatti, Chris Roussin, Daniel S. Halgin, on the merits of network models versus multi-dimensional scaling (MDS) for analysing the results of pile sorts (described in the title as proximity data). They identified the potential well before I did. I have been more focused on its application.
Teen Photovoice Project: A Pilot Study to Promote Health Through Advocacy (2007) by Jonathan W. Necheles, MD, MPH, Emily Q. Chung, MPH, Jennifer Hawes-Dawson, BA, Gery W. Ryan, PhD, La’Shield B. Williams, Heidi N. Holmes, Kenneth B. Wells, MD, MPH, Mary E. Vaiana, PhD, Mark A. Schuster, MD, PhD. This paper describes a network visualisation of pile sorting of photographs taken by participants. Two pile sorts were carried out. The first was an unconstrained pile sort, generating 41 categories (described as themes). The second was a constrained pile sort where the researchers seem to have predefined the common categories to be used by all participants, based on the results of the first sorting. The results of these piles sorts were then visualised as a two- mode network (group labels x items), and then shared and discussed with the participants. “Participants were asked to interpret the relationships between piles and pictures to foster a better understanding of how they perceived the pictures and themes.”
Recommending Collaboration with Social Networks: A Comparative Evaluation (2003) by David W. McDonald. “A Successive Pile Sort (SPS) [4, 29] technique was used to collect the second social network. In this technique, the name of every member in the group is written on a card. Participants sort the cards using a high level rubric supplied by the researcher. Each participant is free to interpret the rubric in her own way. The first sort results in a number of “piles” which are, in turn, sorted using the same rubric. The level of the sort at which individuals or groups are broken apart indicates the connection weight between the members. The connection weights are aggregated across all participants to create an edge weighted social network.” “Participants were challenged to create sorts with the rubric “who hangs out together.” This rubric was designed to reveal the social structure rather than work context structure at MSC. Motivating the SPS collection by asking “who hangs out together” was one way to consider the more sociable aspect of interaction at MSC. Each participant required between 45 and 90 minutes to sort 47 cards.”
PS: 6th July 2010. In Social Network Analysis the term Cognitive Social Structures refers to social networks, as perceived by the members of those networks (or others). What has been described above is different and could be referred to by the term Social Cognitive Structures: i.e. the social structures created by overlaps in people’s cognitive structures (i.e. their classifications and causal relationships).

One Response to “Participatory aggregation of qualitative information (PAQI)”
Patrick Lambe has emailed me the following useful question:
Hi Rick
Thanks for this – I’m intrigued by this, but I’m missing the last link… which is how insight is derived, and what kinds of insights are derived. This is something common to all network visualisation techniques I find… OK so there’s a map, so what? In social network analysis, for example, the map typically helps analysts/participants identify places in the network that look like they warrant further investigation – eg bottlenecks, disconnects, cliques. These can be either positive or negative forces, depending on the drivers behind that structure, and the overall context. There are different ways of undertaking that investigation to find the “story” behind the structure. I’m curious about where the sensemaking portion lies in the PAQI model – do you have examples of insights and how they are derived and then actioned? I understand this is a work in progress!
My response was as follows:
Hi Patrick
Good questions
I don’t think I can promise anyone automatic “insight” as a result of using PAQI or any other method. And I doubt if David would do so either re his use of Sensemaker.
Both are tools for providing better/different forms of description of large amounts of qualitative data (and much larger amounts in the case of Sensemaker)
With many forms of measurement and description it is useful to ask people what they expect to be found and then show them what is actually found, and to then discuss and learn from the difference. An NGO in India that I have been working with carried out a large baseline survey of capacities of CBOs in an HIV/AIDS program, which generated a set of performance scores. In response to my suggestion they asked the grantees supporting the CBOs what their expectations were about the CBOs’ scores, then they shared the actual scores, then discussed the differences. In a number of instances this led to agreement on how the survey instrument needed to be changed. I hope in other cases it led to agreement on how the CBOs needed to change!
On the PAQI web page I started with this assumption: If we are able to develop better representations of complex bodies of information then this will provide us with more informed choices about how to respond to the content of that information.
That is what David is doing: providing better representations of large amounts of qual data, less tainted by the researcher’s existing beliefs. I think he would then argue the need to explore the outliers, as well as any central tendency or “averages”. The ability to explore data that is aggregated but still has some structure is common to both David’s Sensemaker and PAQI.
Take the one example I presented, which was the result of a very quick inquiry with 5 staff members at the end of a workshop: their classification of 24 Indonesian districts where their project was working. If I had a chance to continue talking to them today I would start by asking them how many clusters they think might emerge from this analysis, and what they would be. After sharing the aggregated results and discussing any differences between expected and actual results, there could be two ways forward: asking what the implications are for (a) project design and activities, and (b) improved use of this PAQI method. On the former, I would love to know whether, given the existence of at least two main clusters of districts (two more can also be identified less distinctly), there is any difference in the project strategies being pursued in those areas. And if there is none, should there be a difference?
On the significance of measurement on its own, I often use an imagined situation where two people walk into a room each holding a tape measure. They both proceed to measure the dimensions of a large hole in the wall. One walks away happy, the other walks away unhappy. The first is an air conditioning installer, the other is a security expert. So what use is a tape measure (/network diagram)? It’s just numbers on an arbitrary scale (/a set of dots connected by lines). Well, it turns out it is important if we have prior expectations about what we want to see, but probably meaningless if we don’t. Theory and measurement are both needed.
Does this help?
PS: There are other ways of eliciting expectations prior to sharing results, with network diagrams. One easy method is to show network diagrams without visible labels on the nodes and ask participants to identify who is where. Then to make the labels visible.
By rick davies on Apr 2, 2010