NZAID 2008 Evaluations and Reviews: Annual Report on Quality, 2009

Prepared by Miranda Cahn, Evaluation Advisor, Strategy, Advisory and Evaluation Group, NZAID, Wellington, August 2009.

Executive Summary


The New Zealand Agency for International Development (NZAID) is committed to improving evaluative activity¹, including evaluations and reviews. Since 2005 NZAID has undertaken annual desk studies of the evaluations and reviews completed by NZAID during the previous calendar year. This 2009 study assesses the quality of 29 NZAID-commissioned evaluations and reviews that were submitted to the NZAID Evaluation and Review Committee (ERC) during 2008, and their associated Terms of Reference (TOR). The study identifies areas where quality is of a high standard, and areas where improvement is needed. Recommendations are made on how improvements to NZAID-commissioned evaluations and reviews could be facilitated.

The objectives of the study are to:

• assess the quality of the TOR with reference to the NZAID Guidelines on Developing TOR for Reviews and Evaluations

• assess the quality of the NZAID 2008 evaluations and reviews with reference to the NZAID Evaluation Policy, relevant NZAID Guidelines and the Evaluation Quality Standards of the Development Assistance Committee (DAC) of the Organisation for Economic Co-operation and Development

• identify, describe and discuss key quality aspects of the TOR and evaluation and review reports that were of a high standard and those that should be improved in future.


This study used a methodology similar to that of the 2008 study² so that the quality of NZAID evaluations and reviews (and their associated TOR) in 2007 and 2008 could be compared. The 2008 TOR were first assessed on individual quality criteria, guided by the NZAID Guideline for Developing Terms of Reference for Reviews and Evaluations. A matrix was developed for this, similar to that used in the 2008 study. The evaluation and review reports were then assessed in the same way against the DAC Evaluation Quality Standards, the NZAID Evaluation Policy Statement and NZAID evaluation guidelines. Overall assessments were assigned to each TOR, and to each review or evaluation. In assigning overall assessments, a subjective weighting was given to each quality criterion.

¹ The term ‘evaluative activity’ is used by NZAID to refer to a range of evaluation processes and includes analyses conducted for planning, monitoring, review and assessment of ongoing or completed development activities.


Terms of Reference (TOR):

There has been an improvement in the quality of the TOR from 2007 to 2008. The area where most improvement was evident in the TOR was in the methodology sections. Improvement was also evident in the 2008 TOR in the:

• description of purpose and rationale

• request for analysis of value for money

• description of the scope of the evaluations/reviews

• outputs and reporting section, with the types of output requested broadened to include feedback workshops and other formats

• instructions for feeding back the findings of the evaluation/review to stakeholders

• description of the management and oversight of the evaluation.

Despite these improvements, this 2009 study identified the following sections of the TOR as key areas where further work would raise the quality of the NZAID TOR for evaluations and reviews: objectives, and related evaluation questions; scope; methodology; value for money; outputs and reporting; management and governance. The NZAID Guideline for Developing TOR for Reviews and Evaluations has recently been revised and includes guidance on all of these aspects.

Evaluations and Reviews:

Overall, the quality of evaluations and reviews improved from 2007 to 2008 according to the overall assessments allocated for the purpose of this study³. Forty-one percent of the 2008 evaluations and reviews were assessed as ‘satisfactory or good in all or many respects’ (highest rank) compared with 22 percent in 2007. Just ten percent of 2008 evaluations and reviews were assessed as ‘not satisfactory’ (lowest rank) compared with 28 percent in 2007. Around half of evaluations and reviews were assessed as ‘satisfactory or good in some respects’ (medium rank) in both years.

The 2008 evaluations and reviews were mostly useful and relevant (as in 2007), with findings that met the TOR and recommendations that flowed logically from the findings. The background and context sections of the reports were generally satisfactory in both years. The reviews and evaluations improved considerably between 2007 and 2008 in the following areas:

• reporting

• the way that gender issues had been incorporated into the evaluations and reviews

• analysis of value for money.

³ The evaluations and reviews were assessed by the author of the study according to whether they were ‘satisfactory or good in all or many respects’ (highest rank), ‘satisfactory or good in some respects’ (medium rank) or ‘not satisfactory’ (lowest rank). The process for allocating the assessment is described in detail in the report.

However, as in 2007, confidence in the validity and reliability of the 2008 reports was often compromised by methodologies not being clearly described, and by some other issues that are noted below. Furthermore, there was no improvement between 2007 and 2008 in the way that the objectives of the evaluation or review were described.

The following aspects were identified in this study as key areas where improvement would greatly enhance the quality of the NZAID evaluations and reviews.

• Rationale and purpose of the evaluation or review: reports need to reiterate the rationale and purpose of the evaluation or review in order that sense can be made of the report. This was not done in all the reports.

• Scope of evaluation or review: as with the scope section in the TOR, the scope section in the reports needs to be improved.

• Methodology: better description of information needs (related to evaluation questions); sources of information; data collection methods; data analysis; how NZAID guiding principles have been included in the review or evaluation; and how crosscutting issues have been included. Better description of participatory processes, and moving (where appropriate) from consultative to collaborative and collegiate participatory approaches.

• Description of how ethical issues had been considered.

• Lines of evidence.

• Analysis of value for money.

• Inclusion of crosscutting and mainstreamed issues in a systematic way.

• Reporting: improvement in areas such as section sequencing; writing style; reducing typographical errors; improving executive summaries; better contents pages; and overall, ensuring the reports adhere to the NZAID Guideline on the Structure of Evaluation and Review Reports.

Most of these areas of potential improvement have also been identified in previous annual studies of reviews and evaluations, and some aspects have improved over the years.

The study identified a relationship between the quality of TOR and the quality of evaluations and reviews.

Summary of key lessons learned:

1. Despite improvements between 2007 and 2008, there is room for further improvement in TORs, and evaluations and reviews.

2. The quality of the TOR influences the quality of the evaluation or review.

3. The evaluation or review report is the main output of the evaluation that will be available to, and used by, various stakeholders. As such, the reports need to reflect all aspects of the evaluation or review.

4. Contracting evaluators with evaluative experience and skills is important for high quality evaluations and reviews.

5. Submissions to the ERC were useful in confirming the extent to which evaluations and reviews met the TOR objectives and answered the TOR evaluation questions, whether the findings were useful, and whether the recommendations and lessons learned flowed from the findings. However, most submissions were not useful in confirming the extent to which reviews and evaluations met the TOR methodology, or their quality.

6. A participatory approach to this study in the future may provide valuable opportunities for learning on how quality could be improved.

In order that NZAID learns from the evaluations and reviews it commissions, and further improves evaluations and reviews, it is recommended that:

1. The NZAID SAEG Evaluation, Research and Monitoring (ERM) Team continue to develop processes such as training, advice, guidelines, improved learning from ERC submissions, and support for programme staff, ensuring that the areas for improvement identified in this report continue to be addressed.

2. Evaluation advisors emphasise to NZAID programme and SAEG staff, and contractors, the importance of reports reflecting all aspects of the review or evaluation including, for example, methodology, ethics, principles guiding the review or evaluation, and how these were implemented.

3. An experienced and competent evaluator is included in evaluation and review teams.

4. Peer reviews and appraisals of evaluations and reviews assess the extent to which the reports have described and met ‘process’ aspects of the TOR (e.g. methodology) as well as the extent to which the purpose and objectives of the review/evaluation have been met (where the emphasis is at present).

5. NZAID Programme staff continue to seek advice from the Evaluation, Research and Monitoring (ERM) Team, identify examples of good evaluative practice in ERC submissions (including where processes have been robust and findings are reliable), and provide feedback on which training and advice provided to them by the ERM Team has worked and which has not.

6. The NZAID SAEG ERM Team consider a more participatory and inclusive approach to this study for 2010.
