When the Evaluation Plan Doesn’t Reflect the Context

By: Aga Khan Foundation

When a partnership offers an opportunity to improve an important value chain

Aga Khan Foundation in Mozambique started the MozaCaju project with USDA funding in late 2013/early 2014, as a subcontractor to TechnoServe. The 3-year project aimed to improve the value chain for cashew production and marketing in Mozambique, providing training and support for cashew producers in growing techniques, harvest and post-harvest improvements, and connections to processing and marketing.

The evaluation plan outlined in the contract was ambitious: it called for baseline, midline and endline evaluations, all conducted during the 3-year span, at the tail end of annual cashew sales campaigns. Furthermore, the evaluation design stipulated a rigorous quasi-experimental approach, with designation of control and beneficiary groups for comparison and attribution of outcomes to project activities. Such an intensive investment in evaluating a relatively short project exceeded what many non-US donors require in Mozambique and was new to AKF.

We accepted the donor’s plan, however, and the project got underway. Our team had good experience with farmers’ groups in the project area; the cashew project allowed us to leverage that experience with farmers to introduce techniques that would help them improve their production. It appeared to be a promising relationship for improving value chain connections. And in fact, farmer peer-to-peer adoption of the techniques introduced by the project spread faster than we expected.

Project profiling of beneficiaries took time, as did the rollout of project activities and the design of trainings tailored to smallholder farmers, mid-sized farms, and larger and even commercial farm operations. Because farmers were not randomly allocated to beneficiary and non-beneficiary groups at the beginning of the project, there was no scope for a truly experimental research design. However, we were still expected to base the project midline and endline evaluations on a beneficiary vs. control comparison, which required us to try to ‘match’ non-beneficiaries to our beneficiary sample on characteristics such as household size, number of productive cashew trees and whether farmers sold individually or in a group.

Because our intervention targeted specific administrative posts within a district, our non-beneficiaries were identified in adjacent administrative posts where the project had not targeted its activities.
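To make the ‘matching’ step concrete, here is a minimal sketch in Python of one common way to pair each beneficiary with a similar non-beneficiary. It is an illustration under stated assumptions, not the procedure the project actually used: the `Farmer` fields, the `match_controls` function, the greedy nearest-neighbour rule and the sample data are all hypothetical. It matches exactly on whether a farmer sells in a group, then picks the closest remaining candidate on household size and number of productive trees.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass(frozen=True)
class Farmer:
    # Hypothetical record holding the three matching traits named in the text.
    farmer_id: str
    household_size: int
    productive_trees: int
    sells_in_group: bool


def match_controls(beneficiaries, candidates):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Exact match on sells_in_group; nearest standardized distance on
    household_size and productive_trees.
    Returns {beneficiary_id: matched_control_id or None}.
    """
    everyone = list(beneficiaries) + list(candidates)
    # Standard deviations put both numeric traits on a comparable scale.
    # Fall back to 1.0 when a trait does not vary at all.
    sd_house = pstdev([f.household_size for f in everyone]) or 1.0
    sd_trees = pstdev([f.productive_trees for f in everyone]) or 1.0

    def distance(a, b):
        dh = (a.household_size - b.household_size) / sd_house
        dt = (a.productive_trees - b.productive_trees) / sd_trees
        return (dh * dh + dt * dt) ** 0.5

    available = list(candidates)
    matches = {}
    for b in beneficiaries:
        pool = [c for c in available if c.sells_in_group == b.sells_in_group]
        if not pool:
            # No comparable non-beneficiary is left for this farmer.
            matches[b.farmer_id] = None
            continue
        best = min(pool, key=lambda c: distance(b, c))
        matches[b.farmer_id] = best.farmer_id
        available.remove(best)  # each control is used at most once
    return matches


if __name__ == "__main__":
    # Made-up data: three beneficiaries but only two candidate controls.
    beneficiaries = [
        Farmer("B1", 5, 40, True),
        Farmer("B2", 8, 120, False),
        Farmer("B3", 4, 30, True),
    ]
    candidates = [
        Farmer("C1", 6, 45, True),
        Farmer("C2", 7, 110, False),
    ]
    print(match_controls(beneficiaries, candidates))
    # {'B1': 'C1', 'B2': 'C2', 'B3': None}
```

In this made-up sample the third beneficiary is left without a match. That is the kind of gap that appears when the pool of non-beneficiaries is small relative to the beneficiary group.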

In theory, this approach could have worked. In reality, it led to problems later.

AKF, across its many offices, has been grappling with the challenges of Monitoring and Evaluation, creating tools across the organization for better assessment of our impact against key indicators. This case provided a useful basis for examining those tools and shared measurements.

When the donor’s research methods pose a challenge

From the start of MozaCaju, the AKF project team faced difficulties implementing the quasi-experimental evaluation methods required by the donor. The M&E team lacked experience with this kind of survey design, and the program staff and prime grantee had accepted the design before AKF hired an M&E Director and project M&E Assistant. Additionally, the project planning documents did not provide for training the project team in quasi-experimental research methods. External consultants were hired to oversee the overall project studies, while AKF was responsible for the specific farmer survey. Upon reflection, we determined that while the evaluation design may be appropriate under certain conditions, few AKF teams have the staff, available time or capacity to implement such designs or adjust them in the field.

Compounding the technical challenge of the design was a communications challenge: as a subcontractor on the project, the AKF team didn’t have a direct relationship with the donor, so we couldn’t revisit the evaluation plan with them directly. That discussion might have averted later problems.

As noted above, the baseline survey did not distinguish between control and intervention groups because the survey took place before beneficiary selection. When we chose control and intervention groups for the midline and endline evaluations, we began to note a few problems. First, our project targeted a high proportion of the cashew farmers in the project districts, leaving a relatively small control population from which to draw a sample, and one that contained a lot of outliers (e.g. low or high production farmers with little interest in being part of the project). This made ‘matching’ rather difficult.

But we did the fieldwork, analyzed the data and found that, in some areas, the control group that we were able to create was dramatically outperforming our beneficiaries. We took a number of steps to try to figure out why this was the case, including re-verifying some of the data in the field, but we were still left with this finding. We eventually realized that when establishing a control group, we forgot to ask non-beneficiaries two very basic questions: 1) have you received any information or training from other farmers on good agricultural practices; and 2) are there other cashew interventions in this area?

What we found was: a) there is a strong network between cashew farmers, and a number of our beneficiaries were communicating the production techniques they had learned in our project to farmers who were not part of the project (‘contamination’ and ‘spillover’); and b) more important, another NGO was implementing a smaller project focused on some of the same interventions as ours, but also on the distribution of chemical pesticides, which ours did not cover. That project also included some of our farmers (which we didn’t realize, because farmers thought the two were the same project) but was heavily targeting some of the administrative posts where we weren’t working. Chemical pesticide turned out to be absolutely key to improving per-tree productivity, and we concluded that this was the likely explanation for the better results in our control group.

These two complicating factors, spillover and the existence of complementary interventions, made it extremely difficult to isolate the project’s effect when comparing the control and intervention groups.

As a result, the midline and endline report analysis showed the control group outperforming the intervention. The project team assessed the results and realized that survey participant selection had failed to reckon with those key contexts.
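The two screening questions described above could be applied to candidate control farmers before sampling. The following Python sketch shows one hedged way to do that. The `ScreeningResponse` fields and the `split_by_exposure` function are hypothetical names introduced for illustration, and the rule of setting aside anyone who answers yes to either question is an assumption, not a procedure taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResponse:
    # Hypothetical record of a candidate's answers to the two questions.
    farmer_id: str
    trained_by_other_farmers: bool      # question 1: information or training from peers
    other_cashew_project_in_area: bool  # question 2: another cashew intervention nearby


def split_by_exposure(responses):
    """Separate candidates into (unexposed_ids, exposed_ids).

    A candidate counts as exposed if they answer yes to either question,
    i.e. possible spillover or an overlapping intervention.
    """
    unexposed, exposed = [], []
    for r in responses:
        if r.trained_by_other_farmers or r.other_cashew_project_in_area:
            exposed.append(r.farmer_id)
        else:
            unexposed.append(r.farmer_id)
    return unexposed, exposed


if __name__ == "__main__":
    # Made-up answers for three candidate control farmers.
    answers = [
        ScreeningResponse("C1", False, False),
        ScreeningResponse("C2", True, False),
        ScreeningResponse("C3", False, True),
    ]
    unexposed, exposed = split_by_exposure(answers)
    print("usable as controls:", unexposed)
    print("set aside:", exposed)
```

If most candidates end up set aside, that is itself a signal that a clean control group may not exist in the area and that a different evaluation design deserves consideration.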

What lessons went on to inform other programming?

Lessons Learned

One lesson from this experience, in terms of M&E planning, is to recognize the capacities and limitations of your project team at the project’s outset, before accepting donor-driven evaluation procedures and designs. Where there is a relationship and a communication channel for adjusting the evaluation plan given a project’s duration or other tools available, pursue that conversation.

A second lesson is to account for the wider context of the project area or study area to determine whether a specific research design is truly feasible. Context will largely determine the extent to which a quasi-experimental design will be able to capture the project impact or whether alternative approaches would be preferable. For example, spillover effects between beneficiary and control groups can muddy the analysis by bringing into question the degree to which a true control group, isolated from project interventions, can be identified. External factors (such as overlapping and comparable government or other NGO projects in the same sector, implemented in the geographies where a control sample is to be selected) can also make a quasi-experimental approach of questionable value. As an alternative, mixed-method approaches that combine rigorous quantitative data collection among project beneficiaries with good qualitative research to understand the stories behind the numbers may be a better way to understand the impact of project activities.

A broader lesson for an international NGO is that project teams need better tools for M&E. This experience reinforced the need for AKF-wide standards for shared measurements, and for support to ensure teams have the capacity to measure the key indicators. AKF has been conducting a global exercise in M&E shared measurements for this purpose, developing tools for indicators at the global and country level, and standards for benchmarks.


Documenting program impact is important, but that documentation is only as good as the methods adopted and the skills of the people doing the evaluations. The methods for monitoring and evaluation of programs must be adapted to the program environment and team capacities. Where a project starts with a donor-driven evaluation plan that is not aligned with team capacities, the mismatch must be addressed early, with some combination of: renewed discussion with the donor to adjust the evaluation tools to the project setting and context; reinforcement of team capacity for the resulting adjustment; and continued sharing of M&E measurement experience, tools and standards.


Photo Credit: TechnoServe/Antonio Filippi de Castro