Program Level Monitoring, Evaluation, and Learning
A guide for systems design
LEAD Team
This document gives guidance to Oxfam America field staff about how to design program-level Monitoring, Evaluation and Learning (Strategic MEL) systems. It covers: 1) What is program-level MEL? 2) What decisions need to be made? 3) How can you document those decisions?

The guide is intended for experienced Program Coordinators and Deputy Regional Directors who already have MEL competence. It is not a beginner’s guide; nor is it a detailed, step-by-step “how to.” Instead, the guide describes the minimum elements needed to establish a useful programmatic MEL system. Experienced field staff can, we think, build such systems quite easily on their own, if the requirements are clear. We do not believe that establishing program-level MEL systems should require external assistance from LEAD.

The guide relies heavily on actual examples of MEL systems developed in Oxfam America field programs. These are available in the section called “Supplemental Guidance”. The use of concrete examples is a deliberate learning strategy: LEAD believes that if the elements of program MEL systems are clear, experienced field staff and partners can produce them. Examples give staff a concrete picture of what is “good enough” and, we hope, also allow staff to say, “I have a better idea, one that improves upon this.”

While seeking to be simple, direct, and brief, this guide is linked to a much wider body of advice, examples, and guidance for all of ROPE II. You can go to Padare by clicking on the underlined, blue hotlinks embedded in this document. The document you have in your hand is available on Padare – in modified and expanded form – at the Program MEL Guidance section. There, you will also find guidance on a number of other aspects of good monitoring, evaluation, and learning, ones that are essential precursors to being able to develop program-level MEL systems (such as Program Theory of Change, Impact Benchmark Narratives, Program Assumptions and Risks) or that link program-level MEL to more operational M&E (such as the Program Implementation Plan (PIP)). Most of this material is included in the document called “Supplemental Guidance” which accompanies these draft guidelines.

Finally: This guidance is a draft, a test. It will be field tested by SAMRO staff between August and October 2009, then revised and improved for general circulation in FY10.
Table of Contents
I. Introduction to Program Monitoring, Evaluation and Learning
II. The MEL Plan
III. Fundamental Elements of Program MEL Design
    Impact and outcome indicators
    Data collection for indicators (users, sources, frequency, and who collects)
    Impact and Learning Products and Calendar
    Budget Estimations
Introduction to Program Monitoring, Evaluation and Learning
The purpose of Program Monitoring, Evaluation and Learning (sometimes referred to as Strategic MEL or Program-level MEL [1]) is to provoke continuous learning and improvement through tracking of progress towards the program goal. A good system allows us to agree on both achievements and setbacks. It also allows partners and primary change agents to engage in these conversations. Effective program MEL leads to changes in program strategies, continuous improvement of our work, empowerment of partners and primary change agents, and credible evidence of Oxfam’s contribution to elimination of root causes of poverty and social injustice.

Program level monitoring, evaluation and learning systems – as opposed to project M&E – enable us to tell a story over time (10 – 15 years). The story is focused on the changes in the root causes of poverty and injustice that the program is accomplishing. Therefore, the system focuses strongly on understanding what is changing in the lives of specific primary change agents and is based on the Program Theory of Change [2] (See the EARO example on pp. 32-34 of the Supplemental Guidance).

Program-level MEL systems take for granted that complex social change processes are neither linear nor easily proven. They take for granted that lasting impacts on root causes of poverty will almost never be attributed to Oxfam alone. Effective Program-level MEL, therefore, focuses on plausible contribution (rather than attribution). Effective Program-level MEL allows us to test our core assumptions and hypothesis of how change happens. It allows us to constantly improve that hypothesis, too.
Table 1: Differentiating Levels of Monitoring, Evaluation and Learning

Program-level (Strategic): Program Level (Program Strategy Paper, PSP)
• Investigates and reports on outcome and emerging evidence of impact level changes
• Focused on knowledge generation and applied learning leading to improved impacts over time
• Tests underlying program assumptions and potential risks to primary change agents

Project/Grant-level (Operational): Project and Grant Level (Program Implementation Plan, PIP)
• Reports on outputs, activities and outcomes
• Focused on accountability to delivering on commitments and attribution of results
• Confirms or not progress and achievement of activities
Program-level MEL systems: Design Principles
1. Rights-based Program MEL is different from mainstream, project M&E. A program-level MEL system cannot operate outside of the needs, interests, and capabilities of program partners and primary change agents. They must have a strong influence on definition of program goals, on the indicators that will demonstrate progress, and on interpretation and analysis of monitoring and evaluation data. Rights-based MEL systems use M&E itself to empower and build capacity of partners and primary change agents. Stakeholder engagement is simply not optional.
2. Program-level MEL systems need to be simple enough to generate agreement among all partners and key stakeholders, rigorous enough to tell a credible story of impact, and useful enough to motivate iterative engagement and critical reflection and learning over time. These three aims – simplicity, rigor, and engagement – may sometimes conflict with one another.
3. Each program has different aims, contexts, cultures, and partners. This means different monitoring, evaluation and learning plans, indicators, techniques, and approaches. Each program’s MEL system will be unique.
4. Programs, and Program MEL systems, are ideally co-designed and managed by a coalition of strategic partners – including representatives of primary change agents – who share a commitment to work together toward a common impact goal. (See ROPE II for more on Program Working Groups.)
5. Every program is expected to have an ongoing relationship with a local/regional research institute. The purpose of this relationship is to: 1) provide mechanisms through which program staff are challenged about strategies and monitoring and evaluation data, 2) channel to the program sophisticated, academic, or expert thinking and good practices from elsewhere, and 3) provide objective, external impact research periodically. This external research supplements the ongoing, internal monitoring data that programs will collect and analyze regularly.

What we identify as minimum components and requirements should offer enough guidance to program staff and partners to develop a monitoring, evaluation and learning plan that is appropriate and useful to them.
• Program MEL Plan
• Program theory of change
• Impact indicators and benchmarks
• Explicit program assumptions and risks analysis (based on the program theory of change)
• Defined Program MEL products and processes
• Program Implementation Plan

[1] The terms Strategic MEL and Program MEL are synonymous in the OA context. Moving forward, we will use the term Program MEL to describe OA’s Monitoring, Evaluation and Learning systems at the program-level.
[2] The Program MEL System is designed to test the Program Theory of Change which is the central focus of the MEL system. Thus, we recommend carefully revisiting the Program Theory of Change as a pre-requisite to developing the Program MEL system.
II. Program MEL Plan
The Program MEL plan is the most important product for documenting MEL choices. It will serve to guide strategies, grant making, research plans, and learning events throughout the life of the program. It sets down the program’s impact and outcome indicators, data collection plans, publication schedules, methods, responsibilities for MEL, learning moments and events, and budget. See the next page for a sample format.
Program-level MEL Plan
This is an example. You should adapt and modify it to your context. (Also see HARO and EARO MEL plans in the Supplemental Guidance pp. 15-20, 35-37)
III. Fundamental Elements of Program Level MEL
Deciding what and how to study and assess impact is one of the most important components of program monitoring, evaluation and learning systems. Who collects what information, and how data is collected, analyzed, and used can reinforce power inequalities in a community or, if done smartly, can be a force for creating greater equity. These questions need special attention in a rights based approach. We need to make sure that program level MEL research respects people's time. We need to be careful about protecting people and households that may be providing data that powerful actors can exploit. We need to ensure that data and analysis get back to communities involved in the research. Involving local actors in defining what to research, and in conducting research, can help generate knowledge in line with advocacy objectives of our rights based programming.

When many stakeholders are consulted and agree about what information to collect and how, this is the first step in creating real learning and shared knowledge. Successful consensus on what information to collect and on the data collection process can lead to learning events and products with rich, comparable information across a wide range of contexts.
1. Program Theory of Change
The theory of change is an approach used to design strategic programs and their evaluative systems. A good program theory of change makes the underlying program hypothesis of how change happens explicit, which enables program staff and stakeholders to probe the program assumptions and take stock of the changes expected over the longer term. A program theory of change establishes a road map of the program's overall design, showing how the program design will achieve impact in the lives of people over a 10 – 15 year time horizon. Program design is made evaluable by linking each key element of the change system to specific monitoring and evaluation processes that generate evidence-based assessments and provide the basis of ongoing learning and adjustment.

It is important to understand that a program theory of change identifies all of the social changes necessary to produce impact, whether or not Oxfam and partners are contributing to them. This enables OA staff and partners to be strategic about how their work fits into the greater social change agenda. This also allows for external evaluators and research institutes to further analyze the program's change hypothesis, assess progress (or set-backs) and push Oxfam staff and partners to continually learn about how change IS happening and what that means for the program.

For the purposes of designing and learning from our Program MEL systems, it is necessary to have an explicit theory of change for the program. For more about a program theory of change and how to make it explicit and evaluable, please refer to the guidance materials on Padare (ROPEpedia). Examples of different OA program theories of change can also be referenced there.
2. Impact and Outcome Indicator Selection
Indicators are variables that assess progress, and make it possible, through data collection, to identify long term trends. Indicators can be quantitative, qualitative, or a combination of both. For program level MEL, we want to focus on program impact indicators, and intermediate outcome indicators. Both types cut across all program interventions, and may involve multiple partners and stakeholders in collecting and analyzing data.
The following is a typology of indicators relevant to program level MEL:
• Impact indicators: Measures of long term significant changes in the lives and social empowerment of primary change agents. Impact on root causes is always measured by a group of outcome indicators, not by any single indicator.
• Outcome indicators: Outcome indicators assess contribution to changes in relationships, attitudes, behaviors, ideas, policies, values, practices, institutions, human conditions, and enabling environments as a direct or indirect consequence of a set of actions and their outputs.
• Process indicators: These indicators measure effectiveness of interventions (inputs, activities, outputs), such as timeliness, coordination, quality of research product, people’s perceptions of process, satisfaction with training, satisfaction with partners’ approach, etc. Process indicators are most often assessed at the project level. However, one or two high level process indicators are good at the program level, for example, community satisfaction rating of community based organizations’ effectiveness across the coverage area.

A unique aspect of Oxfam America’s program strategies is the identification of 3, 6, and 9 year impact benchmarks. Impact, outcome, and process indicators are linked to these benchmarks to provide measurable and time-bound progress reports.

Purpose/Function of Impact & Outcome Indicators
Indicators ensure that certain minimum information is collected with regularity over the course of a program. They are key to making evaluation iterative: ensuring that similar, comparable data is collected at different points of time. Without a set of clearly defined indicators as a framework, evaluation becomes more ad hoc and less able to capture long term changes systematically.

Core Elements of the Product
1. Each program should have a clear set of four to eight long term impact indicators related to the program goal and impact benchmark narrative.
Oxfam assumes that long term change in people’s lives is sustainable when people themselves are able to maintain and further positive change through the exercising of their rights, and through an environment that upholds their rights.
A good set of program impact indicators makes it possible to assess at least three of the four major dimensions of rights based impact below:
1. Change in the conditions of people’s lives (e.g., less conflict, better compensation, better health, economic assets, etc.)
2. Growing agency of primary change agents (knowledge, capacity, internal mechanisms, leadership, origination of ideas)
3. Change in the opportunity structures of the environment (policies, structures, codes, norms, laws, procedures, budgets)
4. Change in social behavior (including the enforcement of policies, changes in normative behavior, change in how people are consulted by local governments, change in actions of mining companies, etc.)

The above are not strict categories, and indicators will overlap. You can see examples of indicator sets in the Supplemental Guidance. Examples are available for the EARO Extractives Program, HARO Water Program, CAMEXCA Gender Based Violence Program, and WARO Extractives Industry Program.
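The two numeric rules stated so far (four to eight impact indicators, covering at least three of the four dimensions) can be checked mechanically once a draft list exists. The following is a minimal sketch only, not an Oxfam tool: the class name, the function name, the short dimension labels, and the tagging of each example indicator to a dimension are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The four dimensions of rights based impact listed above (short labels are assumed).
DIMENSIONS = frozenset({"conditions", "agency", "opportunity_structures", "social_behavior"})


@dataclass(frozen=True)
class ImpactIndicator:
    name: str
    dimensions: frozenset  # which of DIMENSIONS this indicator speaks to


def review_indicator_set(indicators):
    """Return a list of problems; an empty list means the set passes both checks."""
    problems = []
    if not 4 <= len(indicators) <= 8:
        problems.append(f"expected 4 to 8 impact indicators, found {len(indicators)}")
    covered = set()
    for indicator in indicators:
        unknown = indicator.dimensions - DIMENSIONS
        if unknown:
            problems.append(f"{indicator.name}: unknown dimension(s) {sorted(unknown)}")
        covered |= indicator.dimensions & DIMENSIONS
    if len(covered) < 3:
        problems.append(f"only {len(covered)} of 4 dimensions covered: {sorted(covered)}")
    return problems


# Hypothetical tagging of four illustrative indicators; the dimension labels are assumptions.
example_set = [
    ImpactIndicator("Provincial poverty measure compared to national average",
                    frozenset({"conditions"})),
    ImpactIndicator("% of population that understands basic rights information",
                    frozenset({"agency"})),
    ImpactIndicator("% of government revenue reinvested in affected areas",
                    frozenset({"opportunity_structures"})),
    ImpactIndicator("Quality of community inclusion in key decision points",
                    frozenset({"social_behavior", "agency"})),
]

print(review_indicator_set(example_set))
```

Because the categories are not strict, an indicator may carry more than one dimension tag. A check like this only flags gaps in coverage; judging whether an indicator is meaningful remains a conversation with partners and primary change agents.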
Qualities of a “good” indicator
If an indicator is meant to make it possible to track change, then a “good” indicator is one that makes this as easy and valid as possible:
Note: An indicator does not need to contain a quantity. In fact, certain indicators can only be studied through qualitative approaches. Qualitative indicators are also assessable, relevant, unambiguous and specific. Example: “Community leaders’ perceptions of most significant changes in women’s safety in their neighborhood.”

2. Indicators need to be linked to impact benchmark narratives. The table below provides an example of both long term impact indicators, and important benchmark outcome indicators derived from the benchmark narratives:

Impact indicators – Duration of program
• % of indigenous population that understands basic information that active citizens need to know about rights relative to mining
• Comparison of change in provincial level poverty measure compared to national average
• % of government revenue from EI reinvested in affected areas
• Quality of community inclusion in key decision points about mining

Benchmark Statement Related Outcomes
• By 2012, the existence of a protocol adopted by parliament for the better application of the Fair Compensation Law.
• By 2012, quality of the mining codes in Mali, Senegal and Burkina Faso in terms of harmonization with the regional mining code passed by ECOWAS.

Category of Change (benchmark outcomes): Opportunity Structure; Social Relations

Relationship of Benchmark Outcomes to Above Long-Term Impact Indicators
The benchmark-related outcomes, if accomplished, will provide important evidence that progress is being made toward the four impact indicators listed in the top half of the table. At the same time, they are necessary precursors to the achievement of long-term impact indicators (such as poverty levels in a province).
Minimal Process Elements
Indicators can be developed in one of two ways: Oxfam program staff can initially develop indicators through an internal exercise, then validate and adjust with program stakeholders. Alternatively, indicators can be developed jointly with partners, primary change agents, and/or the program working group.

Typically, the indicator selection process begins with a brainstorm of many possible indicators, which will result in a very ambitious list. This list needs to be pared down to the four to eight core impact indicators that the program staff, partners, and possibly the PWG want to follow throughout the life of the program.

The following questions are helpful in defining four to eight core impact indicators after reviewing the program goal, strategic objectives, benchmark narratives and theory of change:
• What pieces of evidence would be most important to monitor in relation to them?
• Is this indicator something we reasonably expect to affect over the next ten years?
• Will this indicator be an unambiguous measure for the change the program wants to see?
• Is the unit of analysis clear?
• Do the indicators capture the appropriate level of scale/scope?
• Do they cover the four important domains of change needed for lasting impacts on root causes (livelihood conditions, agency, opportunity structures, social relations)?
Note: Even if indicators are first developed internally, it is important that they are validated with major program stakeholders, ideally including representatives of primary change agents. Involving partners and stakeholders also creates a sense of commitment to learn and participate in the collection of information on indicators over time. After agreeing on the long term impact indicator set, revisit the PSP benchmark statements and ensure that identified outcomes align with them and have a logical flow.

Impact and Outcome Indicator Sets for Programs: Examples
HARO - Water Program – pp. 13-14 in Supplemental Guidance
EARO - Extractives Program – pp. 32-34 in Supplemental Guidance
WARO - Extractives Program – p. 55 in Supplemental Guidance
3. Designing data collection for indicators
Designing data collection is the process by which program stakeholders reflect upon who collects information, how they will do it, and how data will be used. Once analyzed, data helps us understand what progress we are making and what obstacles we are encountering, and generates insights and positions for policy advocacy. Data collection methods are tailored to the indicators and questions being investigated. They depend, too, on available resources, skills of staff and partners, time frame, scope, and level of rigor needed. This is in addition to the more well-known criteria of validity, reliability, and usefulness of the end data.

Purpose/Function of Data Collection
The purpose of data collection and analysis is to gauge progress against objectives, help program participants learn, continuously improve strategies, and provide the means for effective public accountability.

Core Elements of Data Collection Design
The following elements need to be understood as part of data collection design for indicators:
1. Existing information – For each indicator, provide any known data about it, and include dates. Whenever possible, we should use secondary data (data collected by others) because it saves money and time.
2. Alternative Sources of Information – Provide details on feasible sources of data regarding indicators not covered in secondary sources. For example, for the indicator “% of valid complaints/claims coming from villages that are positively acted upon by government/companies”, this column might include details such as “official complaints registered in public hearings, local courts, with the EPA, with mine ombudsman, or local officials. Records of complaints filed are available from CSO and local monitoring groups.”
3. Who collects the data – In this column describe whether the evidence is collected or compiled internally by Oxfam or partner staff or externally by a research institution or evaluation consultant. Do local communities help collect information?
4. How to collect data – Provide initial ideas about how to study this variable. This includes information about sampling, i.e. how sites or individuals are selected for study (all communities working directly with partner, random, stratified, or purposeful selection, opportunistic). Also provide an idea of scope: approximately how many sites/individuals might be involved, covering what area/population. For example: “Focus groups in ~ 10 purposely selected villages, with men and women separately, across partner coverage area.” Finally, provide ideas about the methods that might be used. Is it participatory action research in a set of purposively chosen sites? A household survey? Is it open interviews with select key policy makers? Is it a national phone survey? It is okay to list more than one method, especially where it is important to triangulate data (looking across different data sets of the same indicator).
5. How often – Is the information collected once a year? Is it compiled on an ongoing basis from weekly logs? Is this data from occasional events that are documented as they happen, such as results from lobby visits?
6. For what product – How will the compiled information be aggregated? For example, in a database? In an annual written program report? In a map?
7. Learning events – Where and when might results be shared?

Minimal Process Elements
One way to organize the elements of a data collection process is the use of a table (see pp. of the Supplemental Guidance for examples). The table breaks down data collection into its components, including: potential information sources for indicators, data collection methods, ideas about who collects the information, how often, for what learning product, and for what reflection event(s). The table is an internal intermediary step before finalizing a Program MEL plan. To fill out the table, all impact indicators and benchmark indicators (ideally six to ten total) are listed in the left column. The headings above are listed across the top row, one per column. Completing this table is a collective exercise involving partners and program stakeholders (ideally the Program Working Group).
Some internal work can be done by Oxfam program staff before sharing with program stakeholders if desired, as long as this is validated by the wider program working group. In some cases, where it is difficult to bring together a program working group, conversations on the table can happen in smaller groups, or one on one between Oxfam and individual stakeholders.

We recommend completing the table all at once in a single workshop. The completion of the table requires approximately one day of plenary and small group work. Once the columns have been described and discussed by a facilitator, a useful process is to complete one indicator collectively in plenary, and then divide into small groups, each filling out no more than two indicators. Groups are carefully assigned indicators according to relevance and expertise. Each group presents their results for discussion and validation in plenary. By the end of the day the table is complete, and there is consensus on contents and responsibilities.

A data collection table is also a way to verify the feasibility of Program MEL. It is likely that it will become apparent that the data collection table is too ambitious and requires more time and resources than available. The next conversation is about simplifying and prioritizing the indicators and data collection. Once the table feels realistic, it can be used to write the Program MEL plan.

Examples from the Regions
HARO - Water Program example of Data Collection for Indicators pp. 15-20 in Supplemental Guidance
EARO - Extractives Program example of Data Collection for Indicators pp. 35-37 in Supplemental Guidance
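For teams that keep the data collection table in a spreadsheet or script, one row per indicator with the seven columns described in this section might look like the sketch below. This is an illustration under stated assumptions, not a prescribed format: the class, field, and function names are hypothetical, and the example row reuses the complaints indicator and focus group wording quoted earlier in this section.

```python
import math
from dataclasses import dataclass, fields


@dataclass
class DataCollectionRow:
    """One row of the data collection table: an indicator plus the seven columns."""
    indicator: str
    existing_information: str = ""
    alternative_sources: str = ""
    who_collects: str = ""
    how_to_collect: str = ""
    how_often: str = ""
    for_what_product: str = ""
    learning_events: str = ""


def empty_columns(row):
    """Names of the columns that are still blank for this indicator."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


def small_groups_needed(total_indicators, per_group=2):
    """Groups needed after one indicator has been completed collectively in plenary."""
    remaining = max(total_indicators - 1, 0)
    return math.ceil(remaining / per_group)


row = DataCollectionRow(
    indicator="% of valid complaints/claims coming from villages that are "
              "positively acted upon by government/companies",
    alternative_sources="Official complaints registered in public hearings, local courts, "
                        "with the EPA; records from CSO and local monitoring groups",
    how_to_collect="Focus groups in ~10 purposely selected villages, "
                   "with men and women separately",
)

print(empty_columns(row))      # columns the workshop still has to fill in
print(small_groups_needed(8))  # groups needed for an eight-indicator table
```

Listing the blank columns gives the facilitator a quick view of what remains before the table can be called complete, and the group count follows the guidance that each small group takes no more than two indicators.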
4. Impact & Learning Products and Calendar
Once we have a clear idea of impacts desired, indicators of that impact, and data collection methods and roles, we need a plan for analytic, advocacy, and learning products and processes. The calendar and nature of products will change as the program environment and strategies change. At the same time, a 10 – 15 year indicative plan is more likely to be realized in practice if envisioned in the design phase.

Purpose/Function
The planned production of a series of impact and learning products helps us:
▪ track changes and the plausible contribution of program interventions to impacts on the root causes of poverty and social injustice;
▪ improve program strategies and tactics, based on objective evidence;
▪ create better, more evidence-based policy advocacy products, and offer a concrete mechanism through which such products are identified and budgeted for in the planning cycle;
▪ influence the behavior of other actors, as successful practices are proven and information about them disseminated in local, national, regional, and even global marketplaces of ideas;
▪ bring together additional stakeholder and donor support for our work as Oxfam knowledge products become trusted and reliable summaries of good practice, and well-founded ideas for policy improvement.
Publication and Learning Event Calendar
From our strategic MEL piloting experience, we have found that the development of a production calendar of impact and learning products concretizes our commitment to tell a story of impact over time, pulling from different elements of our program-level MEL system. We expect that a publications calendar established in year one would be reviewed and reaffirmed or amended annually, or at a minimum every three years. The calendar also informs budgeting for programs.

Examples from the Regions
EARO Publications Calendar p. 37 in Supplemental Guidance
5. Program MEL Budgeting
ROPE II stipulates that 7% of program budgets be allocated to MEL. This means that for a program of $600,000 per year, an annual MEL budget of about $42,000 is desirable. This should cover all monitoring, evaluation and learning activities above and beyond specific grant requirements. This amount, therefore, covers things like special studies on particular indicators, learning events (conferences, meetings, reflections), additional research on context and power, policy research, etc. This amount is also meant to support the costs of monitoring visits and activities by Oxfam staff.

Purpose and Function of a MEL budget
Explicit, multi-year budgeting is a reality check on MEL plans. Such a budget also allows staff to make smarter choices about including MEL activities in restricted grant proposals. It also makes clear what kinds of impact research cannot be done, given resource limitations. Ultimately, an effective MEL budget is meant to help Oxfam and partners make better choices between MEL methods and available resources.

Minimum Process Considerations
Budgeting for program level MEL is an iterative process. A useful long-term MEL budget will take several days to develop, and may require that staff have access to people – researchers or MEL specialists – with technical knowledge about data gathering and analysis methods. Such technical knowledge is particularly useful for insights about what different methods might cost, and about lower or higher cost options.

The first step is to calculate 7% of your current program budget, over the next three years. This reveals the funds you have for MEL. [3] Next, go through your program-level indicators one by one, remind yourself of the method(s) you’ve settled on for collecting data against those indicators, and remind yourself of how often data will be gathered. Estimate the yearly cost for gathering data against the indicators. Note: normally, some indicators are only investigated every 3-5 years, since change against them is not thought to be feasible in shorter time periods. In this case, however, you still need to identify which year a particular study will take place. Your MEL budget, therefore, may fluctuate from year to year, but over time should settle at the 7% budget target.

In estimating costs, remember to include all costs: consultants, field travel, interns, temporary data analysts, workshops, conferences, production of reports/studies, etc. The 7% budget guide is meant to support formal process and impact evaluations – led by external researchers – every 3-5 years. Normally, after estimating for the first time the budget for data collection, analysis, and knowledge product and learning events in your Program MEL, you will find that you have been overambitious.
You then need to: a) identify more cost-effective methods, b) identify different indicators that are easier to assess, or c) seek additional resources for the MEL system. The budget figures for your MEL system should, when finalized, be included in column 4 of the MEL plan (see page 6 above).

Examples from Regions
EARO EI MEL plan pp. 35-37 in Supplemental Guidance
HARO Water MEL plan pp. 15-20 in Supplemental Guidance
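The budgeting steps in this section (take 7% of the program budget over three years, cost each year's planned MEL activities, then compare) can be worked through as simple arithmetic. The sketch below is a hedged illustration: the function name is hypothetical, the program budget uses the $600,000 per year figure from the text, and every cost line is an invented placeholder rather than a real estimate.

```python
def mel_budget_check(program_budget_by_year, planned_costs_by_year, share_percent=7):
    """Compare planned MEL costs with the share of the program budget set aside for MEL."""
    per_year = {}
    for year, budget in program_budget_by_year.items():
        available = budget * share_percent / 100
        planned = sum(planned_costs_by_year.get(year, {}).values())
        per_year[year] = {"available": available, "planned": planned}
    total_available = sum(v["available"] for v in per_year.values())
    total_planned = sum(v["planned"] for v in per_year.values())
    return {
        "per_year": per_year,
        "total_available": total_available,
        "total_planned": total_planned,
        # A positive shortfall means the plan costs more than the 7% share provides.
        "shortfall": max(total_planned - total_available, 0),
    }


# Program budget follows the $600,000 per year example; all cost lines are hypothetical.
program_budget = {"year 1": 600_000, "year 2": 600_000, "year 3": 600_000}
planned_costs = {
    "year 1": {"monitoring visits": 15_000, "learning event": 10_000},
    "year 2": {"monitoring visits": 15_000, "learning event": 10_000},
    "year 3": {"monitoring visits": 15_000, "learning event": 10_000,
               "external evaluation": 60_000},
}

result = mel_budget_check(program_budget, planned_costs)
print(result["total_available"], result["total_planned"], result["shortfall"])
```

In this made-up case the yearly spending is uneven because the external evaluation falls in year 3, and the three-year plan exceeds the 7% share. That is the point at which the choice between cheaper methods, simpler indicators, or additional resources comes up.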
[3] Note that the resources for program level baseline studies are not considered part of the core MEL budget. Separate guidance on baselines is available on Padare and from LEAD staff upon request. Currently baseline resources are accorded to LEAD, and LEAD then works with program staff and local researchers on baseline TORs, implementation, and analysis.