Step 2: Develop the evaluation brief
An evaluation brief helps gain agreement on an evaluation and can form the basis of a Request for Tender (RFT). Either can be used to commission an external evaluation or to develop agreements for an internal evaluation.
The evaluation brief can be developed in the program design phase. It is the basis for developing the evaluation design.
A brief for a program evaluation sets out:
- purpose of evaluation — formative or summative
- type of evaluation needed — process, outcome or economic
- scope and focus of the evaluation
- key stakeholders
- key evaluation questions
- what is already known about the program
- reporting and communication
- the balance of internal and/or external evaluation
- an evaluation strategy (for large programs)
- the investment in the evaluation
- governance mechanisms and stakeholder engagement strategy.
Considerations for the evaluation brief include:
- A program with external funding may have specific requirements of the evaluation focus, methods, timing and scale.
- High-profile programs or those with significant risks may need more extensive evaluation to find problems early.
- Pilot initiatives are likely to need more extensive evaluation to provide information not just on whether they work, but how they work, so they can be replicated or scaled up.
- A program with multiple stakeholders may need more resourcing to support their involvement in negotiating the evaluation focus and methods and communicating findings.
The details and scope will differ from evaluation to evaluation.
For larger evaluations, you could also begin developing the evaluation design at this stage, so that an outline of the evaluation design can be included in the brief. This is particularly important when the evaluation brief is developed during the program design phase.
The evaluation brief is essentially the same as the program evaluation plan in the NSW Government Program Evaluation Guidelines. It may also be called the terms of reference for the evaluation.
Purpose of evaluation – formative or summative
The starting question for planning a program evaluation is "Why do this evaluation?"
The 2 main evaluation purposes are:
- Formative evaluation for program improvement, learning and decisions about incremental changes.
- Summative evaluation for accountability and decisions about whether or not to continue or expand a program.
Formative and summative evaluations may use some of the same evaluation methods.
The classic comparison, by Professor Robert Stake, is "When the cook tastes the soup, that's formative; when the customer tastes it, that's summative".
Formative evaluation
Formative evaluation is conducted to inform decisions about improvement. It can provide information on how the program might be developed (for new programs) or improved (for both new and existing programs). It is often done during program implementation to inform ongoing improvement, usually for an internal audience. Formative evaluations use process evaluation but can also include outcome evaluation, particularly to assess interim outcomes.
Summative evaluation
Summative evaluation informs decisions about continuing, terminating or expanding a program. It is often conducted after a program is completed (or well underway) to present an assessment to an external audience.
Although a summative evaluation generally reports once the program has been running long enough to produce results, it should be initiated during the program design phase. Summative evaluations often use outcome evaluation and economic evaluation, and can also use process evaluation, especially where there are concerns or risks around program processes.
The purpose of a program evaluation will inform (and be informed by):
- the audience needs
- reporting requirements
- intended users and uses.
It will also be shaped by program characteristics including:
- significance to government, size of investment, risks, sensitivities and needs for decision
- the stage and maturity of program implementation
- the readiness of the program for evaluation including the extent and quality of administrative data.
Evaluations may be required by legislation or policy. Each agency within the NSW Government will have a rolling 12-month evaluation schedule, which must be prepared and submitted to the Expenditure Review Committee (ERC) for approval. Schedules should include the following items (an illustrative layout follows this list):
- a list of programs planned for evaluation and review, and their expected completion date
- who will evaluate or review listed programs
- the governance processes for the schedule, including internal monitoring and reporting
- when the schedule will be reviewed and updated.
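A schedule can be as simple as a table. The sketch below is purely illustrative: the programs, evaluators and dates are hypothetical, and the columns are only one way of capturing the items above.

Program | Type of evaluation or review | Evaluator | Expected completion |
---|---|---|---|
Program A (hypothetical) | Process evaluation | Internal evaluation team | Q4 2025 |
Program B (hypothetical) | Outcome evaluation | External provider | Q2 2026 |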
Type of evaluation needed – process, outcome and/or economic
The most common types of program evaluation within government are:
- process evaluation
- outcome evaluation
- economic evaluation.
Process evaluation is mainly, but not solely, used for formative purposes. Both outcome evaluation and economic evaluation are used mainly for summative purposes.
Other evaluation tools (such as needs assessment, program logic, and evaluability assessment) may be used in preparing a program evaluation brief or to inform program planning.
Types of evaluation
Type | Focus |
---|---|
Process evaluation | Investigates how the program is delivered, including efficiency, quality and customer satisfaction. May consider alternative delivery procedures. It can help to differentiate ineffective programs from failures of implementation. As an ongoing evaluative strategy, it can be used to continually improve programs by informing adjustments to delivery. |
Outcome evaluation (or impact evaluation) | Determines whether the program caused demonstrable effects on specifically defined target outcomes. Identifies for whom, in what ways and in what circumstances the outcomes were achieved. Identifies unintended impacts (positive and negative). Examines the ways the program contributed to the outcomes, and the influence of other factors. |
Economic evaluation | Addresses questions of efficiency by standardising outcomes in terms of their dollar value to answer questions of value for money, cost-effectiveness and cost-benefit. These types of analyses can also be used in formative stages to compare different options. A worked example follows this table. |
Needs assessment | As part of program planning, assesses the level of need in the community, and what might work to meet the need. For an existing program, assesses who needs the program, and how great the need is. |
Program logic | Used for program planning and for framing a program evaluation to ensure there is a clear picture of how and why the program will produce the expected outcomes. |
Evaluability assessment | Used in developing a program evaluation brief to determine whether a program evaluation is feasible and how stakeholders can help shape its usefulness. This is useful if implementation has commenced without an evaluation plan. |
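To make the economic evaluation row concrete, here is a minimal worked example. The figures are hypothetical, and the formulas are the standard benefit-cost ratio (BCR) and net present value (NPV), not anything prescribed by the guidelines. For a program with discounted costs of $2.0 million and discounted monetised benefits of $5.0 million:

$$\text{BCR} = \frac{\text{PV of benefits}}{\text{PV of costs}} = \frac{\$5.0\text{m}}{\$2.0\text{m}} = 2.5 \qquad\qquad \text{NPV} = \$5.0\text{m} - \$2.0\text{m} = \$3.0\text{m}$$

A BCR above 1 (equivalently, a positive NPV) indicates that the monetised benefits exceed the costs. Comparing the BCRs of different options is one way such analyses can be used in formative stages.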
Scope and focus of the evaluation
All program evaluations should:
- be as rigorous as possible
- aim to produce valid and reliable findings
- reach sound conclusions.
The evaluation brief needs to consider an evaluation design that addresses:
- rigour
- utility
- feasibility
- ethical safeguards.
Whenever feasible and appropriate, program evaluation should aim to measure program outcomes. Planning for rigorous outcome evaluations should begin as early as possible to allow for a strong evaluation design, which may include comparison groups for quasi-experimental or experimental approaches. The evaluation design will detail how the required program data will be collected.
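As a sketch of why comparison groups strengthen a design, consider difference-in-differences, one common quasi-experimental approach. The notation below is illustrative, not drawn from the guidelines. With outcome averages measured before and after the program for a treated group and a comparison group, the estimated program effect is:

$$\hat{\tau} = \left(\bar{Y}_{\text{treated}}^{\,\text{after}} - \bar{Y}_{\text{treated}}^{\,\text{before}}\right) - \left(\bar{Y}_{\text{comparison}}^{\,\text{after}} - \bar{Y}_{\text{comparison}}^{\,\text{before}}\right)$$

The comparison group's change nets out trends that would have occurred without the program, which is why identifying a credible comparison group early makes a stronger outcome evaluation possible.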
It is never feasible or appropriate to try to evaluate every aspect of a program. Any evaluation project needs boundaries in its scope and a focus on key issues, for example:
- a program evaluation might look at how a program has been implemented in the past 3 years, rather than since it began, or
- it might look at performance in particular regions or sites, rather than across the whole state.
An outcome evaluation may focus on:
- outcomes at particular levels of the program logic
- particular components of the program.
A process evaluation may focus on the activities of particular stakeholders, such as frontline staff or interagency coordination.
Key stakeholders
Key stakeholders are likely to include:
- senior management in the agency
- the Strategic Centre
- program managers
- program partners
- service providers
- peak interest groups (such as those representing industries or program beneficiaries).
In developing the evaluation brief you should consider:
- the questions that significant stakeholders will have of the program
- when stakeholders need answers to their questions
- how they will use the information you provide.
One method is to map significant stakeholders and their actual or likely questions.
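As a purely illustrative sketch (the stakeholders and questions below are hypothetical examples, not a prescribed template), such a map might look like:

Stakeholder | Actual or likely questions | When answers are needed |
---|---|---|
Senior management | Is the program achieving its intended outcomes? | Before the next budget cycle |
Program managers | Which delivery processes need adjustment? | During implementation |
Service providers | How well is the program working for our clients? | At interim reporting points |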
Stakeholders will also have expectations about the most credible evidence to answer their questions. Stakeholders will have differing understandings of:
- the program
- the extent to which it can be evaluated
- the suitability of different evaluation designs and methods.
You need to be clear about:
- their interests and understanding of the program
- how stakeholder interests should be reflected in the evaluation
- how stakeholder expectations can be managed throughout the evaluation.
Key evaluation questions
A program evaluation should focus on only a small set of key questions. These are not questions asked directly in an interview or questionnaire, but high-level research questions that will be answered by combining data from several sources.
Key evaluation questions for the 3 main types of evaluation
Type | Typical key evaluation questions |
---|---|
Process evaluation | How is the program being delivered? Is it being implemented as intended, and how efficiently? How satisfied are customers, and could alternative delivery procedures work better? |
Outcome evaluation (or impact evaluation) | Did the program cause demonstrable effects on the defined target outcomes? For whom, in what ways and in what circumstances were outcomes achieved? Were there unintended impacts (positive or negative)? How much did the program, rather than other factors, contribute to the outcomes? |
Economic evaluation (cost-effectiveness analysis and cost-benefit analysis) | Does the value of the outcomes justify the costs? Is the program more cost-effective than alternative ways of achieving the same outcomes? Do the benefits outweigh the costs? |
Appropriateness, effectiveness and efficiency
In this toolkit, we use 3 broad categories of key evaluation questions to assess whether the program is appropriate, effective and efficient.
Organising key evaluation questions under categories helps you assess how appropriate, effective and efficient your program is and in what circumstances. Suitable questions under these categories will vary with the different types of evaluation (process, outcome or economic).
Category | Typical key evaluation questions |
---|---|
Appropriateness | Does the program address an identified need? Is it consistent with government priorities? Is it the best response to that need? |
Effectiveness | Is the program achieving its intended outcomes? For whom, in what ways and in what circumstances? |
Efficiency | Is the program delivering value for money? Could the same outcomes be achieved at lower cost, or greater outcomes for the same cost? |
While you can use different processes to develop evaluation questions, they should emerge as you work through the activities in this step:
- the purpose of the evaluation
- the type of evaluation
- stakeholder interests
- preliminary assessments.
There may be formal or general evaluation questions that are required because of legislation or arrangements such as the National Partnership Agreements.
To clarify the purpose and objectives of an evaluation, there should be a limited number of higher-order evaluation questions (roughly 3 to 5 questions). There can be sub-questions underneath each higher-order question. The higher-order questions can be grouped under the categories of appropriateness, effectiveness and efficiency.
A way to test the validity and scope of evaluation questions is to ask: "When the evaluation has answered these questions, have we met the full purpose of the evaluation?"
What is already known about the program?
You can prepare for a program evaluation by doing preliminary investigations about the program and the scope for evaluation. These investigations may:
- confirm that an evaluation is required irrespective of the state of the program
- identify what is needed to make the program more able to be evaluated
- demonstrate that it is not worth evaluating the program.
Three methods to prepare for an evaluation and inform an evaluation brief are:
- review program logic
- use evaluability assessment to check readiness for evaluation
- identify what is already known about the program.
Review the program logic
Reviewing or developing the program logic is an important prelude to an evaluation. It should provide a useful description of the program and its intended outcomes that will help shape the evaluation questions and data collection methods.
Key evaluation questions for program logic analysis include:
- What is the problem the program is trying to solve or outcomes it is trying to achieve?
- How plausible is it that the program activities will achieve the intended outcomes?
- How appropriate is the program in relation to government policy?
Program logic can also be used to assess whether the program is still appropriate. If the program is no longer appropriate then program logic can provide a basis for discontinuation without the need for further evaluation. For example, program logic analysis can show whether the intended outcomes are still appropriate and link to government priorities.
Program logic analysis can assess whether the program activities and immediate outcomes link to the intended outcomes, either logically or using evidence from the research literature.
Use evaluability assessment to assess readiness for evaluation
Evaluability assessment is used to determine whether a program evaluation is feasible and, if so, what form it should take. It will also help identify what will make a program more able to be evaluated, such as refining the program logic or improving the collection of monitoring data.
An evaluability assessment is particularly important if implementation has commenced without an evaluation plan. For example, an evaluability assessment may find that no data on program outcomes is being collected, pointing to data collection design work needed before conducting an outcome evaluation. The findings from an evaluability assessment should inform the evaluation design (see Step 4: Manage development of the evaluation design) and the feasibility of the program evaluation.
Questions for evaluability assessment include:
- Does the program have a plausible program logic?
- Is there a clear purpose and objectives for the evaluation?
- Can you clearly identify an audience for the evaluation and how the findings will be used?
- Are there sufficient resources to conduct an evaluation? Is there suitable data from program implementation and/or monitoring or is it possible to collect data?
- Can a comparison group be identified to better determine program impacts and outcomes?
Identify what is already known that is relevant to answering key evaluation questions
You shouldn't conduct an evaluation when answers can be found in existing data. Before considering program evaluation, analyse performance monitoring data, and scan for evidence about comparable programs.
An analysis of available program monitoring data should reveal trends, patterns and issues with program implementation, and possibly program outcomes. This analysis can answer some questions about the program, and point to other questions that the program evaluation should address.
A scan for existing evidence about effectiveness of comparable programs in other jurisdictions or internationally can point to expected outcomes, standards and issues. These can also inform:
- the development of evaluation questions
- the evaluation design
- methods of data collection
- standards for assessing performance.
Reporting and communication
Evaluation reports are usually the most significant product of a program evaluation project. The final report, either in full or summary form, needs to reach the intended audiences through formats and channels that are meaningful to them. You need to consider which stakeholders will be the audience for the evaluation reports, and how they might use these. Evaluations may be designed to inform decisions in the budget and policy cycle, meaning that reports are required at specific times.
Paying attention to reporting needs when developing an evaluation brief can help:
- clarify expectations about when information from a program evaluation is needed
- clarify the timeframe for delivering evaluation reports.
For some programs, interim evaluation reports can be timetabled to provide preliminary findings to decision-makers.
The planning stages of a program evaluation should consider the practice principle in the NSW Government Evaluation Guidelines that evaluation processes should be transparent and open to scrutiny. You should consider how the evaluation findings, methods and data might be shared within government.
An evaluation report on any program delivering services to the public should be publicly released in a timely manner, except where there is an overriding public interest against disclosure.
During this step, you should set out the reporting requirements which will be further developed in the workplan. You will need to:
- identify suitable reporting for key audiences
- consider issues of length, structure, style, and whether to publish – particularly if tendering services from an external agency
- develop a timeframe for reporting to meet evaluation purposes.
Reporting and communication about the evaluation can have an important influence on how the findings will be used.
Decide on the balance of internal or external evaluation
One of the principles in the NSW Government Evaluation Guidelines is that evaluations should be conducted with the right mix of expertise and independence. In deciding who conducts the evaluation, issues to consider are:
- knowledge of the program or policy
- expertise of the evaluators in program evaluation
- perceived and actual independence of the evaluator from the program
- credibility of the evaluator in the eyes of the intended audience.
For many evaluation projects, a partnership between the internal managers and the external evaluators may be effective and cost-effective. The degree of partnership will vary depending upon capacity, logistics and the need for independence. The internal team is often best suited to:
- manage the overall process and governance arrangements
- provide data from administrative systems
- coordinate intra-government arrangements such as organising stakeholder interviews.
A well-managed partnership approach can:
- bring flexibility to the evaluation
- reduce delays
- be cost-effective
- promote learning about evaluation within program management.
Possible scenarios for internal or external evaluation
An evaluation can be designed and managed internally where:
- the program is a small to moderate investment and low risk (tier 1 or tier 2 in the NSW Government Evaluation Guidelines)
- the evaluation is limited in scale
- internal staff have skills and resources for systematic data collection and analysis.
External service providers can contribute to hybrid evaluations in different ways:
- supporting internal staff to conduct an evaluation through facilitation or coaching
- undertaking one or more components (for example, specialist data collection or analysis, or reporting)
- providing an external review of a process or product (for example, evaluation design, data collection instruments, evaluation report).
For some small evaluations, or where an evaluation repeats a previous design, the evaluation can be designed internally and an RFT then used to engage an external group to implement it.
External expertise can be useful to design the evaluation. This can either be done as:
- part of proposals for the evaluation, or
- a separate project (if the evaluation is very large and complicated).
Develop an evaluation strategy (for large programs)
The scale of the evaluation should be proportionate to the size or significance of a program, as set out in the NSW Government Evaluation Guidelines. For large programs, this may involve a series of evaluation projects and related activities. Large programs could include those that:
- are large-scale (significant investment, extended reach)
- run for 3 or more years
- are complicated (multiple sub-programs, across agencies or whole of government).
Such programs may warrant an evaluation framework and strategy that sets out:
- a series of evaluation projects and activities for data development
- evaluation capacity building over the period of the program.
This will allow you to build in process, outcome and economic evaluations at key times that:
- match the developing maturity of the program, and
- meet the needs for information for formative and summative purposes.
The evaluation framework and strategy can be developed at the time of the program design and reviewed at milestones, such as after the delivery of each evaluation report.
The investment in the evaluation
All evaluations will require an investment of financial and staffing resources commensurate with the scale of the program and the evaluation. In the program design stage, a proportion of the budget (and/or internal staff time) should be allocated to cover evaluation activities.
The cost of an evaluation project will be shaped by:
- the scope of the evaluation activities
- whether evaluation activities are to be carried out internally or by external consultants
- the extensiveness of additional data collection, analysis and report writing.
The detailed tasks and scope of an evaluation project will not be clear until the design step. However, the budget allocated for an evaluation project will indicate the extent and depth of work that can be commissioned.
Governance mechanisms and stakeholder engagement strategy
A governance mechanism, such as a steering committee or advisory group, should be established to provide direction or advice at various stages of the evaluation. The benefits include a greater range of perspectives and expertise, as well as greater ownership of the evaluation process by key stakeholders.
You should consider a governance group that matches the purpose and the scale of the evaluation. The group may be entirely within government, or include government and external stakeholders.
In most cases, membership should extend beyond the program itself to include relevant people from elsewhere in the agency or from partner agencies. For significant evaluations (tier 3 or 4), you should consider including a representative from the Centre for Program Evaluation. External stakeholders can include key academics who research the program area, representatives of peak groups for program clients, industry bodies, and program service providers.