By: Matthew Wilson, Shilpa Sahay and Cheryl Calhoun, edited by Steve Covello
About evaluation
Creating a program of instruction, as scientific as it intends to be, is an exercise in prediction. IDs are skilled at gathering information, making design decisions, and assembling a program of instruction into a coherent form, but there are no guarantees that learners will experience the instruction as expected, nor is there a guarantee that all learners will achieve the Learning Outcomes. Many confounding factors may impede the success of a given program such as unexpected technical problems, poor teaching engagement, or flaws in the design.
To account for these uncertainties, the ADDIE model includes a built-in self-evaluation phase. Whereas an instructional program assesses learners, the Evaluation phase assesses the program itself:
- Were learners adequately prepared prior to the instructional program?
- Have learners achieved the Learning Outcomes, in whole or in part?
- Was the instruction effective for the requirements of the instructional program?
- Are learners able to transfer their learning into the desired contextual setting?
- Do the lesson plans, instructional materials, media, assessments, etc. meet the learning needs?
- Were the methods of implementation effective?
- What was the return on investment (ROI) for the cost of instruction?
The answers to these questions help determine whether the program of instruction needs to be improved (formative evaluation) and whether it has met its goals (summative evaluation).
In the professional ID field, evaluation is a highly complex discipline with a wide range of approaches to consider for each project. This chapter describes some of the foundational principles of evaluation that are relevant to students of Communications.
Your goal in this stage is to produce the following sections:
Evaluation goal and audience: A statement that describes the goals and purpose of the evaluation based on stakeholders’ needs and interests; a description of the audiences for whom the evaluation would be of interest.
Formative evaluation plan: A description of how the program of instruction will be reviewed, tested, and revised prior to official launch.
Summative evaluation plan: A matrix of questions and analysis plans designed to produce data from which you can determine whether the program of instruction was successful.
Determining goals and audiences
All evaluation plans must have a clearly defined goal, just as a program of instruction must have a goal. The ID works with stakeholders to establish the goals for an evaluation so that the methods of inquiry are appropriately aligned to them.
For example, stakeholders may be interested in the time it takes to complete a program of instruction given the cost of trainees’ absence from production to attend training. In such a case, return on investment (ROI) would factor into evaluating a training program.
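The ROI consideration above can be made concrete with a small calculation. The following is a minimal sketch, assuming the common definition of ROI as (benefit minus total cost) divided by total cost; the function name, parameters, and all figures are hypothetical and are not taken from any real project.

```python
def training_roi(benefit, delivery_cost, trainees, hours_per_trainee,
                 hourly_production_value):
    """Return ROI as a fraction: (benefit - total cost) / total cost.

    Assumption: the cost of trainees' absence from production is
    trainees * hours away * value of one hour of production.
    """
    absence_cost = trainees * hours_per_trainee * hourly_production_value
    total_cost = delivery_cost + absence_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return (benefit - total_cost) / total_cost


# Hypothetical figures for illustration only.
roi = training_roi(
    benefit=30000,            # estimated value of improved performance
    delivery_cost=8000,       # instructor, materials, facilities
    trainees=20,
    hours_per_trainee=6,
    hourly_production_value=100,
)
print(f"ROI: {roi:.0%}")
```

A longer program raises the absence cost and lowers ROI unless the benefit rises with it, which is why stakeholders in this example care about completion time.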
Goals: In determining the goal of an evaluation plan, the ID conducts an evaluation needs assessment with their stakeholders:
- What are the organization’s goals and strategic interests?
- How does the program of instruction fit into those goals and strategic interests?
- What decisions do stakeholders need to make?
- How will the evaluation inform those decisions?
- What will be the critical areas of focus in the program of instruction?
The answers to these questions will inform the orientation of your evaluation questions and the basis of your interpretation of the data.
Audiences: The evaluation plan and its outcome are intended to serve the needs of individuals and groups, to one degree or another. Each of these audiences must be identified, since the audience informs how you craft the questions and interpret the findings. A typical list of audiences includes the following:
- Entities that have sponsored or funded the evaluation.
- Executives who have requested the evaluation to be conducted.
- Management staff affected by the outcome of employee training and performance.
- Staff/trainees who are the subject of the program of instruction.
- Customers/constituencies that are affected by the performance of staff.
Forms of evaluation
There are two basic forms of evaluation conducted by IDs. Formative evaluation is intended to refine the program of instruction itself before it is launched. Summative evaluation is conducted after several sessions of instruction have been implemented to determine whether the program was effective.
Formative evaluation
Formative evaluation is a process for testing the program of instruction prior to full implementation. It is conducted mainly because the design decisions, media, and manner of implementation may seem ideal on paper, yet the best-intended plans for instruction may not play out as expected. Formative evaluation gathers feedback to determine whether any of the work up to this point needs to be reviewed or revised.
Formative evaluation is conducted iteratively in at least three phases. It begins with one-to-one evaluation, then small group evaluation, and finally a field trial, as seen in Figure 1. Results from each phase of evaluation are fed back to the instructional designers to be used in the process of improving design.
Figure 1. The cycle of formative evaluation.
Each of the phases in the cycle of formative evaluation is described below.
One-to-one: The purpose of the one-to-one evaluation is to identify and remove the most obvious errors and to obtain initial feedback on the effectiveness of the instruction. During this evaluation IDs should be looking for clarity, impact and feasibility (Dick, Carey, & Carey, 2009, p. 262). Results from one-to-one evaluation can be used to improve instructional components and materials before a pilot implementation.
A one-to-one evaluation is much like a usability study that evaluates the instruction and instructional materials, not the learner. The learner should be presented with the instructional materials that will be provided during the instruction. Encourage the learner to discuss what they see, write on materials as appropriate, note any errors, etc. The ID can engage the learner in dialog to elicit feedback on the materials and clarity of instruction.
Small group: Small group evaluation is used to determine the effectiveness of changes made to the instruction following the one-to-one evaluation and to identify any additional problems learners may be experiencing.
In the small group evaluation, the instructor administers the instruction and materials in the manner in which they are designed. The small-group participants complete the lessons as described. The instructional designer observes but does not intervene. After the instructional lesson is complete, participants are asked to complete a post-assessment designed to provide feedback about the instruction.
Field trial: After the recommendations from the small group evaluation have been implemented, it is time for a field trial. A field trial is conducted exactly as you would conduct the program of instruction. The selected instruction should be delivered as closely as possible to the way it is designed to be implemented, in a setting as close to the intended setting as possible. Learners should be selected who closely match the characteristics of the intended learners. All instructional materials for the selected instructional section, including the instructor’s manual, should be complete and ready to use.
Data should be gathered on learner performance, attitudes, the time required to use the materials in the instructional context, and the effectiveness of the instructional management plan. During the field trial, the ID does not participate in delivery of instruction. The ID and the review team will observe the process and record data about their observations.
After each phase, the ID considers the results of the evaluation and meets with project stakeholders to make design decisions.
Summative evaluation
Summative evaluation is called summative because it is intended to summarize the results of an instructional program after it has been run several times. There are many reasons for conducting a summative evaluation, but the primary one is to determine whether the problem that precipitated the need for instruction has been resolved. Summative evaluation, as a process, also establishes the basis for judging whether the program of instruction has met its intended goals according to an agreed-upon standard.
For example, if a program of instruction seeks to prevent workplace personal injury or death, it would be expected that 100% achievement of this objective would be the only acceptable outcome. By contrast, it would be reasonable to treat a benchmark 90% pass rate among students taking a college Statistics course for the first time as a successful outcome. The evaluator negotiates with stakeholders to establish both the criteria for success and the mark at which the outcome is considered successful.
IDs conducting a summative evaluation are tasked with constructing an inquiry strategy that will produce data from which a summative interpretation can be made. The inquiry strategy includes the following elements:
Audiences: An evaluation report is intended to convey meaningful information to stakeholders about what transpired in the instructional program. The ID must establish who comprises the audience for the evaluation report since the outcome of the evaluation should take into consideration who is affected by its conclusions.
Evaluation questions: The evaluator must formulate questions that produce answers relevant to determining the effectiveness of instruction. Naturally, the design of the questions must be taken with great care so that they produce useful responses. Questions can include a range of interests such as learner expectations, how learned skills/knowledge have been used in authentic conditions, areas of subject matter that were not addressed, etc.
Data collection procedure: The evaluator will determine who needs to be queried about the outcome of instruction, which systems to extract data from, and the means by which this data will be obtained. This part of the plan may also include listing specific individuals or groups to interview or survey.
Analysis procedure: Once data has been gathered, the evaluator will develop a method of analysis to interpret the findings. Analysis methods may include comparing data prior to instruction against data after instruction, or qualitative analysis that looks for indicators in the data that fit a particular theme, such as satisfaction, meeting expectations, comfort level, readiness, etc.
Evaluation criteria and judgement: The evaluator needs to establish the basis for judging the performance of the program of instruction and then set a mark that determines whether the program has succeeded. For example, if a program of instruction is intended to improve workers’ ability to produce an assembly in a production line over a period of six hours, the evaluation criteria (informed by the Learning Outcomes) would be stated as, “Assembly workers should be able to produce 10 assemblies in six hours.” The mark for success of this training program might be set to 90%, meaning that 90% of workers who have completed the training should be able to produce 10 assemblies in six hours. The mark for success can be any reasonable threshold established by the ID and stakeholders under typical conditions.
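The relationship between the evaluation criterion and the mark for success can be sketched in a few lines of code. This is an illustrative sketch only: the function name and the worker output figures are hypothetical, while the criterion (10 assemblies in six hours) and the 90% mark come from the example above.

```python
def meets_mark(assemblies_per_worker, criterion=10, mark=0.90):
    """Return (share of workers meeting the criterion, whether the mark is met).

    assemblies_per_worker: assemblies each trained worker produced in six hours.
    criterion: minimum assemblies a worker must produce to count as successful.
    mark: share of workers that must meet the criterion for the program to pass.
    """
    if not assemblies_per_worker:
        raise ValueError("no worker results supplied")
    passed = sum(1 for n in assemblies_per_worker if n >= criterion)
    rate = passed / len(assemblies_per_worker)
    return rate, rate >= mark


# Hypothetical six-hour output for ten workers who completed the training.
results = [12, 10, 9, 11, 10, 10, 13, 10, 11, 10]
rate, success = meets_mark(results)
print(f"{rate:.0%} of workers met the criterion; mark met: {success}")
```

Keeping the criterion and the mark as separate parameters mirrors the distinction in the text: the criterion describes what one learner must do, and the mark describes how many learners must do it.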
Below is an example of a formative evaluation plan, followed by an Analysis and Interpretation Plan for a summative evaluation that includes the elements described above. The scenario is a training program on HIPAA medical information privacy compliance.
HIPAA Training Program: Formative Evaluation Plan
The training program will be developed in time to conduct a formative test prior to launch as follows:
One-to-one: The ID and trainer/expert will convene to review all course material and activities and conduct a mock walkthrough of the training program to determine whether any additional resources will be needed. The multimedia content will be tested with a selected volunteer from the Acme Family Medical Center staff to obtain user experience responses, check programming accuracy, and surface any other questions or concerns about the content. Results from these formative activities will be applied to program and content revisions.
Small group: Upon completion of revisions, a small group will be convened to retest the program. The group will consist of one member of Acme Family Medical Center supervisory staff, one member of the working staff, and the trainer/expert. A walkthrough of the program will occur, except for the discussions. Feedback will be applied to program and content revisions.
Field trial: Given the relatively small group of trainees and the timeframe for completion, we recommend bypassing field trial and moving on to full launch, with a planned debrief of staff participants following completion of the program to offer feedback and improvements for a possible future HIPAA training program, if needed.
HIPAA Training Program: Evaluation Analysis and Interpretation Plan
Goal: The goal for this summative evaluation is to determine whether the HIPAA training program was effective in reducing instances of HIPAA violation among medical staff involved in the handling of client records. The primary focus of this study is to determine whether the training program caused the reduction of HIPAA violations among trainees who participated. The secondary focus is to determine whether the training program was perceived as well-designed and supported with appropriate resources.
By evaluating the factors that contribute to these outcomes, stakeholders will be able to determine whether the program should be formatively improved and whether additional HIPAA training programs should be implemented.
The client for this evaluation is Dr. Joyce Jones, Executive Director of Acme Family Medical Center PLC, whose responsibility is to oversee company operations.
Audiences: The audiences for this report also include:
- Management executives for the Acme Family Medical Center PLC
- Supervising staff
- Trainees
Interested stakeholders for this report include:
- Legal counsel for Acme Family Medical Center PLC
- Clients and caretakers of Acme Family Medical Center PLC
Research Matrix for the Analysis and Interpretation Plan
| Evaluation Questions | Importance | Collection Procedure | Analysis Procedure | Evaluation Criteria | Procedure for Making Judgment |
| --- | --- | --- | --- | --- | --- |
| Did the HIPAA training program achieve its goal? | This question addresses whether the training program was effective. | Data collection of HIPAA violation reports. | Compare number of HIPAA violation reports before and after training. | Research should indicate a reduction of HIPAA violations. | There should be an 80% reduction of HIPAA violations. |
| Do trainees feel confident about managing client medical records according to HIPAA rules? | This question addresses whether trainees feel they are able to apply their learning in the workplace context. | Interviews with trainees. | Code references in training interviews for feelings about the training. | Research should indicate improvement in trainee confidence in managing medical data according to HIPAA rules. | 90% of trainees should express confidence in managing medical data according to HIPAA rules. |
| Was the instructional media sufficient? | This question addresses whether the instructional media was useful for the instructor and learners according to their needs. | Interview with instructors. Interview with trainees. | Code references to feelings about the instructional media. | Research should indicate that instructors and trainees felt the instructional media was useful for their learning needs. | 90% of responses should indicate that the media used in the program was sufficient for learning. |
| Was the HIPAA training instructor engaged and helpful? | This question addresses whether the teacher-trainer was effective in their instructional engagement. | Interviews with trainees. | Code references to feelings about the instructor and learning experience. | Research should indicate that trainees felt the instructor was a positive factor in their learning experience. | 80% of trainees should express positive feelings about the instructor. |
The sample Analysis and Interpretation Plan includes both quantitative (numeric data) and qualitative (participants’ feelings) methods for evaluating the outcome of the training program, though each project may emphasize one method more than another, depending on the goals of the evaluation.
For example, stakeholders may be more interested in the reduction of HIPAA violations as an outcome than they are with how many trainees passed the training. If all trainees passed the course but HIPAA violations remained unchanged, then the scores of the training wouldn’t mean much since the trainees did not transfer their learning to job performance. It is also possible that, in such an instance, the means of assessing trainees’ learned skills was flawed in a way that falsely indicated proficiency.
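The judgment procedures in the research matrix can be sketched as simple threshold checks. The example below is a hedged illustration: the function names, the code labels, and all counts are hypothetical, while the 80% reduction and 90% confidence thresholds are taken from the matrix.

```python
def percent_reduction(before, after):
    """Return the fractional reduction in violations from before to after."""
    if before <= 0:
        raise ValueError("need at least one violation before training")
    return (before - after) / before


def share_with_code(coded_responses, code):
    """Return the share of coded interview responses carrying a given code."""
    if not coded_responses:
        raise ValueError("no coded responses supplied")
    return sum(1 for c in coded_responses if c == code) / len(coded_responses)


# Hypothetical data: violation reports before and after training.
reduction = percent_reduction(before=25, after=4)
goal_met = reduction >= 0.80          # matrix: 80% reduction of violations

# Hypothetical coding of ten trainee interviews.
codes = ["confident"] * 9 + ["not_confident"]
confidence = share_with_code(codes, "confident")
confidence_met = confidence >= 0.90   # matrix: 90% of trainees confident

print(f"Violation reduction: {reduction:.0%}, goal met: {goal_met}")
print(f"Confident trainees: {confidence:.0%}, mark met: {confidence_met}")
```

The first check is the quantitative comparison of violation reports; the second stands in for the qualitative step, where an evaluator has already coded each interview by hand before any counting takes place.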
Constructing meaningful evaluation questions
The construction of an evaluation question requires unpacking the stakeholders’ goals for the evaluation so that the research questions you propose will produce answers that are meaningful. As you will see, assembling the right combination of questions is a bit of an art form.
In the HIPAA training example, the goal is to determine whether the HIPAA training program was effective in reducing instances of HIPAA violation among medical staff involved in the handling of client records. The line of questioning, therefore, should address the following:
- Did the training cause a change in the post-training performance outcome? If not, why not?
- What evidence can be used to support a claim one way or the other?
- What evidence can be gathered to affirm or validate your findings?
Let’s review the first evaluation question: Did the HIPAA training program achieve its goal?
What evidence would support a claim that it did? In this case, a comparison of HIPAA violation data before and after training may support this claim.
The second question is related to the first question in a more qualitative orientation: Do trainees feel confident about managing client medical records according to HIPAA rules?
While the data collected in the first question supports a claim of causation, the answers to the second question suggest that, if trainees felt well-prepared to manage client records according to HIPAA compliance, then the data is more reliable. It also suggests that the improved level of performance is likely to be sustained.
However, if trainees do not feel that they are well-prepared despite the training, then it calls into question the validity of the claim made in the first question. Perhaps there were other factors that contributed to the improvement of performance shown in the data? In anticipation of this possibility, the evaluator would want to gather information to explain why trainees felt they were insufficiently prepared.
The third question relates to the second: Was the instructional media sufficient?
If trainees indicated that they did not feel confident in their abilities and they indicated that the instructional media was insufficient in some way, it suggests to the ID and stakeholders that a cause of their deficit may be in the media selected as part of the instructional material. If trainees felt the instructional media was sufficient, but they still did not feel well prepared, then perhaps there was another factor.
The fourth question relates to the third: Was the HIPAA training instructor engaged and helpful?
If trainees did not feel confident in their abilities but felt that the instructional media was sufficient, then perhaps the cause of their deficit was in the way the course was taught.
As you can see, the lines of questioning are interrelated so that the evaluator is able to triangulate the data to explain the results, whatever they happen to be.
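The chain of reasoning across the four evaluation questions can be summarized as a small decision sketch. This is only an illustration of the logic described above, not a formal evaluation method: the function name, its boolean inputs, and the wording of the notes are hypothetical, and real findings would rarely reduce to yes/no answers.

```python
def triangulate(violations_reduced, trainees_confident,
                media_sufficient, instructor_effective):
    """Return interpretive notes that combine the four evaluation questions."""
    notes = []

    # Questions 1 and 2: does trainee confidence corroborate the violation data?
    if violations_reduced and trainees_confident:
        notes.append("Reduction is corroborated by trainee confidence "
                     "and is likely to be sustained.")
    elif violations_reduced:
        notes.append("Reduction may be due to factors other than training; "
                     "investigate why trainees feel unprepared.")
    else:
        notes.append("Goal not met; learning may not have transferred "
                     "to job performance.")

    # Questions 3 and 4: look for a cause only if confidence is lacking.
    if not trainees_confident:
        causes = []
        if not media_sufficient:
            causes.append("the instructional media")
        if not instructor_effective:
            causes.append("the way the course was taught")
        if causes:
            notes.append("Possible cause of the deficit: "
                         + " and ".join(causes) + ".")
        else:
            notes.append("Cause lies outside media and instruction; "
                         "gather more data.")
    return notes


# Hypothetical outcome: violations fell, but trainees lack confidence
# and report that the media was insufficient.
for note in triangulate(True, False, False, True):
    print(note)
```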
References
Dick, W., Carey, L., & Carey, J. O. (2009). The systematic design of instruction. Upper Saddle River, New Jersey: Pearson.