As schools begin to move their teacher evaluation systems beyond compliance, we need to remember to examine inter-rater reliability within the system. Inter-rater reliability serves to decrease biases and increase transparency in the evaluation process.
But what is inter-rater reliability, and how can we make sure it is built into our schools? In simplest terms, inter-rater reliability is the degree to which different trained evaluators assign the same ratings to the same teaching. Building it requires multiple observations by various individuals who have been trained in the evaluation process, and it develops when leadership teams track, analyze, and compare evaluation data.
Elements of Inter-rater Reliability
Continually nurturing the following elements of inter-rater reliability helps a teacher evaluation program limit bias and remain focused on professional growth:
- Training: We train new administrators in our evaluation processes, but to ensure inter-rater reliability, training of evaluators must be ongoing. Assigning a numerical value to teacher effectiveness with a predetermined rubric helps to decrease bias in the evaluation process, but only when evaluators are familiar with the components of the rubric.
- Multiple evaluators: This is challenging for small schools where one administrator bears sole responsibility for employee evaluations. Leadership may need to consider using external evaluators to ensure inter-rater reliability and safeguard against bias in the evaluation process.
- System of documentation: Whatever steps a school district has established for employee evaluation, all correspondence and every step in the process require thorough documentation that is available for review by all parties.
- Identify trends: Leadership teams should identify areas of strength and weakness among evaluators and look for patterns in specific instructional domains among teachers. For example, a large number of ineffective ratings on a specific indicator in a building signals a need for schoolwide professional development in this area.
- Communication: What effective instruction looks like remains a matter of local control. Frequent conversations among leadership teams and professional learning communities on this topic are helpful to the development of inter-rater reliability. Norming activities, in which evaluators co-evaluate either teaching videos or actual classroom visits and then compare results, help evaluators become more consistent in their practices and answer the question, “What are we looking for in this domain?”
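For teams that want to put a number on the "compare results" step of a norming activity, here is a minimal sketch in Python. It assumes two evaluators have each scored the same eight lessons on a 1 to 4 rubric; the scores, variable names, and function names are hypothetical and chosen only for illustration. It reports simple percent agreement and Cohen's kappa, a common statistic that adjusts agreement for what two raters would be expected to match on by chance. It is one possible measure, not a prescribed method.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of lessons on which both evaluators gave the same score."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and the same length")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two evaluators, corrected for chance agreement."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: probability both evaluators pick the same score
    # if each kept their own overall mix of scores.
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both evaluators used one identical score throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores from a norming session: eight lessons, 1-4 rubric.
evaluator_a = [3, 3, 2, 4, 3, 2, 3, 4]
evaluator_b = [3, 2, 2, 4, 3, 3, 3, 4]

print(f"{percent_agreement(evaluator_a, evaluator_b):.2f}")  # 0.75
print(f"{cohens_kappa(evaluator_a, evaluator_b):.2f}")       # 0.60
```

In this made-up example the evaluators match on six of eight lessons, but kappa is lower because some of that agreement would occur by chance. The two lessons where they differ are a natural starting point for the conversation about what we are looking for in that domain.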
As educational leaders, we have a responsibility to ensure that evaluation practices are fair and consistent in order to establish trust among those being evaluated. As part of this process, we must continually ask ourselves:
- Are we analyzing evaluator data to ensure that evaluators are consistently providing appropriate feedback?
- Are we establishing districtwide norms for what effective instruction looks like among our evaluators?
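One way to begin answering the first question is to summarize the ratings we already collect. The sketch below is illustrative only: the records, evaluator labels, and indicator names are hypothetical, and it assumes a 1 to 4 rubric on which a score of 1 means "ineffective." It computes each evaluator's average score and, for each indicator, the share of ratings that were ineffective.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical evaluation records: (evaluator, indicator, score on a 1-4 rubric).
records = [
    ("Evaluator A", "Questioning", 3),
    ("Evaluator A", "Questioning", 4),
    ("Evaluator A", "Classroom environment", 4),
    ("Evaluator A", "Classroom environment", 3),
    ("Evaluator B", "Questioning", 1),
    ("Evaluator B", "Questioning", 2),
    ("Evaluator B", "Classroom environment", 3),
    ("Evaluator B", "Classroom environment", 2),
]


def mean_score_by_evaluator(records):
    """Average score each evaluator assigned across all observations."""
    scores = defaultdict(list)
    for evaluator, _indicator, score in records:
        scores[evaluator].append(score)
    return {evaluator: mean(values) for evaluator, values in scores.items()}


def ineffective_share_by_indicator(records, ineffective_max=1):
    """Share of ratings per indicator at or below the assumed 'ineffective' score."""
    totals = defaultdict(int)
    ineffective = defaultdict(int)
    for _evaluator, indicator, score in records:
        totals[indicator] += 1
        if score <= ineffective_max:
            ineffective[indicator] += 1
    return {ind: ineffective[ind] / totals[ind] for ind in totals}


print(mean_score_by_evaluator(records))
print(ineffective_share_by_indicator(records))
```

A gap between evaluators' averages does not prove that one of them is too lenient or too strict, since they may have observed different teachers. It does tell the leadership team where a co-observation or norming conversation is likely to be most useful.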
Meeting these criteria helps us to move beyond compliance and provide fair, consistent evaluation support and feedback for our teachers and ultimately improve instruction for our students.