Family lore has it that my uncle was influential in instituting what is now a fixture in college education: student evaluation of college instructors. He was class president at the University of Washington in the 1960s, when tensions between students and the school administrators were high, and he suggested implementing one of the first student course evaluation systems in the nation as a way to address the problem. Needless to say, the idea caught on.
While college faculty complain unceasingly about the fairness of the now nearly universal student course evaluation system (I did it myself, back when I taught college courses), it has in general been shown to be a relatively reliable indicator of teacher effectiveness, correlating positively with other measures such as faculty and administrator evaluation, as well as actual student learning.
From the teacher’s perspective, however, the students can’t possibly have enough information to make an effective evaluation of their teaching. A college course represents just a tiny sliver of the total knowledge in a discipline, and even after a semester in a college course, students are in no position to make judgements that will impact a faculty member’s entire career.
A 1993 study by Nalini Ambady and Robert Rosenthal found just the opposite: students actually need much less information to make judgements that accurately predict end-of-semester evaluations.
Ambady and Rosenthal extracted three 10-second video clips for each of 13 teachers from tapes of entire class sessions. These 39 clips were randomized and presented without sound to 9 female college students, who rated them on a scale from 1 to 9 for a variety of behaviors, including “attentive,” “confident,” and “supportive.” The ratings were highly consistent between judges, with a global reliability measure of .85 overall.
These teachers were rated by their own students at the end of the term on a general effectiveness scale. I’ve created a table below to show the correlation of the ratings of the 30 seconds’ worth of clips with the end-of-semester rating:
The significant correlations — 9 of the 15 measures — are in boldface type. Concerned that their measure may only reflect a cursory evaluation of the physical attractiveness of the teachers, Ambady and Rosenthal had separate judges rate the teachers for attractiveness based on still photos. Even after controlling for physical attractiveness, the correlation between student ratings and the video clip ratings was still significant. Apparently after seeing just 30 seconds of nonverbal behavior, we can reliably predict teaching ability.
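To make "controlling for physical attractiveness" a little more concrete, here is a minimal sketch of one common way to do it: a first-order partial correlation, which asks how strongly two measures are related once the part each shares with a third measure is removed. I'm assuming this approach purely for illustration; it may not be the exact analysis Ambady and Rosenthal ran. The function names and every number below are invented, not taken from the study.

```python
import numpy as np


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def partial_r(x, y, z):
    """Correlation between x and y after removing what each shares with z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return float((r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2)))


# Hypothetical scores for 13 teachers (made up for this example only).
clip_rating = [4.1, 5.3, 6.0, 3.2, 7.1, 5.5, 4.8, 6.4, 3.9, 5.0, 6.8, 4.4, 5.9]
semester_eval = [3.0, 3.6, 4.1, 2.8, 4.5, 3.9, 3.1, 4.2, 3.0, 3.4, 4.4, 3.3, 3.8]
attractiveness = [5.0, 4.2, 6.1, 4.0, 5.5, 6.3, 3.8, 5.9, 4.6, 5.2, 6.0, 4.1, 4.9]

print(f"raw correlation:     {pearson_r(clip_rating, semester_eval):.2f}")
print(f"partial correlation: {partial_r(clip_rating, semester_eval, attractiveness):.2f}")
```

If the partial correlation stays close to the raw one, attractiveness isn't doing much of the work, which is the pattern the researchers reported for their real data.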
Not satisfied with comparing results only to student evaluations, Ambady and Rosenthal repeated the experiment with videotapes of high school teachers, and compared them to effectiveness ratings provided by the school principal. The results were comparable.
So how thin a slice of behavior is needed to accurately predict teaching ability? The researchers had an assistant unfamiliar with the task randomly select 5- and 2-second clips from the original 10-second clips. They repeated the rating task with a new group of female college students. The ratings for these shorter clips were less reliably correlated with the teacher effectiveness ratings, but amazingly, the 2-second ratings for college teachers were still significantly correlated with overall end-of-semester effectiveness ratings. Though these were the only short clips that were significantly correlated with effectiveness, neither the 5-second nor the 2-second ratings were significantly different from the 10-second ratings. What’s more, if the short ratings for college and high school teachers are combined, they do significantly predict effectiveness ratings.
So we do appear to be quite effective at making judgements about teaching ability even after viewing only a total of 6 seconds of actual teaching (three 2-second clips), and without even hearing the teacher’s voice.
So what does this suggest about my uncle’s system of student evaluations of teachers? Not much, directly, but it does suggest that students who choose courses by visiting lots of different classes during the first week may not be any less rational than those who pore over the student-produced faculty guidebook.
Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64(3), 431-441.