Suppose your organization is interviewing candidates for an important job. Would it be better for one trusted person to have an extended interview with them, or for several people to talk to them for less time? How many people would you need to conduct the interviews? Would three be enough? Would ten be too many? If ten is good, wouldn’t twenty be even better?
We’ve discussed thin-slicing studies before — the idea that a few brief exposures to an individual can give just as accurate an impression of key traits as much more extended interactions. For judging sexual preference in men, a 10-second exposure to pictures of faces isn’t any better than a 50-millisecond exposure. For evaluating teaching ability, a few 10-second movie clips are nearly as good as an entire semester in class.
But these studies didn’t vary the number of times judges were exposed to the images or video clips. Could seeing more small bits of information about an individual help people make more accurate judgments? A team led by Peter Borkenau recognized that the vast quantities of data collected in the 1990s for the German Observational Study of Adult Twins (GOSAT) could be used to answer that question. The GOSAT recruited 300 pairs of twins, who underwent detailed personality and intelligence testing, and were also extensively interviewed and videotaped. Borkenau’s team wasn’t interested in the twins’ similarities and differences, so they split each pair, analyzed the data twice (once for each resulting group of 300 unrelated individuals), and then averaged the results together.
Each twin’s personality was also rated by two acquaintances, the experimenter who guided their session, and a confederate who had participated in six videotaped sessions with them. The twins were videotaped for a total of fifteen sessions, doing things like introducing themselves, recalling objects they had just seen, telling jokes, and reading newspaper headlines.
Each of these 15 video clips was then shown to judges, who rated the person in it on 20 personality traits and intelligence, using a five-point scale (e.g. 1 = “unintelligent” and 5 = “intelligent”). Each judge saw only one clip from each individual, and each clip was viewed by four judges. Altogether, the judges made 1.26 million ratings. So does the rating of just one clip of an individual correlate with that person’s actual score on the personality tests? Yes it does, but the strength of the correlation varies depending on which trait is being measured. This graph shows the results:
The horizontal axis shows the number of different clips being analyzed, while the vertical axis shows the correlation between the judges’ ratings of personality categories and the individual’s actual score on the personality test. So when just one clip was shown, there was a correlation, but it was quite low — less than .05 for “conscientiousness,” and .21 for the best correlation, “extraversion.” The researchers then randomly selected additional judges’ reports to average together for viewings of multiple clips. So for 2 clips, you’re looking at an average of the ratings for a variety of different combinations of judges’ ratings — joke-telling and introductions, or object recall and headline-reading, for example. And as you might expect, the correlations get stronger as more clips are analyzed. But once about 6 clips are analyzed, the correlation doesn’t get significantly stronger. There doesn’t appear to be much reason for the judges to watch more than 6. It’s important to point out that even the strongest correlations here aren’t especially strong. A correlation of .3 is probably best characterized as “moderate,” and acquaintances did better, averaging about a .45 correlation with the twins’ personality test results.
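One way to see why the curves flatten is a standard psychometric result for averaged ratings. The sketch below is an illustration under stated assumptions, not the researchers' analysis: it assumes every clip rating is standardized, correlates equally (r_single) with the test score, and correlates equally (r_between) with ratings of the other clips. The .21 comes from the extraversion figure above; the value of r_between is hypothetical and chosen only for illustration.

```python
import math

def aggregated_validity(r_single, r_between, k):
    """Correlation between the mean of k clip ratings and the test score.

    Assumes standardized ratings that all share the same correlation with
    the test score (r_single) and with each other (r_between).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return r_single * math.sqrt(k) / math.sqrt(1 + (k - 1) * r_between)

R_SINGLE = 0.21   # single-clip correlation for extraversion, from the text
R_BETWEEN = 0.30  # ASSUMED correlation between ratings of different clips

# Expected correlation for 1 to 15 clips under this simple model.
curve = {k: aggregated_validity(R_SINGLE, R_BETWEEN, k) for k in range(1, 16)}

# Upper limit the curve approaches as the number of clips grows.
ceiling = R_SINGLE / math.sqrt(R_BETWEEN)

for k in (1, 2, 6, 15):
    print(f"{k:2d} clips: r = {curve[k]:.3f}")
print(f"ceiling as k grows: r = {ceiling:.3f}")
```

In this model the correlation keeps rising with every added clip, but it approaches a fixed ceiling, so most of the gain arrives in the first handful of clips. The exact numbers depend entirely on the assumed r_between and should not be read as the study's values.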
But remember, judges also were asked to rate the apparent intelligence of the participants, who had also taken two different intelligence tests. This graph shows how well the judges’ scores compared to the actual test scores:
APM and LPS are simply two different types of intelligence tests. These correlations are stronger than the correlations for personality — the correlation between judges’ ratings and actual test scores reached .53 for the LPS test. However, test scores were also correlated with gender and age. Once these effects were controlled for, the LPS correlation dropped to .46 — still a moderately strong correlation.
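"Controlling for" a variable can be illustrated with a first-order partial correlation. This is only a sketch of the general idea: the study adjusted for both gender and age, while the function below handles a single covariate, and the two covariate correlations are hypothetical values, not figures from the paper. Only the .53 is taken from the text.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z from both."""
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("z is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom

R_RATING_TEST = 0.53  # judges' ratings vs. LPS score, from the text
R_RATING_AGE = 0.25   # ASSUMED: ratings vs. age (hypothetical)
R_TEST_AGE = 0.25     # ASSUMED: LPS score vs. age (hypothetical)

adjusted = partial_correlation(R_RATING_TEST, R_RATING_AGE, R_TEST_AGE)
print(f"raw r = {R_RATING_TEST:.2f}, adjusted for one covariate r = {adjusted:.2f}")
```

When a covariate is related in the same direction to both the ratings and the test scores, part of the raw correlation is shared with it, so the adjusted value comes out lower, which is the direction of the change reported above.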
As with the personality measures, once the ratings of more than six clips were combined, there wasn’t much of an improvement in the correlation between the test scores and the ratings.
So using a bunch of judges to watch short clips of behavior can be a good way to judge personality and intelligence, but only up to a point. Once you combine ratings of more than six clips, there isn’t much improvement. And watching these thin slices of behavior doesn’t work equally well for every trait: while it works pretty well for intelligence, it’s not so useful for judging conscientiousness. In our hypothetical job-interview situation, you’d probably be better off just checking the candidates’ references than making a guess based on what you see in the interview.
Borkenau, P., Mauer, N., Riemann, R., Spinath, F., & Angleitner, A. (2004). Thin Slices of Behavior as Cues of Personality and Intelligence. Journal of Personality and Social Psychology, 86 (4), 599-614. DOI: 10.1037/0022-3514.86.4.599