When we are trying to understand what someone is saying, we rely a lot on the movement of their face: what we see shapes what we think we hear. The classic example of this is the McGurk effect, where the same sound accompanied by different facial movements gets interpreted differently.
Take a look at this short video clip (QuickTime required) of me talking, with my voice muffled by what sounds like cocktail party conversation:
Can you understand what I’m saying? What about after I stop moving? Can you understand me in the second part of that clip? Go ahead and replay the video to see if you can hear it the second time through.
That’s right, I said two three-word phrases, not just one. If you’re like me, you only heard background noise during the second part of the clip. In fact, I’m curious as to whether anyone can understand me at all. Let’s make this one a poll:
At the end of the post, I’ll play the video with me actually moving, and we’ll see if the results change.
Since the McGurk effect was first described, researchers have studied precisely where we look when we watch someone speak, and found that we’re not always looking at the mouth. Indeed, we look at speakers’ eyes more often. Even more striking, we tend to look disproportionately at the right side of a speaker’s face. Why the right side? Several studies have found that the right side of most speakers’ faces is more expressive than the left side, so we appear to be focusing on the side of the face that offers the most information.
But what if the left side of a particular face were actually offering more information? Would we switch our focus to that side? A team led by Ian T. Everdell showed 28 college students a series of videos similar to the one I presented above. The students’ eye movements were monitored with a tracking device. Speakers uttered one of six phrases, and, as above, sometimes their faces were static and sometimes they were moving. In addition, some of the time the faces were flipped, so what appeared to be the right side of the face was actually the left side in the original.
As expected, viewers could understand the moving faces more often (90 percent of the time) than the static faces (60 percent of the time). Also as expected, for the non-flipped faces, viewers indeed spent more of the time focused on the right side of the speaker’s face. This picture shows the results for one typical viewer:
Some viewers did focus more on the left than the right, but the vast majority of viewers focused on the right side of the speaker’s face. So what about when the faces are flipped?
As you can see, there’s practically no difference in the results. Whether the faces were presented in the original or mirrored form, nearly all viewers focused on the right side of the face as it appeared on screen. Whichever side they favored, viewers were consistent: left-focusers focused on the left side for both normal and mirrored faces.
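If the flipping logic is hard to keep straight, here’s a rough sketch in Python of how you might tally it. This is not the researchers’ analysis code. The fixation records and the function names (`original_side`, `right_share`) are made up for illustration. Each record notes which side of the face a fixation landed on as it appeared on screen, and whether the clip was mirrored.

```python
from collections import Counter


def original_side(shown_side, mirrored):
    """Map the side as it appears on screen to the side in the unflipped recording."""
    if not mirrored:
        return shown_side
    return "left" if shown_side == "right" else "right"


def right_share(fixations):
    """Fraction of fixations landing on the side that appears as the right."""
    counts = Counter(side for side, _ in fixations)
    total = counts["left"] + counts["right"]
    return counts["right"] / total if total else 0.0


# Hypothetical data for one viewer: (side as shown, was the clip mirrored?)
fixations = [
    ("right", False), ("right", False), ("left", False), ("right", False),
    ("right", True), ("right", True), ("left", True), ("right", True),
]

normal = [f for f in fixations if not f[1]]
flipped = [f for f in fixations if f[1]]

# A viewer with a fixed habit shows the same on-screen preference either way...
print(right_share(normal), right_share(flipped))

# ...which means that for mirrored clips, most fixations land on what was
# the left side of the face in the original recording.
flipped_original = Counter(original_side(side, m) for side, m in flipped)
print(flipped_original["left"] / len(flipped))
```

In this made-up example the viewer looks at the on-screen right side equally often in both conditions, so the preference follows the screen, not the face.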
Everdell’s team argues that if we focus on the right side of the face because there’s more information available to help us understand speech, we’re not able to adapt very quickly to different speakers. When we’re confronted with someone who’s more expressive with the left side of their face, we’re not able to instantly adapt and focus on that side of their face.
Oh, one last thing. Were you wondering what I said in the second half of that clip? Here’s the unaltered original video:
Record your answers below. Let’s see if we get a different result now.
Everdell, I.T., Marsh, H., Yurick, M.D., Munhall, K.G., Paré, M. (2007). Gaze behaviour in audiovisual speech perception: Asymmetrical distribution of face-directed fixations. Perception, 36(10), 1535-1545. DOI: 10.1068/p5852