Cognitive Daily

[Image: exaggerate1.jpg]

Disney’s purchase of Pixar makes it clear that computer-generated (CGI) animation is the wave of the future in movies. But one difficulty with CGI animation is conveying realistic emotions. While film animators (whether they use computers or not) can use artistic license to achieve the desired effects, when “emotions” are generated exclusively by computer, it can be difficult to identify the key factors in conveying that emotion.

We’ve discussed avatars, for example, as one way that computers can automate human interaction. Artificial intelligence (lifelike simulators of human responses) will also need to mimic emotions convincingly in order to interact effectively with real people. Harold Hill, Nikolaus Troje, and Alan Johnston have investigated two aspects of how CGI animations can effectively express emotion.

Hill’s team had ten volunteers say four short sentences (e.g. “What are you doing here?”) while expressing four different emotions (angry, happy, sad, and neutral). By filming from eight different angles and using reflective markers on their faces (see the photo above), they could generate three-dimensional computer models of each person. These models were then averaged together to create a “grand average” for each sentence spoken. Click on the link below to see the grand average animation for “I’m almost finished.”

1. Average of all emotions and faces

Next, average faces were computed for each individual emotion. Then several different levels of exaggeration were created in two different ways. The first type of exaggeration compared the position of the reference points on each animation to the grand average. For example, in a happy expression, the corners of the mouth are raised slightly compared to a neutral expression. The 2X exaggeration would double this distance, while the 0.5X exaggeration would cut this distance in half. Here’s a sample of the different levels of exaggeration used for the “happy” version of “look at that picture.”

2. Exaggerating the facial expression
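
To make the arithmetic behind this first type of exaggeration concrete, here is a minimal sketch in Python. It assumes each face is stored as a list of 3-D marker coordinates and that exaggeration simply scales each marker's offset from the grand average. The function name and the sample numbers are hypothetical illustrations, not the code or data Hill's team used.

```python
def exaggerate_positions(expression, grand_average, factor):
    """Scale each marker's offset from the grand average by `factor`.

    factor = 1.0 leaves the expression unchanged, 2.0 doubles every
    offset, and 0.5 halves it.
    """
    return [
        tuple(a + factor * (e - a) for e, a in zip(marker, avg_marker))
        for marker, avg_marker in zip(expression, grand_average)
    ]


# Hypothetical (x, y, z) positions for the two mouth-corner markers.
# These numbers are made up for illustration only.
grand_average = [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
happy = [(-2.0, 0.5, 0.0), (2.0, 0.5, 0.0)]  # corners raised by 0.5

doubled = exaggerate_positions(happy, grand_average, 2.0)
halved = exaggerate_positions(happy, grand_average, 0.5)

print(doubled)  # corners now raised by 1.0
print(halved)   # corners now raised by 0.25
```

In a full animation the same scaling would presumably be applied to every marker in every frame, so the face moves along the same path but travels farther from (or stays closer to) the average.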

The second type of exaggeration didn’t change the expression, but instead modified the amount of time between each peak in the movement of the mouth. So slow movements were made slower, and quick movements were made quicker. Here’s an example of this type of exaggeration for the “sad” version of “look at that picture.”

3. Exaggerating the timing
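
The timing exaggeration can be sketched in a similar way. The version below assumes that each animation is summarized by the times at which the mouth movement peaks, and that the interval between neighboring peaks is compared with the corresponding interval in the grand average. The function name, the peak times, and the interval-by-interval approach are all assumptions made for illustration; the paper's actual procedure may differ.

```python
def exaggerate_timing(peak_times, average_peak_times, factor):
    """Scale each between-peak interval's deviation from the average interval.

    Intervals longer than average grow longer and intervals shorter than
    average grow shorter when factor > 1.
    """
    new_times = [peak_times[0]]
    for i in range(1, len(peak_times)):
        interval = peak_times[i] - peak_times[i - 1]
        avg_interval = average_peak_times[i] - average_peak_times[i - 1]
        new_interval = avg_interval + factor * (interval - avg_interval)
        new_times.append(new_times[-1] + new_interval)
    return new_times


# Hypothetical peak times in seconds (made-up numbers).
average_peaks = [0.0, 0.5, 1.0]   # grand average: two 0.5 s intervals
sad_peaks = [0.0, 0.75, 1.125]    # one slow interval, then one quick interval

exaggerated = exaggerate_timing(sad_peaks, average_peaks, 2.0)
print(exaggerated)  # the slow interval is now 1.0 s, the quick one 0.25 s
```

Note that the facial positions at each peak are untouched here; only the spacing of the peaks changes. A large enough factor could push a quick interval below zero, so a real implementation would need to guard against that.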

Next, a new set of volunteers evaluated each of these animations (a total of 112 different sequences, with no sound), rating them on a scale of 1 to 9 for how well they conveyed “angry,” “happy,” and “sad.” Here’s how the ratings broke down for one of these emotions, happiness:


Notice first that at an exaggeration of 1, the ratings didn’t differ significantly — this represents the same animation, only viewed by a different group. When the movement of the face was exaggerated, however, the ratings became increasingly intense for each level of exaggeration. When the timing of the movement was exaggerated, there was no significant difference between the different animations. Similar results were found for anger and sadness. So for these emotions, the position of the face matters more than the timing of the changes in facial expressions. These results held even when the animations were played in reverse!

A control experiment compared the original videos to the computer animations. The unexaggerated animations generally did not convey the intended emotion as effectively as the videos, probably because certain aspects of the facial expression, such as eyes narrowing in anger, were not translated to the computerized versions. The exaggerated versions, however, were always at least as effective as the original videos, and often more effective.

Hill et al. argue that these exaggerations of facial movement are like caricatures. Much as caricatures often produce more recognizable images of people than photos, so do exaggerated animations produce more effective expressions of emotion. The key is to exaggerate along the dimension of movement, rather than timing.

Hill, H.C.H., Troje, N.F., & Johnston, A. (2005). Range- and domain-specific exaggeration of facial speech. Journal of Vision, 5, 793-807.


  1. #1 Harold Hill
    January 26, 2006

    Hi all, and thanks again for your interest in the paper. Just wanted to make the point that we are not saying that timing is unimportant, just that the method used does not capture timing for faces. It works well for recognizing people from the way that they drink! Please see for that and other closely related research.
    PS Please remove Z for mail

  2. #2 Nadia Alaskari
    July 1, 2009

    Hello, I’m a final-year student doing a 3D animation; my focus is on conveying emotions through a 3D character.
    My research areas are
    comparison of 2D and 3D examples of mood change
    and more. Can anyone help with a good place to do research, good sites or journals, anything… Thank you.
