Cognitive Daily

You may have heard of the idea that people can only remember seven things at a time: a seven-digit phone number, a license plate, and so on. While the size of working memory actually varies from person to person (it usually ranges from 6 to 8 items), and while people can use strategies like “chunking” to remember more, this observation is basically true.

Except when it’s not true. In the 1970s, researchers found that there are actually at least two distinct areas of working memory, each with its own separate capacity. One is called the “phonological loop” and is used for recall of sounds, and the other is the “visuospatial sketchpad” and is used for visual and spatial information. If a person’s working memory capacity for sounds is maxed out at 7, she can still retain more in her working memory, as long as it’s something visual, like a shape or the location of an object.

But sounds also have locations in space, and we can easily locate a sound based on where the noise comes from. So what do we do with these memories? When we remember the location of sounds, do we use the same area of working memory previously thought to be only for visual images? Do we use the phonological loop? Or is there an entirely different resource we use to recall the location of sounds?

Günther Lehnert and Hubert Zimmer showed volunteers pictures of 30 different objects in groups of four, six, or eight. The pictures were displayed one at a time, for two seconds each, in a random corner of a large screen. Then the viewers were tested on the location of the objects they had just seen. The objects were chosen to be things that made distinctive sounds: animals, tools, or musical instruments. The same viewers also listened to the sounds each object made, played through speakers located in each corner of the screen, and were again tested on where the sounds came from. Finally, they were tested on sets that mixed visual and audio versions of the objects.

If there are separate areas of working memory for audio and visual locations, then viewers should be able to remember more locations when the auditory and visual items are mixed compared to when they are shown separately. Amazingly, they were, although the effect was small: 73 percent accuracy for the mixed sets versus 69 percent for the pure visual and auditory sets. Could there really be a separate area of working memory just devoted to audio location? A closer look at the data reveals a curious pattern:


The graph shows the accuracy for auditory and visual location memory depending on how many items the viewers had to remember. While overall visual memory was significantly better than auditory memory, the key to this graph is the difference between pure presentations (only audio or only visual locations tested) and mixed presentations. For 6 or 8 items, there was no difference. Only when there were just 4 items to be remembered was the auditory location memory better in a mixed presentation. If there were really a separate working memory for auditory locations, you would expect to see an improvement for a mixture of audio and visual items in all three conditions.

Lehnert and Zimmer suspected something else was going on here. If you’re only memorizing four items, and your memory for visual items is better than auditory items, then once you’ve eliminated the two corners where the visual items appeared, there are only two possible corners where a sound may have played. The visual memory makes it easier to guess where the sound was.
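
To see why guessing alone could produce this pattern, here is a rough sketch in Python. It is not Lehnert and Zimmer's analysis; it assumes a deliberately simple model in which a viewer either recalls a sound's corner outright (with a made-up probability `p_remember`) or guesses uniformly among the corners still in play. The function name and the value 1/2 are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def expected_accuracy(p_remember, candidate_corners):
    """Expected accuracy under a simple recall-or-guess model.

    With probability p_remember the corner is recalled correctly;
    otherwise the viewer guesses uniformly among candidate_corners.
    """
    return p_remember + (1 - p_remember) / candidate_corners


# Hypothetical recall probability for sound locations (not from the study).
p_remember = Fraction(1, 2)

# Pure auditory set of four: any of the four corners is a possible guess.
pure = expected_accuracy(p_remember, 4)

# Mixed set of four: assume the two visual corners are remembered and
# ruled out, leaving only two corners to guess between.
mixed = expected_accuracy(p_remember, 2)

print(f"pure: {pure}, mixed: {mixed}")
```

Under these assumptions, the same underlying memory for sounds yields higher measured accuracy in the mixed condition simply because the guesses are better, which is the kind of confound a follow-up experiment would need to rule out.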

So they repeated the experiment, but this time they used only sets of four items. In one condition, the first experiment was repeated exactly, and as before, auditory memory was better when audio and visual objects were combined. In another condition, however, the researchers eliminated the memory advantage of the visual objects by crossing out two possible corners when viewers were being tested. Now two corners were eliminated whether the set was pure or mixed, so viewers should gain no guessing advantage from a mixed presentation. Here are the results:


This time there is no advantage for the mixed condition, whether viewers were remembering locations of visual objects or sounds. So this is convincing evidence that there is no separate memory function for locations of objects, and it’s quite likely that the same function is being tapped whether we’re remembering the location of a visual object or a sound.

Lehnert, Günther, & Zimmer, Hubert D. (2006). Auditory and visual spatial working memory. Memory & Cognition, 34 (5), 1080-1090.


  1. #1 Luci
    September 3, 2008

    Out of the lab and into everyday life, the experience of mutual cooperation between sound and image for mnemonic enhancement seems obvious. Context would also seem to be highly relevant here – as we get instant prompts for more familiar objects on the one hand, and the surprise impact of novelty on the other. Working memory uses both. The graphs are interesting – as if sound location mattered less than visual location. What happens when the sounds chosen are more contextually stimulating than the images? Is the location of a siren or a scream targeted as a ‘remember this’ object vs. a purring cat or a brief melody?

    Am I wrong in remembering that auditory and visual memory both hang out in the right temporal lobe area?

  2. #2 Ian Tindale
    September 3, 2008

    Could this be related to that?

  3. #3 aiqin
    September 9, 2008

    There is no separate memory function for locations of objects, and it’s quite likely that the association between visual and auditory memories is made consciously by using mnemonics.
    A visual mnemonic might be an image of something that looks or sounds like the thing being memorized. When you hear an old, behind-the-times song, you can recall the place, environment, people, activity… so the memories all consist of both auditory and visual elements.