What can you remember in a glimpse?

The text below will bring up an animation. Just look at it once -- no cheating! A picture will flash for about a quarter of a second, followed by a color pattern for a quarter second. Then the screen will go blank for about one second, and four objects will appear. Use the poll below to indicate which object (#1, 2, 3, or 4) appeared in the picture.

Click here to view the animation!

I'll let you know which answer was correct at the end of the post, but this test approximates the procedure of an experiment conducted by Kristine Liu and Yuhong Jiang, designed to measure the capacity of visual working memory. If we get enough responses, we can see if our results match theirs.

Previous studies have had conflicting results, with some indicating we can remember a large number of details of a briefly presented scene, and others suggesting that we don't notice differences between scenes even when we look at them for several seconds (see this Cognitive Daily article for one example).

Liu and Jiang noticed that one problem with some studies is that they asked participants to recall the names of objects, or when they asked participants to choose among possible objects, it was easy to guess. Liu and Jiang avoided these problems by including two entirely different sets of objects in each scene they used. In their first experiment, viewers briefly glimpsed one scene with 10 different objects in it, then had to choose the objects that were in the scene from a set of 20 objects. The 10 "incorrect" answers were the correct answers for the other scene. Half of the viewers saw each scene, and the results were combined:


Responses were better than chance for only the first object selected -- just 75 percent accuracy. Apparently a brief glimpse of a scene isn't enough to introduce many details into working memory.

But perhaps viewers simply forgot the other objects while they were working on the choosing task. To address this issue, a new experiment was designed corresponding to the above animation. A scene was flashed for 250 milliseconds, then viewers had to choose between four possible objects. One object was correct, one was a different object of the same type, and the other two were objects of a different type (though still plausible for the scene -- and as before, one of these objects was used in a second version of the scene). A second group of viewers was allowed to view the image as long as they desired before moving on to the memory quiz. In addition, some of the time viewers saw the scenes with a background context (as in the animation above), and sometimes they were just shown isolated objects on a white background. Here are the results:


When images were briefly flashed in context, accuracy was just barely better than the 25 percent chance rate. Accuracy improved noticeably when viewers saw the objects on the white background. Those who had unlimited viewing time (they viewed the scenes an average of 13.7 seconds) were very accurate, and they, too, improved when they saw the objects on their own.

Liu and Jiang used these results to calculate how many objects, on average, were retained in visual working memory. For the display with context, people can remember about 0.67 of 10 objects when briefly displayed, and 5.33 out of 10 when they have unlimited time. When the objects are shown without a background, these numbers jump to 2.1 and 7.41.
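To see how an accuracy score can be turned into a "number of objects remembered," here is a minimal sketch in Python. It assumes a simple guessing correction: viewers either remember the probed object and answer correctly, or remember nothing about it and guess evenly among the four choices. That model, and the function names, are my own assumptions for illustration; Liu and Jiang's actual calculation may differ in its details.

```python
def items_retained(accuracy, n_choices=4, set_size=10):
    """Estimate how many of `set_size` objects were remembered.

    Assumed model (not necessarily the one used in the paper):
    accuracy = p_known + (1 - p_known) * chance, with chance = 1 / n_choices.
    """
    chance = 1.0 / n_choices
    p_known = (accuracy - chance) / (1.0 - chance)
    p_known = max(0.0, p_known)  # accuracy below chance counts as nothing remembered
    return p_known * set_size


def expected_accuracy(k, n_choices=4, set_size=10):
    """Inverse of items_retained: accuracy predicted if k objects are remembered."""
    chance = 1.0 / n_choices
    p_known = k / set_size
    return p_known + (1.0 - p_known) * chance


if __name__ == "__main__":
    # The four capacity estimates reported in the post.
    for label, k in [("brief, in context", 0.67),
                     ("unlimited, in context", 5.33),
                     ("brief, isolated", 2.1),
                     ("unlimited, isolated", 7.41)]:
        print(f"{label}: predicted accuracy {expected_accuracy(k):.2f}")
```

Under this model, an accuracy of 30 percent on a four-choice test works out to roughly two-thirds of one object out of 10, which is in line with "just barely better than chance."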

Liu and Jiang argue that their results demonstrate that there are at least two ways our visual system perceives the world: a "fast" route which uncovers the gist of a scene and a "slow" route through which we can process specific details.

So were we able to replicate their experiment? See for yourself -- the correct answer to the poll is 2.

Liu, K., & Jiang, Y. (2005). Visual working memory for briefly presented scenes. Journal of Vision, 5, 650-658.

Considering I'm running on very little sleep, I'm totally impressed that I got that right. Very cool! :)

I would like to see a detailed piece such as this on the differences between how men and women look at a scene. ("It's right in front of you, dear.")

I have a couple of thoughts as to why we're not replicating the results. Anyone want to take a stab at it before I chime in?

Also, I tried a different poll format this time (see this post for another format). The new format was due to complaints that it wasn't working in Safari. I think this format is more reliable, but much less cool. However, I've now gotten the old format to work in both Safari 1.5 and 2.0, so I'd appreciate any feedback on a preferred poll format.

I didn't know for sure, but I really had a strong leaning/feeling? to pick 2, so I did. Either I did actually see it, or I'm a really good intuitional guesser...

I also picked 2 for similar reasons to Jenna, but it was color that made the strongest impression on me.

I just was pretty sure I saw a tall object, so I picked 2.

By Christian (not verified) on 16 May 2006

I didn't even get my eyes focused on the scene, much less recognize anything. The switch from blinding white to dark colors was so abrupt, I was dilating and adjusting for the entire quarter second. I tried watching it again, and I STILL couldn't make out #2 even though I had been told it was there.

Since I have very bad eyes, I suppose visual memory studies are just not for me. Either that, or all the grading has made me too tired to function.

I wanted to pick either #1 or #2. In the end I picked #1 - but I believe now (postrationalization alert!) that it's because it fit the overall colorscheme of the scene better.

A couple of comments:

1. The group with unlimited viewing in context was only 70% correct in a recognition task? Does that seem a little low to anyone else?

2. These results are probably specific to the task. In my dissertation, I found that participants were extremely accurate at determining category membership of objects in complex scenes (black and white photos) after only a 6 ms duration, followed by a noise mask. This result, and others from the ultra-rapid visual categorization literature, are consistent with the interpretation that there are two "channels" that process visual information, giving us a quick and dirty representation followed by a more detailed, "slow" one. This ties in nicely with what we know about the M and P visual pathways as well.

Man. I was sure it was 2. But then I decided that I only thought that because it was the same color as the dominant color of the photograph (which I took to be wooden cabinets in a kitchen) and selected something else instead.

"1. The group with unlimited viewing in context was only 70% correct in a recognition task? Does that seem a little low to anyone else?"

If you take a look at their choices, some of the objects are amazingly similar. Here's a link to the article. Take a look at Figure 3: respondents had to choose between two nearly identical Listerine bottles. I think this is one reason we didn't replicate the task (our respondents are about 40 percent accurate) -- our distractors are too different from the depicted objects.

And according to Figure 1, it's appropriate to store a bottle of red wine in the fridge! Pah!

Wow, that was neat. When I saw the group of four, my eyes immediately went to 2, then I started second guessing myself. Looking at them closely, I realized I had no idea, so I decided I might as well pick 2!

Wow, that's cool! I looked at the screen carefully but it was too quick for me to catch any image. So when I was choosing out of 4 objects I was only guessing. And when I chose the second object and looked at the results, I was amazed to see that so many more people actually chose object number 2, so it must be the right answer. It was so weird how I was inclined to choose number 2 even though I thought I didn't remember the image at all!

Just wanted to drop in to say I picked #2, and I was totally right... but that's because I'm on dialup and I got to look at the picture for about 15 seconds. I think you need to warn dialuppers to let it load without looking, then refresh it!

Cool. I picked #2 as well, but it felt more like guessing than remembering.

It reminds me of this time I was walking down a street, glancing at a front door and the nametags next to it. I was sure I read a familiar name, although I was sure too that I hadn't had time to really read it... so I walked back and it was there, all right.

That's cool, I picked #2 but it also felt like a guess, but looking at the results it seems that most people "guessed" right. Is this something to do with the ~0.5s delay in consciousness?

I picked #2. I thought I recognized the color, but that could have been a rationalization. It might have been something else that tipped me off but it was below the conscious level. Then there's pure luck.

I picked #2 based on color.

I picked #2 because it's actually a fairly unique looking object. The scene gave me the impression of some kind of Turkish kitchen, and so did object #2. Plus I think I remembered the shape.
I wonder how the results would appear if the scene contained items more 'mundane' to an American audience, for instance?

Yes, number 2. Part intuition, part light and shadow: It's the only one which actually looks like a cutout from the photo.
By the way: If there was supposed to be a way to input an answer, it didn't work on this particular Firefox.

As is presented in this animation, I would expect there to be big variability in the responses and not much accuracy. The presented scene is complex and the time is very short. My subjective feeling is I barely had a chance to register anything. I saw perhaps a blender, and a countertop or such, but no confidence in that.

Performance in this task is affected by the randomness of saccadic movements of the eyes, initial fixation point (there is no fixation crosshairs here), whether there was a blink to throw out that trial, object recognition times, perceptual memory duration, working memory for multiple objects, fine discrimination of objects that happened to fall outside of foveal vision, etc. It relies on too many processes.

Famous faces confined to a small radius around a fixation crosshairs and shown in isolation, even at 250 ms, would give results that are less variable and more robustly above chance.

I almost immediately went to number 2, then realised that I didn't remember seeing it in the picture...I'd just focused long enough to get an impression of the type of image, and to focus more strongly on the second image shown.

Having not seen any of the others, though, I decided to pick 2 for the heck of it.