Multiple object tracking: How pilots can fly, and how we can possibly play Asteroids

[Image: tracking1.jpg]

Last year, my dad got his pilot's license. He took me up with him a couple of months later, and while the view was spectacular, the most surprising aspect of flying was how much of a pilot's time is spent avoiding other aircraft. You might think there's plenty of room up there, and you'd be right, but that also means you have to scan a vast space to locate other planes. Once you spot one, you need to keep track of it to make sure you're not on a collision course. Sometimes you'll need to track four or more other planes. Is there a limit to how many objects we can track? And how, exactly, do we keep track?

The simplest explanation would use retinal coordinates. When we see something, it's because the image is focused on the retina at the back of the eye; the photoreceptors in the retina send signals to the brain, which processes the information. It might seem that the best way for the brain to track multiple objects is to keep a record of each object's location on the retina. That way, processing would need to occur only when an object moves.

Yet our eyes are constantly moving around, saccading from point to point, while the head turns back and forth, up and down. Perhaps instead, the brain tracks each object's position relative to a visual region (the scene containing the objects), regardless of where the eyes are pointed. That way, the brain wouldn't have to recompute every object's position each time we looked in a different direction.
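To make the two possibilities concrete, here's a tiny sketch (mine, not the researchers') of the bookkeeping each one implies. Under a retinal account, the stored position of an object changes every time the eyes move; under a scene-based account, it changes only when the object moves within its region. The coordinates below are invented for illustration.

```python
# Hypothetical sketch of the two coordinate schemes; all numbers are invented.

def retinal_position(object_in_world, gaze_point):
    """Retinal account: position is stored relative to where the eyes point,
    so every saccade changes the stored coordinates."""
    return (object_in_world[0] - gaze_point[0],
            object_in_world[1] - gaze_point[1])

def scene_position(object_in_world, region_origin):
    """Scene account: position is stored relative to the region containing
    the objects, so eye movements leave it unchanged."""
    return (object_in_world[0] - region_origin[0],
            object_in_world[1] - region_origin[1])

target = (10.0, 5.0)          # a stationary object in the world
region = (8.0, 3.0)           # origin of the box the objects move inside

for gaze in [(0.0, 0.0), (4.0, 1.0), (-2.0, 6.0)]:   # three saccades
    print("retinal:", retinal_position(target, gaze),
          "scene:", scene_position(target, region))

# The retinal coordinates change on every saccade; the scene coordinates stay
# (2.0, 2.0) until the object actually moves within its region.
```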

So how do we know which method the visual system uses to track objects? A team led by Geniva Liu has designed a very cool set of experiments to try to answer that question.

The basic multiple object tracking experiment looks like this:

[Animation: tracking2.gif]

You are shown a set of objects. Then some of the objects are highlighted with red circles. Your task is to follow only these objects when the entire set begins to move. At the end, one object is highlighted again, and you have to say whether it was one of the objects that was originally highlighted. You can click on the picture above to play the movie. This one, with the objects moving relatively slowly, is fairly easy.
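If it helps to see the structure of a trial spelled out, here's a toy version in Python (my own sketch with made-up parameters, not the software the researchers used): dots drift inside a box, a few are cued as targets, and at the end one dot is probed.

```python
import random

N_OBJECTS, N_TARGETS, N_FRAMES, BOX = 8, 4, 200, 20.0   # invented parameters

def run_trial(step=0.1):
    # Random starting positions inside the box.
    positions = [[random.uniform(0, BOX), random.uniform(0, BOX)]
                 for _ in range(N_OBJECTS)]
    # A few objects are briefly highlighted as the targets to track.
    targets = set(random.sample(range(N_OBJECTS), N_TARGETS))

    # The entire set begins to move (a simple random walk, clipped to the box).
    for _ in range(N_FRAMES):
        for p in positions:
            p[0] = min(BOX, max(0.0, p[0] + random.uniform(-step, step)))
            p[1] = min(BOX, max(0.0, p[1] + random.uniform(-step, step)))

    # At the end, one object is highlighted again; the observer must say
    # whether it was one of the original targets.
    probe = random.randrange(N_OBJECTS)
    return probe, probe in targets

probe, was_target = run_trial()
print("Probe object", probe, "was a target:", was_target)
```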

Using this test, Liu's team confirmed that most people can track up to six slowly moving objects with reasonable accuracy. When the speed of the objects is increased, accuracy decreases. But this still doesn't answer the initial question: which coordinate scheme does the visual system use to track the objects, retinal coordinates or relative position within a region? To begin answering that question, they developed a new experiment. This time, the entire region in which the objects are moving is taken on a "wild ride," moving and rotating rapidly across the display. Try this task:

[Animation: tracking3.gif]

The task seems incredibly difficult -- and it is, because the objects are moving 6 times faster within the region than in the first task, and because the video quality of this clip isn't as good as what the study participants experienced. In fact, the results of this experiment showed something remarkable:

[Graph: tracking accuracy for six objects, by object speed and wild-ride speed (tracking4.gif)]

This graph shows tracking accuracy for six objects moving either at a slow speed (1 deg/sec) or a fast speed (6 deg/sec). While participants were significantly more accurate tracking the slow objects than the fast ones, there was no significant difference in accuracy due to the speed of the wild ride. This is a strong indication that we're not tracking the objects using retinal coordinates alone: during a fast wild ride, even the objects in the "slow" condition are sometimes moving across the retina faster than the objects in the "fast," no-wild-ride condition. If we relied only on retinal processing, the fast wild ride should have been even more difficult than the "fast" condition with no wild ride.
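A rough way to see the force of that argument: if the eyes stay put, an object's velocity on the retina is approximately the velocity of the region plus the object's velocity within the region. With some invented numbers for the wild ride (the paper's actual values may differ), even the "slow" dots move across the retina much faster than the "fast" dots do when the region holds still.

```python
import math

def retinal_speed(region_velocity, within_region_velocity):
    """Approximate retinal speed (deg/sec) when the eyes are stationary:
    the region's motion and the object's motion within it simply add."""
    vx = region_velocity[0] + within_region_velocity[0]
    vy = region_velocity[1] + within_region_velocity[1]
    return math.hypot(vx, vy)

wild_ride = (15.0, 10.0)   # invented wild-ride velocity of the whole region
still = (0.0, 0.0)         # region not moving at all
slow_dot = (1.0, 0.0)      # "slow" condition: 1 deg/sec within the region
fast_dot = (6.0, 0.0)      # "fast" condition: 6 deg/sec within the region

print("slow dots during the wild ride:", round(retinal_speed(wild_ride, slow_dot), 1))  # ~18.9
print("fast dots with a still region: ", round(retinal_speed(still, fast_dot), 1))      # 6.0
```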

Liu's team conducted four additional experiments, each one tackling a different dimension of the problem. I'm not going to summarize them all here, but you can see demos of each experiment on their web site. The team's conclusion: we don't use retinal coordinates to track multiple objects; instead we recall the relative positions of objects within a region.

The team points out that this study can have important implications for aviation. New standards for air traffic controllers will require that pilots and controllers view air traffic from the same perspective on computer monitors in the plane and in the control tower. Their research suggests that as long as the display smoothly rotates from one perspective to another, pilots and controllers won't have any difficulty continuing to track airplanes in that space while the display changes.

Liu, G., Austen, E.L., Booth, K.S., Fisher, B.D., Argue, R., Rempel, M.I., & Enns, J.T. (2005). Multiple-object tracking is based on scene, not retinal, coordinates. Journal of Experimental Psychology: Human Perception and Performance, 31(2), 235-247.


I haven't read the original paper, but it looks like there's a trend towards better performance with more motion of the frame, exactly contrary to expectations. I wonder if that's because the rotation of the frame allows the objects to be tracked in 3D space instead of 2D space. In 3D space, objects would be farther from each other than they would be in 2D space, making the task of keeping them separate that much easier.

There is a trend that way, but just when 6 targets are being tracked. It's not significant, and when 2 or 4 targets are being tracked, it's not present, so I'd say it's just a coincidence.

Yeah, and after getting to a computer where I can see the animation, it doesn't look like the boxes are rotating much; mostly shrinking and moving, which doesn't help 3D perception at all.