Attention: It's Not a Big Truck (What Does N-Back Training Actually Do?)

Klaus Oberauer has a fascinating paper from 2006 which seems to have been ignored by the cognitive training community. Oberauer demonstrates how improper counterbalancing, ignorance of the power law of practice, and confounds in the design of memory load tasks can substantially distort the apparent effects of training on performance. This work has implications for how we interpret training-related improvements in n-back performance, which several new websites now feature in the wake of a study showing that fluid intelligence is enhanced after n-back training.

In 2004, Verhaeghen et al. demonstrated that 10 hours of practice on a simple n-back task (where subjects must say whether a probe item is the same as or different from the one presented n items ago) allowed subjects to expand the number of items to which they could simultaneously attend.

Many theories posit that attention is the central bottleneck in human cognitive performance. It has been repeatedly demonstrated that the focus of attention is normally limited to just a single item (but see this discussion). Verhaeghen et al. believed practice on the n-back could alleviate this bottleneck for a few reasons:

1) Prior to practice, there is a large jump in the reaction time (RT) of subjects completing the "n>1"-back tasks relative to the 1-back task. However, RTs do not differ among the "n>1"-back tasks themselves. Verhaeghen et al. argued that this step function in RT between n=1 and n=2 reflects the time cost of switching attention from a single item to a content-addressable search of memory.

2) By the end of practice, this step function became linear: that is, each additional n up to 4 incurred a fixed additional cost, but there was no stepwise jump in reaction time at any point in that range.

3) Verhaeghen et al. also observed a nonlinear increase in RT as n increased from 4 to 5, thought to reflect a different bottleneck: the capacity of working memory. In other words, subjects suffered a reaction time cost because they could not maintain all of these items in memory.
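The contrast between points 1 and 2 is easier to see with the two RT profiles written out. The following is only an illustrative sketch: the function names and every millisecond value are hypothetical placeholders, not figures from either paper.

```python
def step_rt(n, base_ms=500.0, switch_cost_ms=250.0):
    """Pre-practice pattern: one jump above n = 1, then flat for all n > 1."""
    return base_ms if n == 1 else base_ms + switch_cost_ms


def linear_rt(n, base_ms=400.0, slope_ms=60.0):
    """Post-practice pattern: each additional n adds the same fixed cost."""
    return base_ms + slope_ms * (n - 1)


def increments(rt_fn, max_n=4):
    """RT difference between each successive pair of n values, up to max_n."""
    return [rt_fn(n + 1) - rt_fn(n) for n in range(1, max_n)]


print(increments(step_rt))    # a single jump followed by zeros
print(increments(linear_rt))  # the same increment at every step
```

A step function shows up as one large increment followed by zeros; a linear function shows up as equal increments throughout.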

In his 2006 paper, Oberauer demonstrates with three experiments that these data and conclusions are faulty.

The primary reason is that Verhaeghen et al. had included no baseline for assessing attentional switch costs for n>1. In their paradigm, trials where n>1 always require switching attention among the multiple representations held in memory. To assess attentional switching independent of general set size effects, Oberauer included a proper baseline for each value of n. Attentional switch costs no longer disappeared with training, indicating the focus of attention had not been expanded.
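The logic of a per-n baseline can be sketched as a simple difference score: at each set size, compare trials that require an attention switch with trials that do not. This is a minimal sketch under assumptions. The trial format, the `switch_costs` helper, and the RT values are hypothetical placeholders, and this is not Oberauer's actual analysis.

```python
from statistics import mean


def switch_costs(trials):
    """Return {n: mean switch RT minus mean no-switch RT} per set size.

    `trials` is an iterable of (n, is_switch, rt_ms) tuples (assumed format).
    """
    by_cell = {}
    for n, is_switch, rt_ms in trials:
        by_cell.setdefault((n, is_switch), []).append(rt_ms)
    costs = {}
    for n in sorted({n for n, _ in by_cell}):
        # A cost is only defined where both trial types exist at this n.
        if (n, True) in by_cell and (n, False) in by_cell:
            costs[n] = mean(by_cell[(n, True)]) - mean(by_cell[(n, False)])
    return costs


# Placeholder RTs for illustration only; not data from any study.
trials = [
    (2, False, 600), (2, False, 620), (2, True, 700), (2, True, 720),
    (3, False, 650), (3, False, 670), (3, True, 800), (3, True, 820),
]
print(switch_costs(trials))
```

Without no-switch trials at a given n, the helper returns nothing for that n, which is the situation described above: the switch cost is confounded with the general effect of set size and cannot be estimated separately.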

Further, Oberauer demonstrated that the switch to a linear function of RT over n may partially reflect faulty counterbalancing. The counterbalancing scheme used by Verhaeghen et al. would interact with the long-established nonlinear power law of practice, such that training-related improvements which appear selective to certain set sizes might actually reflect general improvements in speed. Using a Latin square design more robust to nonlinear order effects, Oberauer observed a linear increase in reaction time with n even among subjects who had undergone no training. This is expected of attentional switch costs if, for 1n
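A small simulation shows how an order confound can masquerade as a selective training effect. It is a sketch under stated assumptions: the power-law parameters, the block length, and the cyclic Latin square are all illustrative choices, not the designs actually used in either study. By construction, every condition follows the same practice curve, so any difference in pre-to-post gain comes purely from where a condition sits in the testing order.

```python
def practice_rt(trial, asymptote=400.0, gain=600.0, rate=0.5):
    """Power law of practice: RT falls as a power function of trial number."""
    return asymptote + gain * trial ** (-rate)


def latin_square(k):
    """Cyclic k x k Latin square: row i is the condition order for subject i."""
    return [[(row + col) % k for col in range(k)] for row in range(k)]


def block_mean_rt(session, position, k, block):
    """Mean RT over the block at `position` within `session` (both 0-based)."""
    first = (session * k + position) * block + 1
    return sum(practice_rt(t) for t in range(first, first + block)) / block


def mean_gains(orders, block=20):
    """Pre-test minus post-test RT per condition, averaged over subjects."""
    k = len(orders[0])
    gains = [0.0] * k
    for order in orders:
        for position, condition in enumerate(order):
            pre = block_mean_rt(0, position, k, block)
            post = block_mean_rt(1, position, k, block)
            gains[condition] += (pre - post) / len(orders)
    return gains


# Conditions 0..3 stand in for set sizes n = 1..4.
fixed = mean_gains([[0, 1, 2, 3]] * 4)   # every subject gets the same order
balanced = mean_gains(latin_square(4))   # each condition in each position once
print([round(g, 1) for g in fixed])
print([round(g, 1) for g in balanced])
```

With a fixed order, the condition tested first sits on the steepest part of the practice curve and appears to improve the most, even though nothing about it was selectively trained. With the Latin square, each condition occupies each position equally often, so the average gains come out equal. A cyclic square balances position only, not carryover between specific conditions.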

Finally, Oberauer ran a third experiment suggesting that the apparent RT step function between n=1 and n=2 might reflect a strategy for balancing speed and accuracy peculiar to those conditions, in which all probes are rejected if they match any item stored in memory.

In summary, the reported expansion of the focus of attention as a result of n-back training actually reflects general speed improvements (as described by the power law of practice), in combination with speed-accuracy tradeoffs, and not an elimination of attentional switch costs with training.

The results imply that the attentional switch costs in n-back cannot be eliminated, at least with this level of practice. These switch costs are not simply a matter of how much attention can hold ("it's not a big truck") but rather of how the efficiency of attentional refocusing scales inversely with the number of items being selected among.

More broadly, the results demonstrate a lack of reliability in differences between pre- and post-test scores on non-randomized and imperfectly counterbalanced criterion tests. In other words, nonlinear improvements in speed as a function of practice can yield apparently selective improvements on tasks where the trial order is not perfectly counterbalanced.

old post, but i just wanted to say this is a really good write up.