Does test-taking help students learn?

During my brief tenure as a high school teacher, one common suggestion I got from supportive colleagues was to "make your tests teaching tools." "That's often the only time you've really got your students' attention," they suggested, "so don't neglect the opportunity to teach them something."

What they meant is that you shouldn't use misleading or false information in tests as a "trick" to make sure they grasp the material: your test might be the only thing students remember from a unit.

But there's another reason testing is important for learning. For decades researchers have known that more is often learned during testing than during traditional "learning." If, for example, students must learn 20 spelling words for a test, in many situations they'll remember the 10 words they were *actually* tested on better than the others.

If I quiz Jim on his Spanish vocabulary words every day, he does better on tests than if he studies on his own. This might be more of a reflection of the quality of his study time than a testing effect, but it still demonstrates the power of testing in aiding learning.

But how exactly does the testing effect work?

One hypothesis, supported by several studies, suggests that we learn from the specific cues in the practice test. If the final test uses similar cues, then we'll do better. If the practice test is multiple choice, then we'll do better on a multiple-choice final than a fill-in-the-blank test.

But one experiment from a 1989 study by John Glover found a different result. When the practice test was a free-retrieval test (participants were asked to recall as many ideas as possible from a reading passage), they did better on the final, no matter what type of test they were given.

Recently, Shana Carpenter and Edward DeLosh took a more systematic look at this phenomenon. They asked psychology students to try to memorize sets of 8 words by studying them one at a time for 3 seconds each. After being distracted by a brief math problem, they were tested in one of three ways, or given a chance to study the words again. After repeating this 12 times, they were tested again on the entire set of 96 words. Here are the results:


As before, in the free recall test, students just listed as many words as they could on a blank sheet of paper. In the cued recall test, the students saw the first letter of the word and had to fill in the rest. In the recognition test, they had to circle the correct 8 words from a list of 16. As you can see, no type of practice test -- including recognition -- led to significant improvement on the recognition test final. For both free recall and cued recall finals, however, the free recall practice test offered the best results (although it was no better than the control group on the free recall final). Though the results aren't crystal clear, they certainly don't support the notion that taking a similar sort of practice test always leads to better results on the final exam.

Instead, Carpenter and DeLosh speculate that more elaborate retrieval processes during a practice test lead to better results on a final test. In other words, the more a test-taker relies on her own wits to generate the answers for a practice test, the better she'll do on the final. To test this notion, they developed a new experiment. As before, students memorized sets of 8 words, but this time, participants were given 1-, 2-, 3-, or 4-letter cues during the practice test. (A one-letter cue for "cognition" would be "C _ _ _ _ _ _ _ _", while a four-letter cue would be "C O G N _ _ _ _ _".) During the final, all the students simply wrote down as many of the words as they could remember. Here are the results:
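For anyone who wants to build practice quizzes along these lines, here is a minimal Python sketch of how such letter cues could be generated. The function name `make_cue` and its parameters are my own invention for illustration; this is not the software Carpenter and DeLosh used, just one way to reproduce the cue format described above.

```python
def make_cue(word: str, n_letters: int) -> str:
    """Show the first n_letters of word and blank out the rest.

    Hypothetical helper: it only mimics the cue format described in the
    post (uppercase letters and underscores separated by spaces).
    """
    if not 1 <= n_letters <= len(word):
        raise ValueError("n_letters must be between 1 and the word length")
    shown = list(word[:n_letters].upper())       # revealed letters
    hidden = ["_"] * (len(word) - n_letters)     # one blank per hidden letter
    return " ".join(shown + hidden)


if __name__ == "__main__":
    # Print the four cue levels for the example word from the post.
    for n in (1, 2, 3, 4):
        print(make_cue("cognition", n))
```

If the study's pattern holds, the one-letter version would be the one to use for practice, since it leaves the most work to the learner.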


The results are statistically significant: the fewer the letters in a cue, the better the score on the final test. Carpenter and DeLosh argue that these results support the notion that more elaborate retrieval processes during practice tests lead to better results on final tests. Practice tests need not duplicate the format of the final test. Instead, practice tests should require as much effort as possible from the test taker. If the goal is long-term retention, final tests should also be in a free-recall format rather than, say, choosing from a list of possible answers.

There are some limitations to this study. Each experiment was administered over a period of about an hour -- the results might not hold over the long term. That said, the Glover study forming the basis for this new study was done over four days, so Carpenter and DeLosh can also be said to have expanded the basis for their conclusions.

These results also gel nicely with the anecdotes I've heard from teachers. And they make it clear that I'll probably be Jim's Spanish quizmaster for a long time to come.

Carpenter, S.K., DeLosh, E.L. (2006). Impoverished cue support enhances subsequent retention: Support for the elaborative retrieval explanation of the testing effect. Memory & Cognition, 34(2), 268-276.



Did the students know what type of test they'd be taking?

I know when I was an undergrad, I would study differently for essay tests, for example, than I'd study for tests that were multiple choice.


Good question; no they didn't. It would be interesting to repeat the study with this manipulation. But still, the fact that the practice test doesn't need to match the final test is interesting. Perhaps we only *think* it's important to know the type of test we're studying for.

I think it's interesting that the control group had better results on both the cued and free recall finals than the recognition group (and better results on cued recall than the cued recall practice group!). This reminds me of a midterm exam I created for a graduate course last year. The students asked for a practice exam, so I haphazardly threw one together that ended up being way too easy. I think this set the wrong tone for them, and they ended up doing very poorly on the actual exam (or maybe the exam was just too tough :-p). This year I resisted the call to create a practice exam for the midterm, though.

Roediger and Karpicke (2006) have a review article on this so-called 'testing effect'. Personally, I don't find the testing effect particularly 'surprising' (as the authors of the article describe it) given that it seems to be the equivalent of saying (all things being equal), assigning extra homework/tests, etc. will result in better performance than not assigning any of these things prior to a major test. Common sense alone would suggest more practice is better than less practice.

The novelty of this research really seems to be in the identification of effective types of homework and testing. However, Craik and Lockhart's (1972) levels of processing framework already seems to make suggestions as to which ones will be effective. Namely, they concluded that there are two basic strategies for encoding information: maintenance rehearsal and elaborative rehearsal. Elaborative rehearsal is better because it allows for encoding the meaning of information, while maintenance rehearsal results in shallow learning because it focuses on sensory and perceptual aspects of information. So, anything that shows learning benefits is likely the result of the activation of elaborative rehearsal strategies.


Roediger, H.L., & Karpicke, J.D. (2006). The power of testing memory: Basic research and implications for educational practice. Perspectives on Psychological Science, 1(3), 181-210.

By Tony Jeremiah on 03 Jan 2008

Indeed this isn't surprising. Let me rephrase: "Students tend to recall things when asked to recall things. Students tend not to recall things they were not asked to recall."

Actually, I've found quite a few ways to learn by myself as well as from recall help from an instructor.

I learned from the way he taught me to study, and used what I learned very successfully in a totally different course! Practicing and recalling in environments outside school (like the shower, for example) eventually made me able to just roll the information right out.

What he did was ask me a lot of questions on the subject, and eventually I knew what he was looking for in terms of an answer. He made me do the questions on the spot, off the top of my head -- it was a little daunting at first, but then I just took right to it. This eventually alleviated quite a bit of test stress, which was something I was coping with as well. Of course, not everything he asked was on the test, and some things he didn't ask were... but it made me realize what he was looking for in the answers and helped me study for the important points and the general idea, with a little detail.

Just recently I've helped a few fellow students taking the course I once did so well in, and taught them the same way. I still know quite a lot of the information that was taught to me. It worked extremely well, and I will be using the same techniques.

I can't see the results when the final test was recognition - although you do say they were not significant. Am I missing something?


By Mark Tyrrell Frank on 03 Jan 2008

This is really interesting. I will keep it in mind when I am given tests at school. Since I am only 17, this gives me some motivation to actually revise for my AS levels, I hope.

When I was teaching (university, operations research), my most useful classroom ploy for quick one-question tests on assigned reading was to have students pair up and work out the answer jointly. I believe that the active discussion between the students helped to register the knowledge.

By Judith S. Liebman on 04 Jan 2008

I believe that this works under the conditions that you've stated; however, I think that the study you cited has more to do with the way people learn and read words as opposed to any other type of information. I used to be more well-versed in this topic, but am out of practice now. Can anyone corroborate?

I think it's interesting that the control group had better results on both the cued and free recall finals than the recognition group. It is also interesting to ask whether taking tests really makes kids learn. I don't really think the test itself makes the kids learn; it is more of a motivation for making them learn. If there were no grades, then no one would try. Tests are not a way to learn, but more of a reason to learn. Tests might very well not be helpful, but until they are taken away students will have trouble overcoming them.

The comment about trick questions being counterproductive for learning is interesting. Do you know of any research specifically on this? I would love to show it to a certain instructor I know.