Cognitive Daily

We’ve written before about how stereotypes can impair performance on math tests: for example, when women are told they are taking a math test for a study about gender differences in math ability, they perform more poorly than men. However, if they are first taught about how stereotypes can impair performance, their scores rise to equality with men.

But what about the other side of the stereotype spectrum? When people are expected to perform better due to a stereotype, how do those expectations affect performance? One possible answer is that they will perform even better. Another possibility is that they’ll choke under the pressure of living up to their reputed ability.

To try to differentiate between these two possibilities, Sapna Cheryan and Galen Bodenhausen tested three groups of Asian-American women, all college students who had indicated that it was important for them to do well in math. Each group took the same math test composed of questions from the Graduate Record Exam (the GRE—a standardized admissions test for U.S. graduate schools). However, before the exam, the groups were administered three different questionnaires, designed to focus participants on one of three aspects of their identity: ethnicity, gender, or individual identity. The questions, while not specifically invoking stereotypes such as the idea that Asian-Americans are better than average at math or that women perform poorly in math, certainly invited participants to invoke those impressions, with statements like “overall, my race is considered good by others.”

Here are the results:


Asian-American women who took the survey focusing on ethnicity performed significantly worse on the math test than those who took the individual-focus questionnaire. There was no difference between the individual survey group and the gender survey group.

Cheryan and Bodenhausen also found that the ethnicity group had a significantly lower ability to concentrate than the other groups in the study; in fact, this gap in concentration accounted for most of the difference in test performance between the ethnicity group and the other groups. Cheryan and Bodenhausen claim that Asian-Americans’ “model minority” status (as a minority group that fits in and even excels in a Caucasian-dominated society) often leads to overwhelming pressure to succeed. They cite research by Ho, Driscoll, and Loosbrock, which found that graders awarded fewer points on math problems to Asian-American students than to European-American students who gave the same answer. Since the penalty for failure appears to be larger for Asian-Americans, it’s no wonder that they have difficulty concentrating when their ethnicity is highlighted.

The impact of stereotypes is clearly complex: we’ve reported on positive, negative, and neutral effects (as in the case of gender here). Perhaps this experiment’s findings on Asian-American women won’t be replicated with other groups. What’s certain is that stereotypes do have an important impact on performance. Perhaps the most important reminder this study offers is not to put too much faith in the results of a single test, whether it purports to measure math ability, IQ, or some other skill.

Cheryan, S., & Bodenhausen, G.V. (2000). When positive stereotypes threaten intellectual performance: The psychological hazards of “model minority” status. Psychological Science, 11(5), 399-402.


  1. #1 Scott Reynen
    December 16, 2005

    “There was no difference between the individual survey group and the gender survey group.”

    It appears to me that the two bars have different heights in the graph, which I think most people would read as a difference. Do you mean there was no /significant/ difference, or does the word “difference” mean something other than the colloquial in this context?

    I hate to be the reader who always points out problems. In general, I love this site. But every time I read a description of data, I immediately look at the related graph to see how this data is displayed, and I have trouble continuing if the two appear to conflict.

  2. #2 CT
    December 17, 2005

    Isn’t that the point of a statistical test though? While it’s true that lack of statistical significance may not *always* translate to a lack of a ‘real’ difference, it may be misleading to state that there was no “significant difference” between the 2 groups, especially since the summary is intended for laypeople, and not necessarily individuals who have had training in statistical analysis. (i.e., the term “significant” in colloquial use has a rather different meaning than it does in formal academic use.)

    According to the article, there was a 2% difference in scores of the Individual Survey group versus the Gender Survey group. The p value for this difference was greater than 0.65, indicating that this 2% difference could be expected to occur due to chance (‘random’ extraneous factors) alone more than 65% of the time. Given such a small difference and such a large p value, I don’t think it is unreasonable to say that there was no difference between the performance of those groups.

    I appreciate the need to use precise language in academic writing. However, since these summaries are intended as “stories” to present to the public at large, I don’t think the omission of the word “significant” in this case is disastrous. Indeed, it may help the summary be more accessible to the general population by reducing the use of lingo which may be intimidating and/or confusing.

  3. #3 Scott Reynen
    December 17, 2005

    I am a layperson, so you don’t need to tell me what would be confusing to laypeople. I have only a vague notion of what “p value” means. I just know when I see a picture of two bars of different lengths, I consider that to be different, and when you say it’s not, I don’t understand what you’re saying. I don’t know what exact language you should use to qualify “not different” and make it both academically and colloquially meaningful, but with no qualification it is confusing to this layperson, which is apparently the opposite of your intent.

  4. #4 Dave Munger
    December 17, 2005

    Scott, I appreciate the feedback. It sounds like I probably should have said “significant difference,” as I generally do. It’s always a delicate balance writing these articles, since on the one hand I don’t want to introduce too many terms that laypeople don’t understand, or to make the language too daunting, but on the other hand I want to show the real data. I’ll try to do better in the future.

  5. #5 CT
    December 17, 2005

    Apologies, Scott. I clearly made the wrong assumptions about where you were coming from. As researchers, it was beaten into us early and often that if a difference wasn’t statistically significant, then there IS no difference. So, by and large, the terms are interchangeable. To me, anyway. :)

    I need to better keep in mind that my own familiarity with the lingo may cloud my perceptions of what will and won’t be clear to the general public. Thanks for clarifying your point & setting me straight!

  6. #6 Manas
    December 22, 2005

    I am “General Public” for purposes of this site. (not a research type or psychology student type).

    And as a member of the “general public” who spends a lot of time working with other people, the premise and results of this research seem kind of obvious to me…
    I live and experience it everyday.

    Positive stereotypes sometimes result in people performing better … and sometimes they choke up. (And which result occurs is really a function of so many other complex variables in the environment).

    And I am also curious why such a large number of these studies always seem to pick women / race and math performance / IQ to test the premise… Surely, life is more deliciously complex and textured than that…!!

    I am still to be convinced that true understanding lies in micro slicing problems and studying them in sterile conditions.

  7. #7 Lucian Smith
    October 26, 2006

    OK, a year late, but I was just pointed to this post.

    Since we’re harping on the graph, I’ll note that two things could help it be more realistic: the inclusion of error bars (which would clear up the ‘significant/non-significant’ problem above), and making it zero-based, which would make the observed 10% difference *look* like a 10% difference, instead of looking like a 60% difference.