Adventures in Ethics and Science

Girls, boys, and math.

You’ve probably already heard last week’s news that a study published in Science indicates that the gender gap between girls and boys in mathematical performance may be melting faster than the polar ice caps. The study, “Gender Similarities Characterize Math Performance” by Janet S. Hyde et al., appears in the July 25, 2008 issue of Science (behind a paywall). [1]

Hyde et al. revisit results of a meta-analysis published in 1990 (J. S. Hyde, E. Fennema, S. Lamon, Psychol. Bull. 107, 139 (1990)) that found negligible gender differences in math ability in the general population but significant differences (favoring boys) in complex problem solving that appeared during the high school years. This 1990 meta-analysis drew on data collected in the 1970s and 1980s. The present study asked whether more recent data would support the same findings.

In previous decades, girls took fewer advanced math and science courses in high school than boys did, and girls’ deficit in course taking was one of the major explanations for superior male performance on standardized tests in high school. By 2000, high school girls were taking calculus at the same rate as boys, although they still lagged behind boys in the number of them taking physics. Today, women earn 48% of the undergraduate degrees in mathematics, although gender gaps in physics and engineering remain large.

The researchers used as their data results from the state assessment tests mandated by No Child Left Behind given to students in grades 2 through 11. Specifically, they drew on results from California, Connecticut, Indiana, Kentucky, Minnesota, Missouri, New Jersey, New Mexico, West Virginia, and Wyoming (since these states were able to break down the results by grade level, gender, and ethnicity), a pool of more than 7 million students.

In these data, Hyde et al. found no evidence of a difference in average performance on the mathematical assessments between boys and girls.

Effect sizes for gender differences, representing the testing of over 7 million students in state assessments, are uniformly <0.10, representing trivial differences. Of these effect sizes, 21 were positive, indicating better performance by males; 36 were negative, indicating better performance by females; and 9 were exactly 0. From this distribution of effect sizes, we calculate that the weighted mean is 0.0065, consistent with no gender difference.
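For readers who want to see the arithmetic behind a number like that, here is a minimal sketch in Python. It assumes the effect size is the usual standardized mean difference (male mean minus female mean, divided by a pooled standard deviation, so positive values favor males) and that each effect size is weighted by its sample size. The excerpt above doesn't spell out the weighting scheme, so treat that part as an assumption. The function names and the sample numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def cohens_d(mean_m, mean_f, sd_m, sd_f, n_m, n_f):
    """Standardized mean difference; positive means males scored higher."""
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / (n_m + n_f - 2)
    return (mean_m - mean_f) / sqrt(pooled_var)


def weighted_mean_effect(effects):
    """Average a list of (d, n) pairs, weighting each d by its sample size n.

    Weighting by n is an assumption made for this sketch.
    """
    total_n = sum(n for _, n in effects)
    return sum(d * n for d, n in effects) / total_n


# Hypothetical (effect size, sample size) pairs for three state-by-grade cells.
example_cells = [(0.06, 120_000), (-0.02, 95_000), (0.00, 40_000)]
print(weighted_mean_effect(example_cells))
```

The point of the weighting is that a cell with many students moves the overall mean more than a small one, which matters when the individual effect sizes straddle zero the way the reported ones do.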

Even if there is no gender difference on average, it has long been claimed that males show more variance in mathematical abilities — in other words, that more boys will be found far below and far above the average, while girls will be more tightly clustered around the average. This claim of greater variance is sometimes invoked to explain gender disparities in science and engineering careers, where presumably you need people at the very top of the range of intellectual abilities, and there just happen to be more males that fit the bill.

So Hyde et al. examined the variance in their data:

The variance ratio (VR), the ratio of the male variance to the female variance, assesses these differences. Greater male variance is indicated by VR > 1.0. All VRs, by state and grade, are >1.0 [range 1.11 to 1.21]. Thus, our analyses show greater male variability, although the discrepancy in variances is not large.
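As a rough illustration of what a variance ratio measures, here is a small Python sketch. It uses the sample variance from the standard library, and the score lists are made up for the example; they are not the state assessment data.

```python
from statistics import variance


def variance_ratio(male_scores, female_scores):
    """Ratio of male variance to female variance; above 1.0 means males vary more."""
    return variance(male_scores) / variance(female_scores)


# Hypothetical scores: same mean (50), but the male scores are more spread out.
male_scores = [38, 44, 50, 56, 62]
female_scores = [40, 45, 50, 55, 60]
print(variance_ratio(male_scores, female_scores))
```

Note that the two made-up groups have identical averages. The variance ratio says nothing about who does better on average, only about how spread out each group's scores are.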

At the top of the range on the test results (in data from Minnesota 11th graders), what kind of gender differences do the data show?

For whites, the ratios of boys:girls scoring above the 95th percentile and 99th percentile are 1.45 and 2.06, respectively, and are similar to predictions from theoretical models. For Asian Americans, ratios are 1.09 and 0.91, respectively.
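The excerpt doesn't say which theoretical models are meant, but one simple version can be sketched in Python. The assumptions here are mine, not the paper's: scores for each gender are normally distributed, the two groups are equal in size, female scores have mean 0 and standard deviation 1, and male scores have mean d (the effect size) and variance equal to the variance ratio. The function name `tail_ratio` is hypothetical.

```python
from statistics import NormalDist


def tail_ratio(d, vr, percentile):
    """Predicted boys:girls ratio above a percentile of the combined population.

    d is the male-minus-female effect size, vr the male:female variance ratio.
    Assumes normal scores and equal group sizes (assumptions of this sketch).
    """
    female = NormalDist(mu=0.0, sigma=1.0)
    male = NormalDist(mu=d, sigma=vr ** 0.5)

    def combined_cdf(x):
        # Equal-weight mixture of the two groups.
        return 0.5 * male.cdf(x) + 0.5 * female.cdf(x)

    # Bisection search for the score at the requested combined percentile.
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if combined_cdf(mid) < percentile:
            lo = mid
        else:
            hi = mid
    cutoff = (lo + hi) / 2

    # Compare the share of each group that lands above the cutoff.
    return (1 - male.cdf(cutoff)) / (1 - female.cdf(cutoff))


# No mean difference, modestly greater male variance.
print(tail_ratio(0.0, 1.15, 0.95), tail_ratio(0.0, 1.15, 0.99))
```

Under these assumptions, even a modest variance ratio with no mean difference produces a ratio above 1 in the upper tail, and the ratio grows as the cutoff moves further out.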

If these differences in the top echelons of math ability are being used to explain the gender imbalances in science, engineering, and mathematics as courses of university study or career paths, do these numbers support the explanation?

If a particular specialty required mathematical skills at the 99th percentile, and the gender ratio is 2.0, we would expect 67% men in the occupation and 33% women. Yet today, for example, Ph.D. programs in engineering average only about 15% women.
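The conversion from a boys:girls ratio to expected percentages is simple arithmetic: with a ratio r, the male share is r / (1 + r). A short Python sketch (the function name is made up for illustration):

```python
def shares_from_ratio(ratio):
    """Turn a male:female ratio into (male share, female share) fractions."""
    male_share = ratio / (1 + ratio)
    return male_share, 1 - male_share


# A 2.0 ratio corresponds to two men for every woman, i.e. about 67% and 33%.
male, female = shares_from_ratio(2.0)
print(round(male * 100), round(female * 100))
```

Feeding in a ratio below 1.0, such as the 0.91 reported for Asian Americans above the 99th percentile, gives a male share of slightly under half.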

Not to mention, we would expect more Asian American women than Asian American men in these engineering programs if these educational choices simply followed ability. So at the very least, greater variance among males is not sufficient to explain the current gender breakdown in math, science, and engineering. There must be other factors at work.

Earlier studies had reported gender differences in complex problem solving, favoring boys. (They also reported gender differences in computation and grasp of concepts, favoring girls, although these differences never seemed to play much of a role in the coverage of these studies in the popular press.) It seems at least possible that some of this was connected to coursework — if you’re in a course that introduces you to complex problems and strategies for solving them, you might then be more likely to perform better in complex problem solving tasks.

Hyde et al. tried to see what their more recent data indicated about complex problem solving and gender:

[W]e coded test items from all states where tests were available, using a four-level depth of knowledge framework. Level 1 (recall) includes recall of facts and performing simple algorithms. Level 2 (skill/concept) items require students to make decisions about how to approach a problem and typically ask students to estimate or compare information. Level 3 (strategic thinking) includes complex cognitive demands that require students to reason, plan, and use evidence. Level 4 (extended thinking) items require complex reasoning over an extended period of time and require students to connect ideas within or across content areas as they develop one among alternate approaches. We computed the percentage of items at levels 3 or 4 for each state for each grade, as an index of the extent to which the test tapped complex problem-solving. The results were disappointing. For most states and most grade levels, none of the items were at levels 3 or 4. Therefore, it was impossible to determine whether there was a gender difference in performance at levels 3 and 4.

Here’s where we get into the debate about whether NCLB’s linkage of test results and school funding has been a positive or pernicious influence on test design, what our kids are learning, etc. Let me just reiterate what teachers should already know: the assessment tells the students what they really have to learn. If you think a particular competency or bit of knowledge is important but you don’t assess it, your students are smart enough not to waste their effort absorbing it.

I suppose this means that even if you’re not trying to teach to the test, to the extent that the students have reasonable information about what the test items will be like and what they will cover, the students are going to “learn to the test”.

Also, to the extent that complex problem solving is a competence that it would be good for students to develop prior to starting a college level program in mathematics, science, or engineering, if the NCLB tests are the assessments driving secondary math education, we may be in trouble. I’m hopeful that there are still old school teachers giving weekly quizzes focused on complex problem solving, and still significant populations of kids preparing for Advanced Placement exams or math team competitions. But it sure would be nice to raise the bar to a level including complex problem solving (and figure out how to get more kids over that bar) rather than acquiescing to high stakes tests that apparently view complex problem solving as a frill.

So, in summary, when girls are taking the same math coursework as boys, the gender differences in mathematical performance seem to shrink to insignificance. And, NCLB assessments may be dumbing down math for everyone.

What impact that will have on careers in math, science, and engineering remains to be seen.

-----
[1] Janet S. Hyde, Sara M. Lindberg, Marcia C. Linn, Amy B. Ellis, Caroline C. Williams, “Gender Similarities Characterize Math Performance.” Science, 25 July 2008, Vol. 321, no. 5888, pp. 494-495.

DOI: 10.1126/science.1160364

Comments

  1. #1 Super Sally
    July 30, 2008

    Oooooooooooooooooooooh, I’d love to do C. Pine’s item analysis on these tests with over half a million students to determine what the upper level math test items are testing in the various cohorts.

    When I was last doing that item analysis on NJ Basic Skills tests given to students entering all levels of state funded colleges (and some private ones) I was working with 10k students a year (over 8 years). I don’t know that any of our M/F analysis was ever published, but even back then when we controlled for HS math taken, girls were performing even with or better than boys. Actually, in the earlier years (late ’70s) girls who took the same higher level courses performed better than boys, and by the mid-’80s that was down to even with boys, as more girls were taking upper level HS math. Note that the NJBS tests of that era only covered math through the 8th grade curriculum and the full range of 1st year algebra. Like the tests under discussion, we did not have extended complex tasks. It was interesting that we did see significant correlation between high performance on the more difficult equation solving items and high scores on the essay portion of the tests.

    Back in the mid-’80s when I left that work to ETS, the analysis was running on main-frames. It could probably be done nicely on a PC now, even for student numbers like these.

    Maybe we have a retirement project here…Some of us are just data junkies, particularly with data on a topic about which we are passionate.

  2. #2 Academic
    July 30, 2008

    Nice synopsis. I blogged about this study yesterday as well. Naysayers are already coming out of the woodwork.

  3. #3 Tony Jeremiah
    July 30, 2008

    1. negligible gender differences in math ability in the general population but significant differences (favoring boys) in complex problem solving that appeared during the high school years.

    2. We computed the percentage of items at levels 3 or 4 for each state for each grade, as an index of the extent to which the test tapped complex problem-solving. The results were disappointing. For most states and most grade levels, none of the items were at levels 3 or 4.

    3. when girls are taking the same math coursework as boys, the gender differences in mathematical performance seem to shrink to insignificance. And, NCLB assessments may be dumbing down math for everyone.

    Taken together, these statements seem to suggest that the observed attenuation of gender differences at the secondary level may be due to tests at the secondary level having a low item-discrimination index (d).

    d is defined as the difference between the proportion of high scorers answering particular items on a test correctly and the proportion of low scorers answering the same questions correctly. For example, assuming math questions have some type of gender bias, and a typical math question on a grade 12 exam asks something like 1+1=?, that question will have a low d since, presumably, any person making it to grade 12 math should have no difficulty answering it. However, if a typical question is something like 1+Number of times the Yankees have won the World Series=?, the d should be more substantial (assuming males have a greater fascination with sports trivia than females).

    The above example is probably a ridiculous exaggeration, but is meant as a concrete analogy to statement #2, which suggests that at the secondary level, tests across various states appear to be assessing lower level (e.g., recall) and not upper level (e.g., problem-solving) cognitive skills. If a gender-difference exists in problem-solving (due to genetic or likely environmental factors; see Dave Munger’s Why aren’t there more women in math and science for a more detailed commentary), and, that is really the primary gender difference, it won’t be detected by secondary school math tests since they appear to have a low d as it concerns math-related, problem solving skills.

    If this goes undetected at the secondary level, it may show up at the post-secondary level (which appears to be the focus of Dave’s posting on this issue) as a gender difference, especially if tests at the post-secondary level are primarily focused on assessing problem-solving ability.

    This ultimately holds some great significance for your last comment: What impact that will have on careers in math, science, and engineering remains to be seen.

  4. #4 Matt Brodhead
    July 31, 2008

    I would imagine that behavioral history is the main factor, not gender as the significant variable. However, the Onion ran a fantastic article about the subject: http://www.theonion.com/content/amvo/girls_boys_in_math

  5. #5 Tony Jeremiah
    July 31, 2008

    I would imagine that behavioral history is the main factor, not gender as the significant variable. However, the Onion ran a fantastic article about the subject:

    Gender and behavioral history are probably confounded/intertwined though, especially when one considers gender schema theory in conjunction with reciprocal determinism.

  6. #6 Dylab
    July 31, 2008

    Super Sally: I had a thought on controlling for classes taken. Is it possible that the change in the statistics is a result of self-selection rather than an effect of the classes? If a smaller group of girls decides to take more math courses, it doesn’t seem particularly implausible to me that they would come from a higher end of the intelligence spectrum of the same female cohort, relative to the males who decide to take extra coursework.

    I’m curious what male/female differences would be where math courses are less optional.

  7. #7 Alice
    July 31, 2008

    Nice post, Janet. Thanks for blogging about it.