A bunch of people were talking about this Nature Jobs article on the GRE this morning while I was proctoring the final for my intro E&M class, which provided a nice distraction. I posted a bunch of comments about it to Twitter, but as that’s awfully ephemeral, I figured I might as well collect them here. Which, purely coincidentally, also provides a nice way to put off grading this big stack of exam papers…
Anyway, the thrust of the article is that the GRE is a bad thing to be using as an admissions criterion for graduate school in science and engineering, because it has large disparities in scores for different demographic groups (the key graph is reproduced above). The authors push for a more holistic sort of system including interviews rather than an arbitrary-ish numerical cut-off on GRE scores, which, purely coincidentally, is what they do at a program they help run that has a good track record.
Some miscellaneous thoughts on this:
— It should be noted up front that this is very much an op-ed, not a research article in the general Nature mode. There are no references cited, and only occasional mentions of quantitative research. This is kind of disappointing, because I’m a great big nerd and would like to see more data.
— I am shocked– SHOCKED!– to hear that the GRE isn’t a great predictor of career success, a claim they attribute to William Sedlacek, probably alluding to research in this paper (PDF) among other things.
— The bulk of the piece is devoted to talking about race and gender gaps in GRE scores, focusing on the math section of the test. They repeatedly mention the use of a cut-off around 700 on the math section, and point out that this is a problem given that very few women and underrepresented minorities score this high.
This seems like an especially dubious technique, given that math scores would be only a small part of what you would like to measure for graduate admissions. And while I’ve heard rumors of GRE cut-offs in graduate admissions, I’ve always heard them discussed in the context of the subject tests more than the general exams. They don’t have data on those, though, probably because the statistics would be much worse due to the smaller number of test-takers.
(It should also be noted that this score cut-off is another unsourced number, and honestly, I’d be a little surprised if it could be definitively sourced, because that seems like lawsuit fodder to me. I doubt you’ll find any graduate school anywhere that will admit to using a hard cut-off for GRE scores– which doesn’t mean that it isn’t used in a more informal way, of course, but I’d be very surprised if anybody copped to using a specific number.)
— When I initially looked at the graph above, the thing that struck me was not just the big gaps between groups, but the fact that those gaps seemed to be smaller in physical sciences than in life sciences. That seemed surprising, as the life sciences are often held up as doing much better than physical sciences in terms of gender equity in particular.
On closer examination, the difference isn’t as big as my first look seemed to suggest– pixel-counting in GIMP puts the largest racial gap at 190-ish score points for life science and 180-ish for physical science, and the gender gap at 60 and 70, respectively. Not a significant difference.
— I will note, however, that biologists suck at math– average math GRE scores for life sciences are something like 110 points lower than those for physical sciences, comparable to the racial and gender gaps that they find. Which might in some sense mean that the gaps are worse (as a fraction of the overall score) in life sciences. But then, I’m not sure that’s the right thing to look at.
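For the fellow nerds, here is a minimal sketch of what “as a fraction of the overall score” would look like. The gap sizes are my pixel-counted estimates from above; the mean scores are placeholders I made up purely so that the two fields sit 110 points apart, since I don’t have the real averages. Treat the output as an illustration of the arithmetic, not as data.

```python
# Gap sizes are the post's pixel-counted estimates (in score points).
GAPS = {
    "life": {"race": 190, "gender": 60},
    "physical": {"race": 180, "gender": 70},
}

# HYPOTHETICAL mean math scores, chosen only to be 110 points apart.
# These are NOT real data; swap in actual averages if you have them.
ASSUMED_MEAN = {"physical": 700, "life": 700 - 110}


def relative_gap(field, kind):
    """Return the gap as a fraction of the field's assumed mean score."""
    return GAPS[field][kind] / ASSUMED_MEAN[field]


for field in GAPS:
    for kind in GAPS[field]:
        print(f"{field:8s} {kind:6s} {relative_gap(field, kind):.2f}")
```

With these made-up baselines, the racial gap comes out as a larger fraction of the mean in life sciences, while the gender gap is essentially the same fraction in both fields. Different baselines would shift the numbers, which is part of why I’m not sure this is the right thing to look at.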
— When talking about the use of GRE scores as a cut-off, they write:
This problem is rampant. If the correlation between GRE scores and gender and ethnicity is not accounted for, imposing such cut-offs adversely affects women and minority applicants. For example, in the physical sciences, only 26% of women, compared with 73% of men, score above 700 on the GRE Quantitative measure. For minorities, this falls to 5.2%, compared with 82% for white and Asian people.
The misuse of GRE scores to select applicants may be a strong driver of the continuing under-representation of women and minorities in graduate school. Indeed, women earn barely 20% of US physical-sciences PhDs, and under-represented minorities — who account for 33% of the US university-age population — earn just 6%. These percentages are striking in their similarity to the percentage of students who score above 700 on the GRE quantitative measure.
That last sentence reads as slightly snotty to me, in an “I’m going to insinuate a causal relationship that I can’t back up” kind of way. So I will point out that the percentage of women earning Ph.D.’s in physics is also strikingly similar– according to statistics from the American Institute of Physics– to the percentage of women earning undergraduate degrees in physics (both around 20% in the most recent reports), and for that matter the percentage of women hired into faculty positions in physics (around 25%). Just, you know, as long as we’re pointing out numerical similarities.
— I would be all in favor of de-emphasizing the GRE in favor of other things, not least because our students tend not to score all that highly on the Physics subject test. (And, for that matter, I didn’t score all that highly on it, back in the day– I was in the 50th percentile, far and away the lowest percentile score I ever had on a standardized test.) Their call for considering a wider range of factors is something eminently sensible, and I would be all for it.
The trouble with this is buried in a passing remark in talking about the score disparities:
These correlations and their magnitude are not well known to graduate-admissions committees, which have a changing rota of faculty members.
Unlike undergraduate admissions, which is generally handled by full-time staffs of people who do nothing but make admissions decisions, graduate admissions decisions are more likely to be made by faculty who are stuck reading applications as part of their departmental service. And I think that, more than “a deep-seated and unfounded belief that these test scores are good measures of ability” (quoting from the penultimate paragraph), explains the use of the GRE. To whatever extent GRE scores get used as a threshold for admissions, it’s not because people believe they’re a good indicator of anything, but because it’s quick and easy, a simple numerical way to winnow down the applicant pool and let the faculty get back to doing the things they regard as more important.
(To be fair, my knowledge of the details of the graduate admissions process is very limited, as we don’t have a grad program. I’m going off conversations I’ve had with people at larger schools, which may be skewed by people who are blowing off steam by kvetching at conferences.)
Ultimately, I think that’s going to be the sticking point when it comes to implementing alternative measures. A more holistic approach would take more time and effort, and that’s not going to be popular with a wide range of faculty. You can probably implement that sort of thing in departments where enough people buy in to the idea, but that doesn’t tend to be either scalable or sustainable.
The other approach is to emulate undergrad admissions, and have full-time dedicated staffs for this kind of thing, shifting away from the “changing rota of faculty” to people whose job depends on doing admissions properly. But that requires a significant commitment of resources to a non-research activity, which again is kind of a hard sell.
Anyway, that’s what I occupied myself with when I was bored during this morning’s exam, and typing this out has gotten me clear through to lunchtime without having to grade anything. Woo-hoo!