There was a brief flurry of discussion yesterday kicked off by Matt Yglesias posting People Don’t Major in Science—Because It’s Hard, which more or less says what the title would lead you to believe (either title, since he’s blogging for Slate where they like to give pages titles that don’t match the post titles…). This was inspired by a National Bureau of Economic Research paper, the full text of which seems to be paywalled, sort of– they emailed it to me at my work address for free. And since I could get it, I figured I should dig into it a bit to see what it really said.
I’m not going to do this in the humorous Q&A format like I do for physics papers, for a bunch of reasons. Primarily that, on a quasi-aesthetic level, I found this pretty awful. This is mostly a matter of differing norms between fields– I’m sure that it’s perfectly within the normal style range for the intended audience, but coming from physics, a whole bunch of things about this made me twitch. I particularly hate the use of multi-letter variable names in ALL CAPS with sub- and superscripts: E(AGPA^{t*}_{i,SCI}) in the screenshot above, and that kind of thing. This somehow manages to combine the complicated index structure of high-level physics with the clumsy pseudo-word labeling of college frosh (I kept having flashbacks to lab reports that give Newton’s second law as “FORCE = Mass X A” and that sort of thing). Further heightening the first-semester-lab-report feel was the stubborn insistence on using quasi-mathematical notation in places where words would be clearer: “those who have STATE_i = SCI and j* = SCI, those who have STATE_i = SCI but j* ≠ SCI, and those who have STATE_i ≠ SCI” would be better written as “those who enter intending to major in science and graduate with degrees in science, those who enter intending to major in science but graduate with non-science majors, and those who enter intending to major in a non-science subject and graduate with a non-science degree.” But then, I say that as an outsider to the field– presumably, people working in this field are conditioned to expect the features I find awful, and would find my preferred versions painful to read.
Because of the dense notation, I am less confident that I’ve correctly understood this than with most of the physics papers I write up, and there’s only so much time I can put into deciphering this stuff. So this will be a little shorter and less conversational than my usual research write-ups. But we’ll do at least a few question-style headers for this, because I don’t want to completely subvert expectations…
What did they do?
This is an analysis of survey data from several years ago, taken from two classes of students entering Berea College, 655 of them in all. On entering college, they were asked to predict the field of their eventual major (majors being grouped into seven categories), the likelihood that they would graduate with a major in each of the seven categories, the likelihood that they would drop out before graduating, the GPA they would earn in courses in each of the seven categories, and their annual income at age 28. These questions were repeated every semester after that, and the results were correlated with the actual GPA and major of the students in question.
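Just to make the shape of the data concrete, here’s a rough sketch of what one student’s record might look like. This is my own reconstruction in Python, with made-up field names and placeholder category labels, not anything taken from the paper:

```python
from dataclasses import dataclass
from typing import Dict, List

# Seven major categories; only "science" matters for this discussion, so the
# other labels here are placeholders, not the paper's actual groupings.
CATEGORIES = ["science", "other_1", "other_2", "other_3",
              "other_4", "other_5", "other_6"]

@dataclass
class SemesterSurvey:
    """One student's survey responses in one semester (field names are mine)."""
    intended_major: str                 # category they currently expect to major in
    prob_graduate_in: Dict[str, float]  # stated P(graduating with a major in each category)
    prob_dropout: float                 # stated P(leaving before graduating)
    expected_gpa: Dict[str, float]      # predicted GPA in courses in each category
    expected_income_at_28: float        # predicted annual income at age 28

@dataclass
class StudentRecord:
    """Survey responses over time, plus what actually happened."""
    surveys: List[SemesterSurvey]       # one entry per semester, starting at entry
    actual_gpa: Dict[str, float]        # realized GPA by category
    outcome: str                        # final major category, or "dropout"
```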
The point of this analysis (they’ve published other stuff from the same data set) was to track the way students move between majors. To that end, they construct a mathematical model that takes the various survey responses as inputs, and uses them to generate aggregate probabilities for the various major categories, which they can compare to the data. Using that model, they can then try to tease out the effect of various factors.
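I can’t reproduce their actual model here (and wouldn’t trust myself to), but the basic bookkeeping step is simple enough to sketch: average the stated probabilities from the entering survey to get a predicted share for a category, and compare that to the share of students who actually ended up there. In terms of the toy record structure sketched above:

```python
def predicted_vs_actual(records, category="science"):
    """Mean stated probability (at entry) of graduating in `category`,
    versus the fraction of students who actually did.  This is just the
    comparison being described, not the authors' structural model."""
    entry = [r.surveys[0] for r in records]
    predicted = sum(s.prob_graduate_in[category] for s in entry) / len(entry)
    actual = sum(1 for r in records if r.outcome == category) / len(records)
    return predicted, actual
```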
What are the basic results of the survey?
They find that, in general, students entering college are roughly evenly distributed among the different major categories, in terms of what they say they intend to major in and what probability they assign to graduating in each category. There are two areas where their predictions are way off: only 5.1% of the entering students expect that they’ll drop out before graduating (an average predicted probability of 13.4%), while 37.5% of them do, in fact, drop out; and while 19.8% of entering students expect to major in a science field, only 7.4% of them graduate with science majors.
These two are not all that directly related, in that the dropout rate for students who enter expecting to major in science is not any higher than the dropout rate for students intending to major in other areas. The disappearing science majors are disappearing into other fields: while students entering with a stated intent to major in science say there’s a 60% chance they’ll graduate with a science major, the actual number staying in science is only around 45%. This isn’t made up by students switching into science, either– students with non-science intended majors give about a 6% chance of switching to science, while the actual number is around 2%.
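Again in terms of the toy records sketched earlier (and purely to show where those numbers come from, not as the paper’s actual code), the persistence and switching comparison looks like:

```python
def science_persistence_and_switching(records):
    """Stated vs. realized science outcomes, split by initial intent.
    From the paper: intenders give themselves ~60% odds of finishing in
    science but only ~45% actually do; non-intenders give ~6% odds of
    switching in, but only ~2% actually do."""
    def summary(group):
        stated = sum(r.surveys[0].prob_graduate_in["science"] for r in group) / len(group)
        realized = sum(1 for r in group if r.outcome == "science") / len(group)
        return stated, realized

    intenders = [r for r in records if r.surveys[0].intended_major == "science"]
    others = [r for r in records if r.surveys[0].intended_major != "science"]
    return {"intended science": summary(intenders),
            "intended other": summary(others)}
```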
This switching also happens pretty quickly. By the middle of the sophomore year, 90% of the students who will eventually graduate as science majors are saying that they’re going to major in science, and only 3% of those who will graduate with majors in something else say they’re going to major in science. Because of that, they take the midpoint of the sophomore year as the decision time when they draw up their model of student decision-making.
When they dig a little deeper, what do they find?
They look at a couple of factors to determine the effect each has on students’ decisions to change majors or not: the average GPA and the average expected income. Most of the effort goes into talking about grades, and the income picture is kind of muddy (largely because they can’t match the predictions to actual income the way they can match grade predictions to actual GPA), so I’m going to mostly ignore the income effects.
What they see is that the estimated GPA for science courses drops dramatically from the start of the first semester to the middle of the second year: the grade students expect in science courses decreases by 0.18 grade points for the entire sample. For those who enter with an intent to major in science, this is even more pronounced: the average expected grade drops by 0.30 grade points over that same span.
This leads to what struck me as the clearest demonstration of anything in the paper, which is explained in the cryptic text screen-capped above: over the first year and a half, the students who leave science become much more like the students who never intended to major in science, in terms of GPA expectations. Students who enter saying they expect to major in science and eventually graduate as science majors come in expecting a science GPA of 3.63, and that prediction drops by 0.27, to 3.36. Students who enter saying they expect to major in something other than science and graduate in something other than science expect a science GPA of 2.77, and that drops by 0.066, to 2.71. Students who enter expecting to major in science and graduate as majors in something other than science start out expecting a science GPA around 3.53, and end up dropping by 0.61, to 2.92. Their expected grades go from pretty much the same as those of the eventual science majors to pretty much the same as those of the students who were never going to major in science, in just three short semesters.
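Laid out side by side (these are the same numbers quoted above, just rearranged so the convergence is easier to see; the 2.71 follows the paper’s rounding):

```python
# Expected science GPA at entry vs. at the mid-sophomore decision point,
# for the three groups discussed above.  Values as reported in the paper.
expected_science_gpa = {
    "enter science, stay":  {"entry": 3.63, "change": -0.27,  "later": 3.36},
    "enter science, leave": {"entry": 3.53, "change": -0.61,  "later": 2.92},
    "never intend science": {"entry": 2.77, "change": -0.066, "later": 2.71},
}

for group, g in expected_science_gpa.items():
    print(f"{group:>22s}: {g['entry']:.2f} -> {g['later']:.2f} ({g['change']:+.2f})")
```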
So, this is what they use to say that students drop science because it’s hard?
It’s not the only thing, but it’s probably the cleanest single effect. They spend a lot of time putting together a model and finding coefficients and doing t-tests. In the end, they conclude that the GPA estimate has the biggest effect on the shift.
The other big point in favor of that explanation is that once they have the model, they can input different initial grade expectations and see what effect that would have. If they replace the distribution of expected GPA that they see in the survey data from the start of the first semester with that from the midpoint of the second year, the fraction of simulated students saying they’ll major in science drops to 10.4%, much closer to the actual figure of 7.4%. This is also suggestive of grades being the main factor.
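The counterfactual exercise amounts to re-running the initial major choice with the later grade expectations plugged in. Their actual model is much richer than this, but a very stripped-down version of the logic, using the toy records from earlier and an invented choose-the-highest-expected-GPA decision rule, would look something like:

```python
def toy_science_share(records, use_later_beliefs=False):
    """Fraction of students who would pick science under a crude decision
    rule (choose the category with the highest expected GPA).  Setting
    use_later_beliefs=True swaps in the mid-sophomore expectations, which
    is the spirit of the paper's counterfactual; the decision rule itself
    is my invention, not theirs."""
    n_science = 0
    for r in records:
        beliefs = r.surveys[3] if use_later_beliefs else r.surveys[0]  # index 3 ~ 4th semester
        choice = max(beliefs.expected_gpa, key=beliefs.expected_gpa.get)
        if choice == "science":
            n_science += 1
    return n_science / len(records)
```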
So, what causes the bad grades?
They attribute it to innate ability, more or less, casting it as a process of students learning that they’re not actually good at science after taking a couple of science classes and getting bad grades in them. They aren’t really able to do more than that with the data they have– they can’t tell whether it’s a matter of students losing interest in science because of lackluster teaching or bad experiences with faculty and other students, or any of the other factors people like to talk about. All they know is what the students predict for their grades, and what they actually get for grades.
This lack of information, along with the relatively small sample size, is probably the biggest weakness in terms of finding out what you’d really like to find out regarding major switches and the like. But that kind of fine-grained information would be really, really hard to collect, so it’s not too surprising.
Any findings that they spend a bunch of time on that don’t seem that interesting to you?
They go on at some length about the significant underestimate of the likelihood of dropping out, which I found kind of mystifying. Not the underestimate, but the amount of time they spend discussing it. It seems crashingly obvious to me that students entering college aren’t going to assign a high probability to dropping out– if they expected to drop out, why would they be there in the first place?
They also spend a bunch of time looking at the correlations between GPA expectations in different major areas, finding that students’ expectations regarding their science GPA are not particularly well correlated with their expectations about their GPA in other classes. I’m guessing this is important for consistency checking or something, because it doesn’t seem all that surprising or interesting otherwise.
Any features that jump out at you that they didn’t talk up?
Two things: 1) They have a table showing the probability of graduating with a particular major, broken down by the students’ initial predictions of their major, where, unsurprisingly, they find that on the whole the most likely eventual major is the one that students predicted initially. This is particularly strong for the humanities, though– students who enter saying that they plan to major in humanities have a 56% chance of doing just that, and a 34% chance of dropping out. The chance of ending up in any of the other categories is minimal. This is far and away the strongest of all the categories they measure.
2) Scientists get cocky. When they calculate the grade expectations for the three groups of students mentioned above, they do so for all of the major areas, not just science. Students who intend to major in science and graduate with science degrees decrease their expected GPA in science classes, but increase their expectations in everything else. Meanwhile, students who switch out of science decrease their expected GPA across the board, and students who never intended to major in science in the first place don’t show much of a pattern in the changes of their expectations.
I thought that was kind of interesting. I’m not sure it really supports the make-Steve-Hsu-happy conclusion that scientists are just smarter across the board than everybody else (which I’m sure it will be used to claim by somebody), but it was something I noticed.
So, bottom line: Do students drop out of science because it’s hard?
That’s maybe a little stronger than is really justified by the data they present. I think they have a decent case that students drop out of science majors because they find that they’re getting bad grades in science. I don’t think that you can really use this to make a solid claim that this reflects a lack of ability or preparation on the part of those students, though. Not without looking more closely at the individual students and why they switched, anyway. I suspect it’s the most likely explanation, but you can easily construct horror stories that would explain the motion between majors without invoking innate ability or prior preparation.