We use four years of introductory astronomy scores to analyze the ability of the current population to perform college level work and measure the amount of grade inflation across various majors. Using an objective grading scale, one that is independent of grading curves, we find that 29% of intro astronomy students fail to meet minimal standards for college level work. Of the remaining students, 41% achieve satisfactory work and 30% achieve mastery of the topics.
Intro astronomy scores correlate with SAT and college GPA. Sequential mapping of the objective grade scheme onto GPA finds that college grades are inflated by 0.2 for natural sciences majors, 0.3 for social sciences, professional schools, and undeclared majors, and 0.5 for humanities majors. It is unclear from the data whether grade inflation is due to easier grading curves or depression of course material. Experiments with student motivation tools indicate that poor student performance is due to deficiency in student abilities rather than social factors (such as study time or decreased interest in academics), i.e., more stringent admission standards would resolve grade inflation.
Yeah, that won’t be controversial at all. The seriousness of this contribution is accentuated by the submission note on the arxiv, “not to be submitted to any journal.” While this is most likely due to the use of student grades as a dataset, and the difficulty of getting permission to use those data (nothing in the paper allows you to identify specific students, but as I understand it, unless you have approval in advance, you can’t usually use class assignments for research purposes), it makes it look like the author is more interested in scoring cheap rhetorical points than making a serious contribution to an intellectual debate by sending this out for peer review.
Appearances aside, what does the paper say? Well, pretty much what’s in the abstract: the author taught introductory astronomy for four years running, using exactly the same quiz and exam questions, and piled up a big data set. He then looks at correlations between the grades students received in his class and their SAT scores and overall GPA. The claim of grade inflation is based on the fact that students who performed at a C level in these classes tended to have GPAs slightly higher than a straight C (averages range from 2.2-2.6 for different majors), and so on up the grade ladder.
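The arithmetic behind that claim is simple enough to sketch. A minimal version, taking the inflation figures from the abstract (the per-major GPA averages below are illustrative values consistent with those figures, not the paper's actual table):

```python
# Grade inflation as the paper defines it: students earning a C (2.0 on
# the objective scale) carry average GPAs above 2.0, and the gap is the
# inflation. The per-major averages here are illustrative, chosen to
# match the offsets quoted in the abstract, not the paper's own data.

nominal_c = 2.0
mean_gpa_by_major = {
    "natural sciences": 2.2,  # abstract quotes 0.2 inflation
    "social sciences": 2.3,   # abstract quotes 0.3 inflation
    "humanities": 2.5,        # abstract quotes 0.5 inflation
}

for major, gpa in mean_gpa_by_major.items():
    print(f"{major}: inflation = {gpa - nominal_c:+.1f}")
```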
Does he have a valid point? Really, the entire thing hinges on the definition of the “objective grading scale” used for this study:
Knowledge testing was the domain of the multiple choice exams. Three exams are taken each term (covering only the previous 1/3 of the course material). Each exam is composed of 100 multiple choice questions. Each exam is divided into three types of questions. The first type tests knowledge of factual information (e.g., what is the color of Mars?). The second type of questions addresses the student’s ability to understand the underlying principles presented in the course (e.g., why is Mars red?). The third type of questions examines the ability of the student to process and connect various ideas in the course (e.g., why is the soil of Mars rich in heavy elements such as iron?).
Excellence in answering questions of the first type represents satisfactory grasp of the course’s objectives, i.e. a ‘C’ grade. High performance on questions of the second type demonstrates good mastery of the course material, i.e. a ‘B’ grade. Quality performance on the top tier of questions would signify superior work, i.e. an ‘A’ grade. While the design of the various questions may not, on an individual basis, exactly follow an objective standard for ‘A’, ‘B’ or ‘C’ work, taken as a whole this method represents a fairly good model for distinguishing a student’s score within what most universities consider a standard grading scheme. Certainly, this was the original intent of the ABCDF grading scheme: not to assign a grade based on class rank or percentage, but to reflect the student’s actual understanding of the core material.
The A/B/C lines are set by assuming students answer all of the questions in the appropriate category or categories correctly, then guess at the answers for the rest. So the minimum score to get a C is set at the total number of points for answering all of the factual questions, plus 1/5 of the points for the other two categories (each question has five answer choices, so blind guessing nets a fifth of the available points on average).
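Worked out numerically, the cutoffs look like this. Note that the even three-way split of the 100 questions is my guess; the quoted passage doesn't give the exact per-category breakdown:

```python
# Sketch of the paper's threshold scheme: to earn a grade, answer the
# relevant question categories correctly and guess (expected value
# 1/choices) on everything else. The 34/33/33 split across the three
# categories is an assumption, not a figure from the paper.

def grade_thresholds(factual=34, principles=33, synthesis=33, choices=5):
    """Return minimum expected scores for C, B, and A."""
    guess = 1 / choices
    c = factual + guess * (principles + synthesis)
    b = factual + principles + guess * synthesis
    a = factual + principles + synthesis
    return c, b, a

c, b, a = grade_thresholds()
print(f"C >= {c:.1f}, B >= {b:.1f}, A = {a:.0f}")
```

Under those assumptions the C line falls just below 50 out of 100, which gives a sense of how forgiving the "objective" scale is at the bottom end.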
If you think that this is a good way to assign grades, then there’s probably something to this– the correlation between class grade and GPA is excellent as far as this sort of study goes. If this sounds like a ridiculous way to assess students’ knowledge of introductory astronomy, then the rest of the argument probably falls apart. I’m kind of lukewarm on the whole thing, to be honest– the slightly combative approach he takes with this puts me off a bit, but the correlations he sees are surprisingly good.
Anyway, I’m sure there are people out there who will have strong opinions on this subject, so I’m passing it along for comment. While I’m tempted to forward it to the local faculty who bang on about grade inflation all the time, I think discretion is the better part of collegiality this early in the year. There’s no reason to start an all-faculty listserv shitstorm in August.