grade inflation

" ...I am going to warn you one last chance I am going to ask I want a better than a B-," the e-mail read. "If I see this [grade] I swear to god I am going to fucking put you in a wheelchair when I see you..."

...wrote an undergraduate student to one of my colleagues this week.

The student was arrested and arraigned for terroristic threats.

Well, that's different.

I had been worrying that my "curve" was tipping too far, too many As and Bs, but this is not why - in fact student whingeing, if anything, would tip me towards harsher grading.

But, there do seem to be more high grades than I expect and I have been wondering how this came to be. There seem to be three underlying causes: one is that I have absolute grade boundaries interleaved with my curve - this is based on experience and some knowledge of how I set graded tasks. So the "curve" tends to be biased towards moving grades up, but not down.
Secondly, students have become very good at using "drop" and "withdraw" options.
When I were a lad, you took a class, you were in that class, for better or worse (ok, completely different system, but still...). If I fold in the D/W grades and figure they would have ended up in the C/D/F range mostly, the grade distribution flattens and broadens substantially.
The final factor is asymmetric grading error: a student will push quite hard if they see grading error that lowers their grade, particularly one that drops them a letter grade at the end of semester, while they will generally not bother mentioning a grading error that raises their grade. And a few grades moving up to A or B from a B or C can skew a curve after the fact, especially for smaller classes. Assuming grading errors are random and symmetric about zero, this would explain the residual skew.
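The asymmetric-error argument can be checked with a small simulation. This is only a sketch under stated assumptions: the score distribution, the error size, and the rule that every downward error gets contested and fixed while every upward error is left alone are all hypothetical choices, not numbers from my classes.

```python
import random


def simulate_asymmetric_correction(n_students=1000, error_sd=3.0, seed=0):
    """Return (true mean, recorded mean, mean after student complaints).

    Assumptions (hypothetical): true scores ~ N(75, 10), grading errors
    ~ N(0, error_sd), and students contest only errors that hurt them.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(75.0, 10.0) for _ in range(n_students)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_students)]

    # What the gradebook says before anyone complains: symmetric errors.
    recorded = [t + e for t, e in zip(true_scores, errors)]

    # Downward errors are contested and removed; upward errors stay.
    corrected = [t + max(e, 0.0) for t, e in zip(true_scores, errors)]

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(true_scores), mean(recorded), mean(corrected)


if __name__ == "__main__":
    true_mean, recorded_mean, corrected_mean = simulate_asymmetric_correction()
    print(f"true mean:      {true_mean:.2f}")
    print(f"recorded mean:  {recorded_mean:.2f}")
    print(f"after appeals:  {corrected_mean:.2f}")
```

Even though the errors themselves average out to roughly zero, the mean after appeals can only sit at or above the true mean, which is the one-sided skew described above.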

Bother - I think there is grade inflation, and I worry that I contribute to it, because I see my grades tip towards more high grades. But the primary cause seems to be secondary effects, probably the largest being that students who are doing badly are effective at gaming the system to avoid a very bad grade.


have you ever considered that maybe you're the cause of students gaming the system? if you graded based on whether students learned the material rather than how well they learned it relative to their classmates, they might not have to maneuver around your system. for instance, what if all your students learned all aspects of your topic -- would they all get an 'A'?

do you really want a person to get into medical school because they sabotaged the other students' lab work, or because they know the topic? which doctor do you want to treat your kids when they get sick?

Yes, I have considered that.
I mentioned explicitly that I have rigid grade boundaries that underlie the curve.

The problem is not in "learning the material", the problem is in how well the graded material tests the knowledge of the material - that is a primary reason for curving.
It affects both small classes and large classes.

I do not teach labs, and sabotaging other students' work is grounds for expulsion. The pre-med programs are particularly harsh on stuff like that.
Not so much an issue in theoretical cosmology or "stars for poets".

Two further points: students will try to game any system of rules, it is intrinsic to there being a system; secondly, for classes for jr/sr major and grad students I favour a UK style assessment - ideally I would like a student who gets an "A" to show they have progressed beyond the material.
At a university just "learning all aspects of a topic" really should only be a "B" grade - although that is not generally practical or fair for introductory or non-major classes.
University is not high school.

The qualities that make a good medical doctor doing general patient treatment have some overlap with the ability to perform well in undergraduate science classes, and some overlap with the ability to think beyond knowing all the material. Pre-med selection and entry to MD programs is another great issue of academia that is at best partially solved.

What are those drop and withdraw options?

By Who Cares (not verified) on 17 May 2008

greggT's suggestion seems impractical on a number of levels. If everyone gets A's then the class is useless in determining who gets into medical school. Before long the med schools will recalibrate your A's into B's, and your students will be at a disadvantage. Of course, it is unreasonable to expect that all the students could possibly really learn the material well enough to get an A. If the class was easy enough so that all the students could manage an A, many would choose to goof off and get a lower grade (and some of these will apparently hope to make up for this by threatening their prof).

In any case, there is really nothing that a physics or astronomy prof could do by themselves to improve the method for selecting medical students.

Dave:

Gregg's suggestion may be impractical according to the current arrangement--that's what makes it a suggestion.

But he's right in several ways that are appropriate to good educational practice (as opposed to educational tradition.)

First, I'd point out that medical schools aren't so shallow as to decide based simply on grades. There are tests, and more importantly there are trends. If students get lots of A's, and they arrive and their performance gives the lie to their A, then the result over time would predictably be to devalue the A from that school. But clearly it's possible for an undergraduate course to succeed in teaching all or most of their students to a high degree, award them A's, and have them go forth and demonstrate that the A was earned.

Indeed med schools place a premium on the capacity to function under stress, and some of that is probably warranted given the way doctors have to work. But some of it is the result of the way medical schools work, and they often work that way because of...competitive grading. To extend that justification backward to the undergraduate programs that feed medical schools is logical if we defend the status quo ante without questioning it. The description of the grading process we see above is measured, intelligent, and seems appropriate...for the situation. Other situations can work, and work well.

But as GreggT suggests, mastery of material is a more valid way to assign grades. In the bell curve the relationship between grade and mastery of the information is arbitrary. That leads to a cascade of consequences unrelated to the basic goal. Some of those are good, such as increased energy and effort (though that has a downside--in competitive situations it's easy to let competition dishearten the weaker students so they do not achieve as much). I think most of those consequences are bad, such as a decrease in innovation and risk by students in favor of conservative, cautious responses (again, this may be a good behavior for a doctor, but unless the course is in teaching appropriate doctor behavior it is probably not very good for the student's capacity to think, to experiment, to grow); and it tends to give the teacher much power to control the definition of what is right and what is wrong. The professor is smart, and good at what he does, but he is not perfect, and great ideas can arrive at any time and any place.
Competitive grading tends to flatten information into easily quantified numbers, otherwise it is hard to justify the crisp right/wrong decisions that forestall whinging. That's fine--if the discipline permits. But just to forestall unpleasant grade grubbing is not in my view a good enough reason to oversimplify or segregate information. One of the biggest complaints about doctors today is that they are hidebound, overly structured in their decisions.
Competitive grading puts great pressure on a teacher to be even-handed and fair, and teachers are mainly human, so many problems result. Bias is common in every direction. Teachers are attentive to the relative correctness of answers, rather than the pure correctness (that is, they actually compromise quality throughout a cohort in order to minimize or maximize differences in individual responses.)
Of course the upward movement problem still exists. In grading situations which are not bell-curved, the assumption usually is that the number of A's will increase, so the assumption is that the non-bell curved course is less rigorous. With experience in such situations, teachers can improve rigor by applying individual standards--that is, the best student in the course does not necessarily get an A, and you might have tests or sections in which nobody passes. That's the school I want training my doctor. In a bell curve situation the top grades in a very rigorous evaluation are as likely to be inflated upward as they are downward.
Finally, the bell insulates against crappy teaching. I'm no fan of grade- or test-based evaluation for teachers, but clear trends of too many A's or too many F's are markers for evaluating teachers' quality. What criteria you might apply to such a situation would vary sharply from one situation to the next, but if you take whatever data you have and make a bell from it, every teacher pretty much looks the same...and we teachers know that every teacher pretty much ain't the same.
If avoiding student whinging is your goal, of course, the bell is convenient--the plea includes not just raising my grade, but lowering someone else's, and so is harder for a student to justify. On the other hand, if grades are really the evaluation of a student's work, and the work is complex enough to warrant involved subjective evaluation, "whinging" might take the form of reasoned and thoughtful defense of an answer, and educative in itself--both of student and teacher. Clearly such discussion peaks when a higher grade is in reach with a point or two, but I try to differentiate the goal of helping students behave appropriately and helping students improve and master skills.

ice

I suspect that you and all commenters so far are confusing cause and effect.

To be a good teacher, you must have balance between three methodologies:

* Instruction
* Management
* Assessment

Weakness in any one of these considerably reduces the value of the class to the students and, by the way, makes the teacher's job harder and unhappier.

Testing formally (i.e. quizzes, exams) is supposed to be for Assessment, so that the teacher can allocate effort correctly, and appropriately to each individual student's learning style.

Testing for its own sake, to punish students, by ritual, to appease the department, or (in public schools) because of "No Child Left Behind", is less than useless.

An example of an incorrect paradigm is that the teacher is the boss and the students are employees being paid with a grade that depends on how hard they work.

Management does NOT mean that the teacher is a cop.

Assessment is not the end of the process. Logically, it should be the start. That is, start by composing a good Final Exam that allows assessment of whether the student, at the end of the term/semester, has learned according to the standards, according to a rubric that students have been shown, at the desired level of understanding (in, for instance, Bloom's Taxonomy).

Then work backwards. Design a midterm to assess if the student has learned half as much as needed for the Final, or at a lower level which will become deeper.

Then design instructional lesson plans that move inevitably to the midterm and then final exams. Assess additionally as needed to tell if each lesson plan has achieved its objective.

Have a Classroom Management Plan that formalizes the Management and establishes a Social Contract into which every student buys in. If that is done, they will make enforcement so very much easier, by peer pressure and intrinsic motivation.

At least this is what is taught now in Colleges of Education. There is some evidence that it is at least partly valid.

@ Dave- if you really think an individual physics professor has little influence on the selection process for medical school, you are either poorly informed or not thinking. When you have a student, brilliant but without talent for physics, who works especially hard and has a passion for medicine, you can take that student (often a B or A- sort of student) and write them a fantastic letter of recommendation. It won't change the whole system, but it'll make a difference to *that* student.

Just for the record - if you're a pre-med student whose sole interest is in maximising their GPA, then do NOT take my class, even if you will in fact ace it.
People should take my class(es) because they are interested in learning the material presented, whether it is because the class satisfies a general education requirement for a non-science degree, or because they want to actually learn astronomy and astrophysics.
Either way is fine.

The problem is that you are using a curve at all. The purpose of a curve is to provide a relative ranking of a cohort. Relative student performance is unrelated to understanding energy conservation, Newton's laws, the moments acting on the structure you are analyzing, or the viscosity of your flow. To anthropomorphize a bit, the I-35 bridge in Minneapolis didn't give a damn what anyone's ranking was, it just knew the stresses it was experiencing better than the engineers and the politicians.

And the rationale for picking whether to use a curve or not needs to be whether it improves student learning and assessment of that learning. A rationale focused on instructors instead of students is less than ideal.

I think a lot of people see the word "curve" and freak out, as if the professor was enforcing a rigid formula to drag all the students' grades up or down. I can't speak for Steinn, but often the purpose of the curve is simply to renormalize the test percentages to correct for the difficulty of a particular test. Even if you are an experienced teacher, it is hard to write a test on which a B student always gets 80% of the questions right. So if your test turns out to be harder than last semester's and the average score is 50% instead of 70%, you renormalize the grades rather than give everyone lower scores than the students who took it last semester.
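To make the renormalization concrete, here is a minimal sketch of one simple way to do it: shift every score by the same amount so the class average lands on a target. The function name `renormalize`, the additive shift, and the 70% default target are assumptions for illustration; real curves may rescale the spread as well, and nothing here is claimed to be what any particular professor does.

```python
def renormalize(scores, target_mean=70.0, max_score=100.0):
    """Shift all scores by a constant so the class mean hits target_mean.

    Scores are clamped to [0, max_score], so the final mean can differ
    slightly from the target when some scores hit either limit.
    """
    mean = sum(scores) / len(scores)
    shift = target_mean - mean  # positive when the test was harder than intended
    return [min(max_score, max(0.0, s + shift)) for s in scores]


if __name__ == "__main__":
    # A hard test: the average came out at 50% instead of the intended 70%.
    raw = [40.0, 50.0, 60.0]
    print(renormalize(raw))
```

The relative ordering of students is untouched; only the offset caused by the difficulty of that particular test is removed.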

People who say you should never use a curve are underestimating the difficulty of writing assessments (tests, homework, etc) that are both fair and informative in terms of returning information on how well the students have learned.

When I TA'ed sections of a large intro physics class, we would also do a renormalization to correct for the fact that some TAs graded harder than others. But only up to a limit - we would not force all the sections to have the same average grade, because with 20-30 students per section, some of the sections had more good students and deserved a higher average grade.

No.
The purpose of a curve is to normalise grading between cohorts.
If I could guarantee that the assessment material and grading was uniform then curving would be unnecessary, but it is not - I know this because we statistically track the "difficulty" of exams and the variance is significant.

For a "large enough" class, it is almost safe to work completely to a curve, in the limit in which the student intake is approximately constant.
For a small class, it is generally good to find a means of comparing different cohorts (this is why beginning faculty usually grade too harshly, they don't have a normalisation point yet), there the student variance is an issue.
Only for a medium sized class where I knew the intake and pre-req history would I be comfortable putting down rigid grade boundaries from the beginning as I could self-calibrate the grading. Those are also fun to teach.

And can I repeat - "I have absolute grade boundaries interleaved with my curve" - I tend to set "impossible" assignments, in that I ask for material that goes beyond the lectures and syllabus, and I tend to grade harshly.
The "curve" for my class mostly serves to raise grades.
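One possible reading of "absolute boundaries interleaved with a curve" is sketched below. It is a guess at the general shape, not my actual gradebook: the cutoffs, the target mean, and the rule that the curve is applied only when it helps are all hypothetical.

```python
# Hypothetical absolute boundaries: (letter, minimum score).
ABSOLUTE_CUTOFFS = [("A", 90.0), ("B", 80.0), ("C", 70.0), ("D", 60.0)]


def letter(score):
    """Map a numeric score to a letter using the fixed cutoffs."""
    for name, minimum in ABSOLUTE_CUTOFFS:
        if score >= minimum:
            return name
    return "F"


def grade_with_floor(scores, target_mean=75.0):
    """Curve upward if the class mean is below target, never downward."""
    mean = sum(scores) / len(scores)
    shift = max(0.0, target_mean - mean)  # the curve only ever raises scores
    return [letter(s + shift) for s in scores]


if __name__ == "__main__":
    print(grade_with_floor([55.0, 65.0, 75.0]))  # hard assignment, curved up
    print(grade_with_floor([85.0, 95.0, 90.0]))  # strong class, left alone
```

With a rule like this, a class that does well against the absolute boundaries keeps its grades, while a class hit by an "impossible" assignment gets lifted, so the net effect over many semesters is a drift towards higher grades.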

Of course what I should really conclude is that a lot of my students get As because I am a totally awesome teacher and the students absorbed the material like sponges, exuding it back on command when I squeezed.

If grades are performance indicators they discriminate against minorities. The grading curve bottom half is properly reserved for the highest scorers who exercise historic paternalistic White Protestant European oppressive values. Those who hoard achievement against diversity are subject to grade redistribution just as earned wages are reallocated from the productive to the deserving.

Grade credit offsets allow those evincing more success than required (e.g., a gentleman's C) to sell their excess compliance to those deficient in same - for a fair, market-determined price overseen for nominal user fees (20% of traded value).