There’s an interesting contrast between the laws of nature and the laws which constitute our legal system. The laws of nature are compact and precise; written in standard notation without accompanying explanation, the fundamental laws fit on a few pages. The laws of the legal system span thousands of volumes and are frequently ambiguous and ever-changing. On the other hand, we know what the laws of the land actually are. The laws of nature are not completely explored; there are large regions of the parameter space where we just don’t know the laws at all. Still, in that sense physicists and lawyers could very, very roughly be said to be in the same sort of business.
My sister happens to be in the latter class, finishing up her first semester in law school. When the semester ended she had a math question for me: grades were posted (hers were very good) but class percentile rankings were not. Is it possible to estimate the latter given the former?
Normally the answer is no, since only knowing your grades is not helpful without knowing how other students did on average. But law school – or at least this one – has a policy, presumably meant to curb grade inflation, wherein each class must have its grades assigned in such a way as to result in an overall mean GPA of 2.8 for the students. Therefore your grade can at least establish where you are with respect to that average. I presume it’s the mean anyway; I suppose they might require it to be the median.
But unfortunately that’s not enough to allow a good estimate of class rank either. We would need to know how the students were distributed about the mean. We don’t, so I had to admit it was impossible to say anything meaningful. Even if we assumed the grades were normally distributed, we don’t know the standard deviation. But since this is Sunday Function, let’s pretend we did know what the grade distribution was. How can we compute percentile rank?
Assume the grades are normally distributed with a mean of 2.8 and a standard deviation of 0.5. This isn’t quite possible since grades can’t go above 4.0, but all things considered this isn’t a terribly implausible way for grades to be distributed. Given these parameters, the function describing this distribution is just the Gaussian function scaled and shifted:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
Here μ (mu) is the mean and σ (sigma) is the standard deviation. Plot it:
The area under the curve in a particular range of grades represents the fraction of the students with grades in that range. The complete area under the curve is exactly equal to 1, representing all of the students. The area between 0 (well, technically negative infinity) and 2.8 is 0.5, meaning half of the students are below 2.8 in their grades.
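As a sanity check on those two areas, here is a quick Python sketch that adds up the area numerically with a simple midpoint rule. The function names, the step count, and the cutoff of ten standard deviations standing in for infinity are just choices made for this illustration, and the 0.5 standard deviation is the assumed value from above, not a known fact about the class.

```python
import math

MU = 2.8      # mean GPA required by the school's policy
SIGMA = 0.5   # standard deviation assumed for the sake of argument

def normal_pdf(x, mu=MU, sigma=SIGMA):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area(a, b, steps=20_000):
    """Approximate the area under normal_pdf between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width) for i in range(steps)) * width

LOW = MU - 10 * SIGMA    # stands in for negative infinity
HIGH = MU + 10 * SIGMA   # stands in for positive infinity

total_area = area(LOW, HIGH)   # should come out very close to 1
below_mean = area(LOW, MU)     # should come out very close to 0.5

print(f"total area      = {total_area:.4f}")
print(f"area below mean = {below_mean:.4f}")
```

Ten standard deviations is overkill, but it makes the point that the tails contribute essentially nothing, which is also why the difference between 0 and negative infinity doesn't matter here.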
So we need a function to make this procedure systematic. Given a particular GPA, calculate the area under the curve between zero and that given GPA, which we’ll call x (again, technically between negative infinity and x but that’s pretty irrelevant here since the normal distribution is negligibly small below x = 0).
This is just an integral. Integrate the function f(t) over all t between minus infinity and x, and call that a new function g(x) – our Sunday Function. The jazz with f(t) instead of f(x) is because you’re not technically supposed to have the same variable names in the integral and its limits. They’re the same function though; we’re just giving the argument a different name.
$$g(x) = \int_{-\infty}^{x} f(t)\,dt$$
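This Sunday Function is so important it has its own name. It’s called the error function (strictly speaking, this version is the cumulative distribution function of the normal distribution, which is the error function after a shift and rescaling). Plug in a GPA, and it tells you how much of the area under the normal distribution is lower than that value. Let’s plot it, with parameters appropriately adjusted to match our particular grade distribution: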
So if you had a 3.5 GPA, you’d plug in and see that the error function yields a value of 0.919, leaving you higher than just about 92% of the class.
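If you'd rather let a computer do the plugging in, Python's standard library (version 3.8 and later) has `statistics.NormalDist`, whose `cdf` method gives the fraction of the distribution below a given value. A minimal sketch, using the mean of 2.8 and the assumed standard deviation of 0.5 (the variable names are only for illustration):

```python
from statistics import NormalDist

# Grade distribution under the post's assumptions: mean 2.8, standard deviation 0.5
grades = NormalDist(mu=2.8, sigma=0.5)

# Fraction of the class expected to fall below a 3.5 GPA
percentile = grades.cdf(3.5)
print(f"{percentile:.3f}")  # 0.919
```

Change the 0.5 and the answer moves quite a bit, which is the reason the real question couldn't be answered without knowing the spread.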
You do have to do a little work to compute this; your average calculator doesn’t have a button that’ll just do this for you. Generally the procedure is to take the value of the GPA (or whatever else you happen to be calculating), change variables such that it’s scaled to the normal distribution of mean 0 and standard deviation 1, and plug that value into the usual error function that I’ve linked above. The error function is tabulated in books, and most scientific and graphing calculators can do it from scratch.
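Here is roughly what that recipe looks like in Python, as a sketch rather than anything official. The function name `percentile_rank` is made up for illustration, and the default mean and standard deviation are the assumed values from this post. It relies on the standard library's `math.erf`, which is normalized so that the area below a standardized value z works out to ½(1 + erf(z/√2)); that is where the extra factors come from.

```python
import math

def percentile_rank(gpa, mean=2.8, std_dev=0.5):
    """Fraction of a normal distribution lying below gpa.

    Step 1: change variables to the standard normal (mean 0, std dev 1).
    Step 2: plug the standardized value into the error function.
    """
    z = (gpa - mean) / std_dev                         # standardized score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # area below z

print(f"{percentile_rank(3.5):.3f}")  # 0.919
print(f"{percentile_rank(2.8):.3f}")  # 0.500
```

For a 3.5 GPA the standardized value is (3.5 − 2.8)/0.5 = 1.4, which is the number you would look up if you were using a printed table instead.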
It sounds a little involved, but compared to the rules of civil procedure I imagine it’s pretty much trivial.