NRC Graduate Program Rankings

Later this month, the National Research Council will, finally, release its long-awaited Data-Based Assessment of Research-Doctorate Programs.

Roughly every 10 years, the NRC publishes graduate program rankings, the last having come out in 1995...

The rankings will be released on Tuesday, September 28th, 2010.
Today the participating universities will be told how to get embargoed prior access to the data for their own institutions, so they can prepare for the public release; next week the actual embargoed data will be released to each institution.

4,838 graduate programs in 62 academic fields at 212 universities.

The rankings are much scrutinized, and will be used and cited for all sorts of nefarious purposes for the next decade or more.

Prospective faculty look at them for perceived departmental strengths, as do prospective students and postdocs. Not only are the rankings considered, but so are the changes in rankings, particularly as most who care know the rankings are a lagging indicator.

University administrators also look closely at the rankings, to publicize strengths, to scrutinize weaknesses, and to evaluate trends.
Right now, the rankings are more important than ever: universities are hurting and looking to cut, not trim, and graduate programs that are ranked low, or that fall sharply in the rankings, may face consolidation or elimination.

The rankings have, historically, had a major flaw: they are primarily reputational rankings.
This has always been a known flaw: the NRC would basically call up their senior colleagues and ask who they thought rocked.
This provides an acutely lagging indicator, and one with some slight propensity to non-objective analysis.

This time it will be different. This time an astrophysicist was in charge.
I sat through several presentations on the methodology, discussed parts of it with some people involved, and actually read most of the methodological description...

I will now attempt to summarize; any error or misinterpretation is mine. The final statistic to be presented has also been changed at least twice that I know of, due, I gather, to feedback from the participating universities.

The ranking, as before, is from individual academics rating programs in their field.
Each respondent is also asked to state how they weight a number of quantitatively assembled criteria, grouped into three different categories, according to what they consider important in a graduate program in their field.

Now two ranks are formed: the first is the reported weighted ranking from the respondents;
the second takes a random subset of the respondents and generates a synthetic ranking based on the respondents' actual reporting of what they think ought to be weighted in the ranking. This is then iterated, thousands of times, over different random subsamples of respondents for each academic field, generating a Monte Carlo sampled distribution of rankings as they would have been had respondents ranked programs according to how they claimed to weight the indicators.

Phew.

So now there are two statistics: the actual reported ranking, and the distribution of rankings Monte Carlo'd from the stated preferences.
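For concreteness, here is a minimal sketch of how such a Monte Carlo of stated-preference rankings might work. Everything in it (the indicator data, the number of programs and respondents, the subsampling scheme) is hypothetical and greatly simplified relative to whatever the NRC actually did:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical setup: n_prog programs in one field, measured on n_ind
# quantitative indicators (publications, citations, completion rates, ...),
# and n_resp respondents who each stated how they weight those indicators.
n_prog, n_ind, n_resp = 54, 5, 200
indicators = rng.normal(size=(n_prog, n_ind))            # standardized program data (fake)
stated_weights = rng.dirichlet(np.ones(n_ind), n_resp)   # each respondent's stated weights (fake)

n_trials = 5000
rank_samples = np.empty((n_trials, n_prog), dtype=int)

for t in range(n_trials):
    # draw a random subset of respondents and average their stated weights
    subset = rng.choice(n_resp, size=n_resp // 2, replace=False)
    w = stated_weights[subset].mean(axis=0)
    score = indicators @ w                    # synthetic program score under those weights
    order = np.argsort(-score)                # best score first
    ranks = np.empty(n_prog, dtype=int)
    ranks[order] = np.arange(1, n_prog + 1)   # rank 1 = best
    rank_samples[t] = ranks

# rank_samples[:, p] is now a Monte Carlo distribution of ranks for program p:
# the ranks it would have gotten had respondents ranked by their stated weights.
```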

This is, or was, combined into a weighted total ranking statistic, which is stable to perturbations in weights, in some formal sense.

What is then reported is the confidence interval of the rankings.

Now, last I heard, the statistic to be reported was the 50-percentile interval, so a program would not be ranked 17/54 but rather {13-24}/54.

I also heard that the universities wanted 90-percentile rankings.
Which is probably useless for most cases.
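To make the difference concrete: continuing the hypothetical sketch above, the reported interval for a given program would just be percentiles of its sampled rank distribution, and the 90-percentile interval necessarily contains the 50-percentile one, and is usually much wider:

```python
# Continuing from rank_samples and n_prog in the hypothetical sketch above
p = 0  # pick one program
lo25, hi75 = np.percentile(rank_samples[:, p], [25, 75])   # 50-percentile interval
lo5, hi95 = np.percentile(rank_samples[:, p], [5, 95])     # 90-percentile interval
print(f"50-percentile interval: {int(round(lo25))}-{int(round(hi75))} / {n_prog}")
print(f"90-percentile interval: {int(round(lo5))}-{int(round(hi95))} / {n_prog}")
```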

I have seen sample anonymized actual data, and the 50-percentile rankings seem to give sensible spreads: some unique top/bottom ranks, some reasonable-looking straddles.
Most physical scientists who have seen it, and whom I've heard from, apparently liked it.
The 90-percentile rankings I have not seen, but I infer that in many cases you can drive a truck through them; they may carry relatively little information.

I actually suggested, since they had run the numbers, that both statistics be published, as that'd be even more useful than either number alone...

Either way, this will be interesting.

So here is what I heard, anecdotally, third hand, as it were:

1) individual rankings by respondents correlate poorly with their self-reported assertions of what is important in a program - i.e. individual reputational ranking doesn't work because people don't actually look at hard numbers, yet the rankings seem to work in aggregate (one way such a correlation could be checked is sketched after this list)...

2) the rankings did not control for (or collect?) the BSc/BA or PhD institutional information for respondents (where they got their degrees) - so there will still be sampling bias, based on lagging production of PhDs that go into academia to be respondents decades later. You can estimate the sign of that bias for yourself.

3) the three categories, or "dimensions", contribute differently to the rankings, in a well quantified and apparently very robust way, and this will likely generate much controversy and discussion for a number of years

4) the people who ran this and set up the methodology really know what they are doing

5) programs that dropped sharply from 1995, or are in the bottom quartile in their field, had better spend the week they have thinking of good rationales for how and why

6) a lot of programs will claim that they are not now what they were in 2005 when the data was collected; in some cases this will be very true

7) the NRC is never, ever going to do this again, at least not until they lose institutional memory of how painful it was this time around

It is a classic Red Queen Race.
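On point 1, a crude way to check that kind of (mis)match, respondent by respondent, would be a rank correlation between the ranking each person actually reported and the ranking implied by their own stated weights. Again, everything below is hypothetical stand-in data, just to show the shape of the calculation:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_prog, n_ind, n_resp = 54, 5, 200
indicators = rng.normal(size=(n_prog, n_ind))             # fake standardized program data
stated_weights = rng.dirichlet(np.ones(n_ind), n_resp)    # fake stated weights
# stand-in for the reputational ranking each respondent actually reported
reported_ranks = np.array([rng.permutation(n_prog) + 1 for _ in range(n_resp)])

rhos = []
for r in range(n_resp):
    implied_score = indicators @ stated_weights[r]
    implied_rank = np.empty(n_prog, dtype=int)
    implied_rank[np.argsort(-implied_score)] = np.arange(1, n_prog + 1)
    rho, _ = spearmanr(reported_ranks[r], implied_rank)
    rhos.append(rho)

# a low median rho would correspond to the "correlate poorly" finding
print(f"median per-respondent Spearman rho: {np.median(rhos):.2f}")
```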

NRC
Assessment of Research Doctorate Programs: FAQ, who is ranked and methodology

Chronicle article on NRC rankings (subscription)

IHE on methodology and anecdotal grumbles


@Dave - individual institutions get their own data on Monday,
I expect to hear that afternoon, but I won't leak - they know where I live.
Now, if someone else wants to brag about (er, pass informed word around about) their program, I'll be happy to tell the world on their behalf. Or they can wait till the 28th.

@Lab - they do - that is where it gets interesting...
the reputational rank, vs the stated importance of objective data, vs the "how would you have ranked if you actually took note of your own stated weighting of the data"
What is also interesting is what does not correlate.

This can only end in tears....

My first few years at Vanderbilt, we were falling all over each other in angst over the fact that our Physics graduate program ranking wasn't higher than it was. Eventually we got over it, because it became obvious even to us that we were nowhere near what we'd been in 1995, so we should stop getting all worked up about it, and stop trying to figure out which of our colleagues had to be jettisoned in an attempt to raise the rankings.

There was a year when Vanderbilt hired 5 people. I hope that was before the 2005 assessment, or else, yeah, Vanderbilt's going to come out a lot lower than they really ought to. (Never mind the two very good astro types hired in 2007, the year I left.)