NRC: the rankings

The NRC rankings are out.
Penn State Astronomy is ranked #3 - behind Princeton and Caltech.
W00t!

PSU doing the mostest with the leastest.

The National Research Council's 2010 Data-Based Assessment of Research-Doctorate Programs is out, reporting on the 2005 state of the programs.

The full data set is here

EDIT: PhDs.org has a fast rank generator by field.
Click on the first option (NRC quality) to get the R-rankings, the next button ("Research Productivity") to get the S-rankings, or assign your own weights to get a custom ranking.

Astronomy S-Rankings:

  1. Princeton
  2. Caltech
  3. Penn State
  4. Berkeley
  5. Chicago
  6. Washington
  7. Ohio State
  8. UCSC
  9. Columbia
  10. Harvard
  11. MIT
  12. Arizona
  13. Cornell
  14. Wisconsin
  15. Johns Hopkins
  16. Texas Austin
  17. Virginia
  18. Michigan State
  19. Michigan
  20. New Mexico State
  21. UCLA
  22. Arizona (Planetary)
  23. Boston
  24. Maryland
  25. Yale
  26. Colorado
  27. Hawaii
  28. Minnesota
  29. Indiana
  30. UCLA (Planetary)
  31. Illinois
  32. Florida
  33. Georgia State

This is sorted on the 5th percentile S-ranking, straight from the table, and then double sorted on the 95th percentile.
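
For the curious, here is a minimal Python sketch of that double sort. The numbers are made-up placeholders rather than the actual NRC percentile values, and it assumes you have already pulled each program's 5th and 95th percentile rank columns out of the spreadsheet:

```python
# Sort programs by their 5th percentile (most optimistic) rank,
# breaking ties with the 95th percentile (most pessimistic) rank.
# This stacks the confidence intervals from best to worst.
# The numbers below are illustrative placeholders, not NRC data.
programs = [
    # (name, 5th percentile rank, 95th percentile rank)
    ("Princeton",  1,  3),
    ("Caltech",    1,  4),
    ("Penn State", 2,  8),
    ("Berkeley",   2, 10),
]

ranked = sorted(programs, key=lambda p: (p[1], p[2]))

for i, (name, lo, hi) in enumerate(ranked, start=1):
    print(f"{i:2d}. {name}  (rank range {lo}-{hi})")
```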

Astronomy R-Rankings:

  1. Berkeley
  2. Caltech
  3. MIT
  4. Princeton
  5. Texas Austin
  6. Johns Hopkins
  7. Harvard
  8. Colorado
  9. Arizona
  10. Penn State
  11. Chicago
  12. Cornell
  13. Arizona (Planetary)
  14. Maryland
  15. UCLA
  16. Washington
  17. UCSC
  18. Virginia
  19. Wisconsin
  20. Ohio State
  21. Columbia
  22. Indiana
  23. Minnesota
  24. Hawaii
  25. Illinois
  26. Michigan
  27. Boston
  28. New Mexico State
  29. Yale
  30. UCLA (Planetary)
  31. Michigan State
  32. Florida
  33. Georgia State

Same with R-rankings - sort on 5th percentile and then double sort on 95th percentile to stack the confidence intervals.

Comparing the R and S rankings, you can see the reputation factor, if you look with an experienced and jaundiced eye.
Clearly the S-Rankings are much better and more objective...

Penn State Astronomy and Astrophysics Graduate Program Press Release

So, yes, I like the methodology.

How did Penn State end up 3rd?
Well, only a few of the 20 metrics weigh significantly in the S-Ranking: namely productivity, citations and funding.

We ranked #1 in publications per faculty, by a comfortable margin.
We were 6th in the citation rate, and 8th in number of faculty with grants.

(I actually think an error overstates our grant percentage - it is shown as 99.3% and clearly should be 93.75% or 95.3%, depending on who you count (yes, I ran into the "who is counted as faculty" ambiguity) - we don't have 172 faculty... - so there are 6 institutions with 100% grants per faculty and we should be in the next bunch, not in between - this did not substantially shift our rankings, though it might flip us and Berkeley in the S-Rankings.)

In Astronomy, essentially all the weight in the rankings is on research activity; the student aspects and diversity components have negligible statistical weight.

So we win on research productivity.
And, we do so going up against institutions that are much better funded than we are.

I haven't got the oomph to autogenerate the full rankings, so here are the physics highlights:

Physics S-Rankings:

  1. Harvard
  2. Princeton
  3. Berkeley
  4. MIT
  5. UCSB
  6. Harvard
  7. Hawaii
  8. Penn State
  9. Caltech
  10. Chicago
  11. U Penn
  12. Columbia
  13. Boston
  14. Yale
  15. Cornell
  16. Stanford
  17. UC Irvine
  18. Tulane
  19. Caltech
  20. Colorado
  21. CMU
  22. Michigan
  23. UCSC
  24. Rochester
  25. Rice
  26. SUNY Stony Brook
  27. Michigan State
  28. Illinois...

The double counting comes from physics and applied physics programs being counted separately.

Physics R-Rankings:

  1. Harvard
  2. Berkeley
  3. MIT
  4. Illinois
  5. Caltech
  6. Princeton
  7. UCSB
  8. Chicago
  9. Texas Austin
  10. Maryland
  11. Oklahoma
  12. Harvard
  13. Cornell
  14. Stanford
  15. Colorado
  16. UCLA
  17. Rochester
  18. Columbia
  19. Penn State
  20. Boston
  21. SUNY Stony Brook
  22. Wisconsin...

Again, reputation rankings show the lag.

The 2009 Methodology Report

The Questionnaire Used for Faculty (NRC link)

NRC-Program_Quality_Questionnaire (pdf)

All the different Questionnaires used in the study (NRC link)

Where are the rest of the Astro/Physics Ratings? I can't get more than ~6 MB of the data table.

The table is monolithic - 35 MB - completely insane.
It puffs up to 1.4 GB when loaded in xls format.

It has taken me a while to convert it to Numbers format and trim out the auxiliary data.
I now have astro and physics cut out and am trying to make sense of the overall rankings.
Excel really sucks.

The website graduate-school.phds.org has a useful and interesting form that lets you look at the rankings.

You can slice and dice on a number of criteria. So, for example, based on ONLY student outcomes, UW is the best and PSU is second. Whereas ONLY research output puts PSU 3rd (after Princeton and a technical school in Pasadena) but UW is seventh. So you can see what actually went into the different final rankings.
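
Under the hood, a custom ranking like that is just a weighted sum of per-criterion scores. Here is a minimal sketch of the idea; the criteria, scores and weights are invented for illustration and are not the phds.org or NRC numbers:

```python
# Combine per-criterion scores into one custom ranking via user-chosen weights.
# The criteria, scores and weights are invented placeholders, not NRC data.
scores = {
    #             research, student outcomes, diversity
    "PSU":        (0.95, 0.80, 0.50),
    "UW":         (0.85, 0.90, 0.60),
    "Princeton":  (1.00, 0.70, 0.40),
}

def custom_rank(scores, weights):
    """Return programs sorted by the weighted mean of their criterion scores."""
    total = sum(weights)
    composite = {
        name: sum(w * s for w, s in zip(weights, vals)) / total
        for name, vals in scores.items()
    }
    return sorted(composite.items(), key=lambda kv: kv[1], reverse=True)

# All the weight on student outcomes vs. all the weight on research output:
print(custom_rank(scores, (0.0, 1.0, 0.0)))  # UW first, PSU second
print(custom_rank(scores, (1.0, 0.0, 0.0)))  # Princeton first, then PSU
```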

Amusingly, my PhD institution seems to be in about the same place it was in 1993. Benign neglect of graduate students seems to be a winning strategy.

That link to the Chronicle sounds interesting, but does not work just now.

I think this whole exercise is a lot of wasted effort. Many of the rankings are nonsensical, and are clearly a product of arbitrary weighting schemes convolved with inconsistent definitions of who counts as real faculty. The difference between the R and S rankings clearly shows that what people think is not what they think they think when asked to assign weights to each category. I personally think that Penn State has an excellent program, but my opinion is independent of this exercise in statistical masturbation.

Dood,
It is my job to brag about this.
So I will.

And I am so not going to dis Jerry O. on the methodology...

Seriously, I spent a frightening amount of my time digging into the methodology and, as statistical masturbations go, it is very robust.
The outcome is that papers published, citations and funding are what counts, and that people's opinions lag changes in programs.
Not very controversial, for the physical sciences.

The other data is valuable, but irrelevant to overall perceived program "quality" - which doesn't mean those factors are irrelevant, just that having a student lounge doesn't affect the academic quality of the program.
Either way.

I don't understand the metrics for some other fields, but then I haven't spent the time to dig into them.
Maybe they have no measure.

If Penn State is so good, why hasn't it produced a SINGLE Hubble Fellow over the last 20 years???

@Bacon - ok, I don't get what UW is complaining about.
I just used the Chronicle tool to compare UW and CMU, which has good comp sci - they are head-to-head, similar publication rate, funding etc.

Yeah, citations are incomplete, because the NRC used the Thomson citation base - guess they got a deal, or wanted a homogeneous citation base - and they note it doesn't cover all disciplines.

You're 24th in R and 36th in S if I read it right.

The Chronicle has a disgruntled comment that U Delaware mathematics are ranked higher than Chicago, which can't be right...
But, looking at them - Delaware has a small, but active applied math department, where publication rates are higher and people get grants, whereas Chicago is very classical math.
Maybe Chicago faculty write more important papers, but Delaware has higher citation rates...

puzzled -- Penn State has a huge research enterprise going in x-ray astronomy and instrumentation. They've produced a strong number of Chandra and Einstein Fellows as a result.

@puzzled - good question.
We have produced Chandra Fellows, and we don't have a big presence in optical extragalactic.

One can also contemplate the history of "new rules for how Hubble Fellows are assigned this year" to get some insight into possible confounding factors in those awards...

Several of our faculty are former Hubble Fellows.

I think I partially count as a Hubble Fellow from Penn State! I started grad school at Penn State, but finished at Stanford because my advisor moved during my 3rd year. Actually I probably wouldn't have been accepted to Stanford! Penn State was the school that gave me the opportunity to start research of my liking, and I credit the PSU department for my journey to become a Hubble fellow.

@puzzled
Note that, in addition to the Chandra and Einstein Fellows already mentioned, we also have at least one Spitzer Fellow, a Royal Society Research Fellow, an STFC Fellow, an A.J. Cannon winner, etc., etc.

By Niel Brandt on 28 Sep 2010

Sorry I messed up the Chronicle's address. It's http://chronicle.com/page/NRC-Rankings/321/

Not only is the citation data for CS messed up, but it appears they also used journal articles for publications per faculty. CS doesn't use this metric; they use conference publications. UW is listed as having a faculty of size 91 in 2006, which is off by a factor of 2... I think that is what they are complaining about. (Compare US Snooze and No Reports grad student rankings and you'll see why the gnashing of teeth.)

But enough about CS. For physics, the fun things I found are the 70 percent externally funded grad students at UC Irvine, and the 10 years to get a Ph.D. at Catholic University of America, DC. A Ph.D. taking as long as the gap between NRC reports is... awesome?

Hm, I wonder if UCI is counting Cal resident tuition waivers as external funding or something.

Record for time to PhD is apparently 13 years at Washington University St Louis, in Music.

Is the UW publications per faculty twice as high as reported?
I.e., did the faculty number propagate into the other sections?

It is interesting to see other fields react - Leiter is in an uproar over "faculty quality" - which carries low weight in our field, and even then the weight is higher than I expected - this is as measured by awards and reputational stuff, not publication/citation/funding.

It is interesting to see the manifestations of the old collegialism - academia is so amusingly clubby sometimes.

I am puzzled by your last comment. You wrote, above, that "papers published, citations and funding are what counts," and I assume you mean that they count in telling you something meaningful about how good the faculty is. The concern in philosophy was that we had no adequate qualitative measure. There was no citation measure used, and per capita publications are absolutely meaningless in philosophy without a qualitative control, and there was none: crap published anywhere, in quantity, counted. (Some of the most influential philosophers--as measured, e.g., by citations--publish very little.) So that means in philosophy there was only one qualitative measure, awards & grants (the latter being few and far between in the humanities), which are interesting but wholly inadequate for the reasons I noted. You might take a look at how we do the PGR surveys: 300 philosophers evaluate current faculty listings, and rank faculties overall and in three dozen specialties. In essence, with its R-Ranking, the NRC tried to do that, but with only 47 faculty evaluating each program.

I appreciate your reasons for not wanting to diss your fellow astrophysicist, and it may well be there is some value to this data for the physical sciences. But the R- and S-Rankings in philosophy are quite obviously silly, and I suspect the same is true in other humanities fields.

Bragging is fine - Penn State has a great astronomy department. But this sort of ranking is an exercise so nearly devoid of meaning as to be nearly useless. Perhaps even dangerous, as university administrators might invest their resources accordingly.

Robust statistical masturbation is merely vigorous, not virtuous. I do not doubt the good intentions of those who did the analysis, but astronomy (like physics) has become too diverse for this to be meaningful, as you imply in your preceding posts. Compare X-ray to planetary astronomy. Citation rates are way different. Does that mean one is intrinsically more important? NASA ought to seriously realign its budget if so, and not along the lines suggested by the latest decadal survey.

I would say this is a matter of comparing apples and oranges, but really it is more like averaging oranges and cats. You can call the 5th percentile the head and the 95th percentile the tail, but neither end purrs and both smell funny.

Brian -- The failure modes that plague Philosophy are much less perilous in astronomy or astrophysics. The vast majority of publications by US astronomers are in one of two high quality journals. There are a few smaller journals, or larger international journals, that are used as well, but these too are of high quality. In addition, grants are typical rather than rare.

The obvious failure mode I see is the one that Stacy brings up -- the difference in citation rates as a function of subfield. In astronomy, you can write a pretty silly paper on dark matter (as I have done!), and it'll get a hundred citations. You can likewise do superb work in an important subfield, but will not garner more than a respectable few dozen citations, solely because the size of the community is small. Thus, the citation rate metrics will be dragged around somewhat by the relative numbers of faculty in different subfields.

Glad to see PSU is finally getting the recognition it deserves-- hopefully this will see the grad enrollment increase (I've been successfully recruiting for PSU). Also happy about Berkeley, although I can't objectively compare to other top departments.

By Therese Jones on 28 Sep 2010

These things have a lot of silliness in them. I remember a few years ago when Anne Kinney published that paper ranking depts by the normalized h-index. CWRU was too small to be included in the depts that were analyzed, so I did the calculation myself using her metrics applied to our department. We would have come out #1, above Caltech, Princeton, etc. Now while I think we are doing good stuff over here, even I, if pushed, would begrudgingly admit that ranking us ahead of Caltech and Princeton might be a bit of a stretch. So... yeah, as Stacy says, there is a good bit of delusion behind these kind of numbers, and it gets worrisome when people start taking them too seriously.

Oops, after re-reading what I wrote, it sounds overly harsh on you and others who are (rightfully) pleased with your good recognition. I certainly don't mean that I think any of the institutions listed are undeserving of the recognition. All those institutions will give a strong graduate experience. I just meant that overanalysing the metrics and the relative standings is an exercise that can generate significant heat but limited light....

I basically agree with Stacy. I think these numbers are like physics GRE scores - they have some gross overall correlation to something like quality or ability, but they're sort of useless numerically for actually predicting what one cares about. I knew a professor who thought that the only objective way to admit graduate students was to rank order them by physics GRE and admit the top N, which is idiotic and shows the problem with producing a table of numbers - people will use it well beyond its realm of applicability.

Honestly, if Dean X can't figure out whether a department is doing well from the actual record of a department's publications, external funding, and student placements, how is this number going to add anything? Is it not just a crutch for people who can't be bothered to read the actual record?

Then you have weird idiosyncrasies, like some physics and astronomy departments are in the list and some aren't. I can think of one large astronomy program that is not in the list and I don't know any reason why.

@Leiter - your argument is contradictory.
Awards, in my experience, are either proxy measures of productivity - they go to people who have either done a lot or done something with high impact; or, they are clubby pats on the back and measure social networking skills.
Generally, though, they are a lagging indicator. You want some faculty who will get the awards in the future, not those who got them for past work.

The NRC rankings do measure citation impact; it is a major factor in the astro rankings. The problem in a lot of the humanities, I gather, is that the citations are not consistently measured and the NRC declined to use partial or inadequate citation metrics. In astro, citations are autogenerated on the fly, and are nearly complete.
The problem here is not the NRC's.

Finally, the whole point of statistical sampling is the realization that you can in fact get good estimates of complete rankings, like the PGR, with a smaller sample, as long as biasing errors are minimized. That is the whole point of the exercise.

The NRC exercise also looks for the surprises: are there biases in the reputation rankings (the answer seems to be yes, but nothing startling); are there factors we are not properly accounting for (annoyingly, apparently not - the boring minimal obvious metrics are what counts); and, what did we not know about ourselves?

Now there are potential metrics not yet measured, mostly because of lack of time resolution - e.g. the future careers of graduates must matter, but take a decade or more to discover. That contributes to the lagging indicator bias: we don't much care how past graduates did, but how future graduates might do.
Alternatively, does faculty turnover affect program quality? Yes, buying in big names boosts reputation, but does it reduce performance due to disruption to work and reshuffling of resources? I don't know, the argument is plausible either way, but the point is that it can be measured, in principle.

@stacy, hos and ben - wow. you doods are totally harshing my mellow here...

Seriously - the NRC process has flaws and omissions, but it was a good faith effort and the raw data is valuable.
Yes, citation statistics are biased by subfield - I know that, I mostly publish in small subfields - and raw numbers of papers are not everything (money is, or so I hear from above...)

But: 1) you know some programs are better than others, for some things at least, because when a student asks we will guide them to different programs based on their strengths - both the program's and the student's academic strengths.
Attempting to measure this is not a bad thing: it can help us rethink our own lagging reputation biases, it can help us realise our own and others' strengths, and it can guide us to what works and what does not.
I'm a theorist; some data is better than none, usually.

2) we did well. Better than we expected. I knew that, which is why I put the effort in to build up the NRC rankings outcome. That is my job.
Others did less well than they hoped or expected, so they rationalise and de-emphasise and point out the flaws.
That is cool, so we push back both ways.
But, we will still ruthlessly sell our new high ranking, because it has some truth to it - we think we are better than our reputation, we really are a productive department and we are going to push that out there.

3) Most Deans know which departments are doing well and which are sinking. They see, or ought to see, the various periodic faculty performance reports and internal metrics.
Hell, they see the grant flow and net overhead income.
(Though never underestimate the capacity of some Deans for self-delusion or reputation bias lag - the senior faculty are generally their friends).
But, and I use that word a lot, there are going to be some nasty harsh decisions on closure coming down the pipe, because something has to give and the universities are out of resources.
A low NRC rank will be used as a shield by Deans and Provosts rationalising cuts - you did badly, justify yourself or die. PhD programs are expensive, and they have been multiplying, some are going to be pruned, and external validation is just what the administration needs.

oh i don't mean to harsh your mellow, don't be mad at me. you've got a good program -- nobody's saying otherwise. the only thing i'd really disagree with in your last comment is the idea that those questioning the ranking system are motivated by a bad showing. in our case, we don't show up simply because we are too small to be included, not because we scored badly. remember, good people can disagree without having ulterior motives.....

@hos - dood, I'd never get mad at you, except maybe at Hallowe'en or thereabouts... and this is just childish pique.

I will push back, though, with thinly disguised jokes.

There are a lot of astro programs not rated because of size; that is an arbitrary NRC choice. To answer your previous question - take publications per faculty, citations per paper and the percentage of faculty who are PIs on external grants.
That will get you the ranking. I think the pubs are refereed only and the citations are undiluted, but I would have to check the methodology handbook.
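
For illustration only, here is one way those three numbers could be rolled into a composite ranking. This is my own sketch, not the NRC's actual procedure (which weights standardized metrics using survey-derived coefficients); the values and the equal weights are placeholders:

```python
# Rough sketch: rank programs on publications per faculty, citations per
# paper, and the fraction of faculty holding external grants, by summing
# standardized (z-scored) metrics. Values and equal weights are placeholders;
# the real S-rankings use survey-derived weights, not an equal-weight sum.
from statistics import mean, stdev

metrics = {
    #             pubs/faculty, cites/paper, grant fraction
    "Penn State": (9.1, 30.0, 0.95),
    "Princeton":  (8.0, 45.0, 1.00),
    "Caltech":    (8.5, 40.0, 1.00),
    "Berkeley":   (7.5, 38.0, 0.97),
}

def zscores(values):
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

names = list(metrics)
columns = list(zip(*metrics.values()))            # one tuple per metric
standardized = list(zip(*map(zscores, columns)))  # one tuple per program

weights = (1.0, 1.0, 1.0)  # placeholder equal weights
composite = {
    name: sum(w * z for w, z in zip(weights, zs))
    for name, zs in zip(names, standardized)
}

for rank, (name, score) in enumerate(
        sorted(composite.items(), key=lambda kv: kv[1], reverse=True), start=1):
    print(f"{rank}. {name}  (composite z-score {score:+.2f})")
```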

There is a confirmation bias in attitudes to the NRC rankings, in that there is a disincentive to knock them down if you did well and an incentive to speak out if you disapprove on principle and also happened to have "your" program do badly.
The program you are part of now, the program you were part of, etc. - it is an insidious bias.

Stacy of course argues on principles, because he likes to ;-)

Have to say, as someone who got their PhD at Penn State in '05, this makes me happy. I did have a quick question: how was grad-student-led funding counted? I'm thinking of the HST program we had that you were financial PI on. Did that count towards or against such metrics?

Which is a primary reason for "raising awareness" of the ranking - get the "ah, yes PSU, they were really highly ranked" buzz when someone looks at our grads.

No idea how multi-PI grants were handled - the metric that popped out was the "has grant/has no grant" percentage.
Too many metrics to have pondered all of them yet, and the methodology requires digging in; it is all documented somewhere in the 200+ pages and 35 MB of spreadsheet.

The only real world situation where such a ranking might come into play is in the competition to woo prospective grad students. In that sense, the R-rankings correspond roughly to my subjective sense of the competition. A student accepted into one of the programs ranked 1-7 will likely choose them over anyone else. Programs 8-21 will have a reasonably even competition for a prospective's attention. Programs 22+ would have a tougher time. Of course, one would hope a prospective would do some real homework, rather than just use this list. But it does probably represent the current state of play.

Brad --

The problem is that the "reputationally weighted" R-rankings are often based on very thin information. I guarantee that the survey includes many people who have never actually had any substantive interaction with the majority of the programs they're ranking (i.e., esteemed Ivy League pundits who never visit midwestern state schools, or excellent scientists who simply don't have the inclination or opportunity to travel extensively). We may "know" that program X "is good", but if we've never spent time in residence, or hired their graduates, or sent our graduates there, it's hard to know what goes on day-to-day.

As a result, I'm actually finding that the S-rankings are better at capturing my subjective rankings, for the subset of programs I know well. For example, I've been incredibly impressed with what they're doing at OSU, largely as a result of visiting there, and talking with my former undergrads who've gone there, and the graduates of their program. Their S-ranking reflects this (7th overall, and 4th among public institutions), whereas their R-ranking has them at a mind-boggling 20th. Yes, there's the effect of Steinn's "lagging indicator" bias, but OSU has been doing great things on the faculty and student level for at least a decade. I think it's just that some large fraction of the community hasn't paid enough attention yet, and having some ranking based on quantifiable metrics at least calls some well deserved attention to them, which will maybe get them some reputational credit down the road for work they've been doing all along. I'm sure Steinn feels the same.

In light of the Steinn vs Everyone Else discussion, I think it comes down to the fact that it probably does indeed mean something when you do well across a number of the metrics used by the NRC. However, it doesn't mean all that much when an institution doesn't wind up high. To use Ben's GRE analogy, scoring 800 on the Physics GRE is a pretty good sign that an applicant has a good grasp of some basic physics problem solving. Scoring 650, however, does not necessarily demonstrate that a student _doesn't_ have a decent grasp of the same.

My point was more about relevance. Perhaps the S-rankings agree with your perceptions, but I see no real-world impact of these rankings, apart from the aforementioned competition for grad students. In that arena, the R-rankings represent the true nature of the competition at present (and OSU being 20 might as well be 10 - as I said, we have similar win/loss records against everyone in that range).

Steinn, I didn't mean to harsh your mellow either, I think it's great your program is highly ranked and it probably does reflect some aspect of quality. I have no ax to grind on behalf of my current department as it did okay and this is above my pay grade anyway.

I agree with JD that, imo, there is bogosity in reputational rankings, and the R-rankings appear to show this as well. There is one program very high in the R-rankings at a name school that I simply don't think is a top 10 astro program (I won't name it; I don't need any more enemies).

What bothers me are two issues where I'll pick on quotes from Steinn, although not meaning to single you out.

"some data is better than none, usually" - But I think bad easily summarized data drives out good ambiguous data, or disguises the truth when something just can't be measured accurately. This is another reason there is so much cr-p in the journals, as well. The report appears to try to compensate for this by giving broad confidence intervals, but I suspect its readers are going to ignore those, even if the difference between 10th and 25th is not statistically significant.

"A low NRC rank will be used as a shield by Deans and Provosts rationalising cuts" - That's a lot of the problem, Deans and Provosts should be able to justify the cuts with reference to basic data or facts, rather than using the ranks as a crutch. Astronomy is in an especially strange position because having a department already makes you above the median of astronomy programs, since many are subsumed into physics departments and some of those are tokens. In the end I suspect programs that are buddies with the Provost are still in good shape and programs that aren't are not.

Brad -- If people actually use the http://graduate-school.phds.org/ site, the choice of one ranked list won't matter as much. It's actually nice, in that you can weight different factors according to their importance to you. The Chronicle's visualization tool is also pretty informative, though I doubt many grads will use it.

However, I still think that most prospective grad students apply to where their advisors suggest. If those advisors are influenced by the NRC rankings, then maybe the mix of schools applied to might change. The question is then whether the pool of advisors is more influenced by S- or R- rankings. Oy.

@jd: of course, the most important ranking of all is the F-ranking, ie, how well a school's football team is performing. Particularly as quantified in the Big 10 standings. While I will admit that Ohio State currently is the favorite in this category (blegh), I'm sure we can all agree on a beautiful new world where Ohio State rapidly sinks to the bottom (in the F-ranking, I mean....).

I'm wondering what fraction of the committee has actually watched a football game in the last 10 years. Wagner, yes. Big 10, no.

Aha!
Down to essentials.

So can we agree:

Princeton invented the game.

Caltech, at a key time in history, actually had the team with the best record. I kid you not.

Big-10 and Pac-10 are much better than their reputation index indicates, and we'll have the courtesy to not comment on the Big-12 or SEC. (Over.Rated)

We all quietly agree not to discuss Boise or other li'l ones, or those nice but strange places that only play Ultimate or some such.

And, do not forget!

PSU was #3 in 2005!

We can now return to arguing about Exoplanets and whether Gl581g is real.

"I'm wondering what fraction of the committee has actually watched a football game in the last 10 years. Wagner, yes. Big 10, no."

Yet another reason why these committees and rankings are always biased against state universities.

Have you noticed that everyone offering an opinion here is at a state uni, except for Mihos and Stacy? The Ivy Leaguers don't have to dirty their hands with such things. They're probably using the Chronicle of Higher Education article as a coaster while their grad students bring them tea and coffee-cakes.

Just kidding, Ivy Leaguers! I wuv you and your coffee-cakes!

should be "except for Mihos," sorry Stacy!

This is not an active discussion anymore, but I found my way here from a recent Cosmic Variance posting. I wanted to clarify for the record (with regard to comments 16/17), that 70% of UCI physics grad students are indeed externally supported, from a very generous GAANN grant. This does not reflect the Cal tuition waivers.

I am highly biased, but I do feel that the S-rankings are showing a strong program at UCI which the R-rankings completely miss. Of course, I agree with the general comments that these numbers need to be interpreted with great care.

By UCI Physics Prof on 12 Apr 2011