Your H-score

Interesting conversation at lunch today: topic was academic performance metrics and of course the dreaded citation index came up, with all its variants, flaws and systematics.

However, my attention was drawn to a citation metric which, on brief analysis and testing, seems annoyingly reliable and robust: the H-score.

The H-score takes all your papers, ranked by citation count; you then take the largest k such that the kth-ranked paper has at least k citations.

So, you start off with a H-score of zero.

If your 5th highest cited paper has 5 citations but your 6th highest cited paper has 4 citations, then your H=5.
If your 10th highest cited paper has 11 citations, but your 11th highest cited paper has 9 citations, then your H=10.
And so on. High H is better.
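(To make the definition concrete, here is a minimal sketch, in Python, of computing it from a list of per-paper citation counts; the function name and the example numbers are just illustrative.)

```python
def h_index(citations):
    """Largest k such that the k-th most-cited paper has at least k citations."""
    ranked = sorted(citations, reverse=True)   # rank papers, most-cited first
    h = 0
    for k, c in enumerate(ranked, start=1):
        if c >= k:
            h = k          # the k-th ranked paper still has at least k citations
        else:
            break          # once the count drops below the rank, it stays below
    return h

# The two cases above: 5th paper has 5 citations, 6th has 4 -> H = 5;
# 10th paper has 11 citations, 11th has 9 -> H = 10.
print(h_index([30, 22, 17, 9, 5, 4, 2]))                          # 5
print(h_index([40, 33, 28, 25, 20, 18, 15, 14, 12, 11, 9, 3]))    # 10
```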

As you publish, your H-score should grow, but it becomes harder and harder to raise. In particular, having one highly cited paper doesn't help much; you need many well-cited papers to get a good H-score. Publishing a lot of poorly cited papers doesn't help either.

Crudely, the total number of citations increases as H², with the normalization depending on the size of your sub-field (to get a lot of citations there must be a lot of papers published in your field!).
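(For what it's worth, Hirsch's paper makes this quantitative: total citations ≈ a × H², with a typically somewhere around 3 to 5 depending on the field. A toy illustration of what that implies; the coefficient range is the one quoted in his paper, not anything I have measured:)

```python
# Implied citation totals from the empirical relation N_total ~ a * h^2, with a ~ 3-5.
for h in (20, 30, 50, 100):
    low, high = 3 * h ** 2, 5 * h ** 2
    print(f"h = {h:3d}  ->  roughly {low:,} to {high:,} total citations")
```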

I was told that a mid-career theorist in my field typically has an H-score of 20-30, and sure enough I do, as do those of my near contemporaries that I quickly sampled. In fact, a sample of half a dozen recently tenured faculty theorists had remarkably tightly clumped H-scores.

A small sample of senior hot-shot faculty (full professors, well past tenure, at top-ranked research universities) showed H-scores typically up around 50-60. Breaking 30 is a big deal; it seems to "get the ball rolling", where you have so many well-cited papers that everyone starts citing you, because everyone else is citing you.

The highest H-score I found, searching sparsely on people I knew had phenomenal publication records, so not a complete sample, was 88.
Haven't seen anyone in astro with H over 100, although I am sure someone in particle physics or some bio-subfield has done it.

Oh, and word is that people are using this metric to look at hires, faculty research progress, promotions etc. Certainly some people are...

This is an interesting issue - everyone agrees that citation impact is a very important metric.
Having highly cited papers is important, having lots of papers is important, and having many total citations is important.
But citations can be gamed: there is of course self-citation, which is easily filtered; there is citation-by-progeny - researchers who produce many students or postdocs tend to build up a loyal citation base. There is also the issue of normalized citations - total citations divided by the number of authors - which is probably over-harsh, but a lot of the most heavily cited papers are from large teams or surveys with large co-author groups.
There are definitely citation circles out there, mostly informal or spontaneous; and there are anti-citation groups - authors who refuse to cite relevant papers by competitors for a variety of interesting reasons.

Because of all this, most citation measures are very frustrating; but of the ones I have heard of to date, the H-score at a quick glance seems the most robust and interesting.

PS: I hear the concept was invented by a (physical?) scientist at a UC, possibly UCSB.
Someone tip me to who it was if they know so I can cite them!
Google didn't help; there is some sort of protein match test which has an "H-score" that swamps any search pattern I've tried.

Ah, here we go: from Lattice QCD in the comments:
"The originator of the h-index is Jorge E. Hirsch of UCSD, a condensed matter theorist. The paper in which he proposed the h-index is arXiv:physics/0508025"

The Hirsch index.

Ed Witten has H=110, Heeger has H=107

Just to give a sense of what is involved: in my field, a paper by Frazer Pearce a few years ago estimated that about one published paper in 100 has 100 citations in the first five years after publication.


The originator of the h-index is Jorge E. Hirsch of UCSD, a condensed matter theorist. The paper in which he proposed the h-index is arXiv:physics/0508025 (anti-spam won't let me link to it).

You may wish to take a look at arXiv:physics/0608183 too, by Casey W. Miller. Title: "Superiority of the $h$-index over the Impact Factor for Physics"

By dileffante (not verified) on 26 Oct 2006 #permalink

Check out http://en.wikipedia.org/wiki/Bibliometrics and scroll down to the h-index: "The h-index is a number suggested by Jorge E. Hirsch in 2005 for the quantification of scientific output of individual scientific authors."

By WiseWoman (not verified) on 26 Oct 2006 #permalink

Yeah, someone suggested this as a useful index for hiring purposes around here, so I did a quick calculation of the H-index for each faculty member of our department. Basically it correlates with age. Except for one or two underperformers, everyone lay close to the correlation, whether they be middle-of-the-road or in the National Academy. My conclusion was that the H-index is utterly pointless for hiring purposes. The only people who make a significant deviation are blindingly obvious people to hire - the trick is not selecting those people, but getting them to choose you.

High astro H-scores - Sandage=99, Gunn=94

For an active researcher, the H-score (or h-index) tends to grow linearly with time; if it stops growing, that is an indication that research is tapering off (overgeneralising here, of course; a rough formulaic version is sketched below).
So H-score comparisons need to be made on a peer basis. Don't know how it compares with "most cited" or "total cites" for picking out young researchers; there will be stochastic issues with any metric for people who do not have a long track record.
Any such metric makes me a little bit uneasy; it is so easy to find an exception - the person who does poorly by the metric but is agreed to be good or influential, and the person who scores highly but nobody really cares...
Problem is, committees and administrators like quantitative "objective" measures, so it'd be better to have robust measures than not.
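(To put the "grows linearly" bit in formula form: Hirsch, if I recall his paper correctly, parameterizes careers as h ≈ m × n, where n is years since the first paper and m is a personal slope, with m ~ 1 for a solidly successful scientist and m ~ 2 or more for the outstanding ones. A toy sketch only; the thresholds are my recollection, not gospel:)

```python
def hirsch_m(h, years_since_first_paper):
    """Hirsch's slope parameter m = h / n (n = years of publishing)."""
    return h / years_since_first_paper

# e.g. a mid-career theorist with h = 25 after 15 years of publishing:
print(round(hirsch_m(25, 15), 2))   # ~1.67
```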

I always thought that the H-index was particularly susceptible to self-citation. Anyone who writes N papers and cites all previous papers in each new one will have an H-index of approximately N/2, or exactly
floor(N/2) = [2N + (-1)^N - 1]/4,
without any external citations. Taking out self-citations would greatly improve it.
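(A quick numerical check of that claim, under the all-self-citation scenario described above; purely illustrative:)

```python
def h_index(citations):
    # largest k such that the k-th most-cited paper has at least k citations
    ranked = sorted(citations, reverse=True)
    return max([k for k, c in enumerate(ranked, 1) if c >= k], default=0)

# If each of N papers cites all earlier ones (and nothing else), paper i ends up
# with N - i citations, and the h-index comes out to floor(N/2).
for N in range(1, 9):
    cites = [N - i for i in range(1, N + 1)]    # N-1, N-2, ..., 0
    print(N, h_index(cites), N // 2)
```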

I have also wondered if there should be a way to cite negatively. A poorly written paper that slips by the referees and sparks multiple rebuttals, corrections, and so on shouldn't help an author. In hyperlinks, one can always use rel="nofollow" to prevent Google from enhancing the linked page's PageRank.

The H-index got quite a bit of play around here when the article first came out. But I have to agree with Brad that it doesn't add much if any information. Citation analysis of any sort in astrophysics is an awfully crude tool, with variations caused as much by size and culture of subfield as anything else. Dividing my own papers into those above my h-index and those below (which we are to interpret as relatively high and low "impact"), I find a bunch of review-style papers near the top that didn't add much to the field but were convenient for others to quote, and I find some of the most innovative and IMHO interesting work near the bottom -- including some demonstrably influential papers that have led directly to well-funded research projects by multiple groups.

I am frequently shocked in idle discussions in faculty meetings by just how seriously some very senior faculty take citation analysis. Fortunately, when push comes to shove in a hiring or promotion decision, people sit down and read the papers.

By astroprof (not verified) on 27 Oct 2006 #permalink

Some people sit down and read papers, particularly at some well administered places.
Other people count beans.

I agree with your point on what gets cited - it is catalogs, reviews and certain method or survey papers.
Many papers that actually trigger new research directions get only modest citations, particularly if they lead to long development efforts with the actual results more than five years down the line (since we all know that graduate students do not read papers more than five years old... ;-)

Anyway - there have been at least two occasions on which a senior colleague referred me to "this hot new paper in that field you said you were doing something in", not realising it was my paper; in one case with me as first author.
I only bothered to point this out the first time it happened...
Faculty always grow greener on the other side of the fence, or some such.

Some people sit down and read papers, particularly at some well administered places.
Other people count beans.

Well, the job of a good faculty includes training mediocre administrators in what sort of beans they should be counting. Raw citation counts and H-scores may sometimes be the first choice of the lazy, but are more likely to be the last refuge of someone trying to make sense of a file outside his or her field who isn't guided to something more useful by those who know better.

By astroprof (not verified) on 27 Oct 2006 #permalink

I am frequently shocked in idle discussions in faculty meetings by just how seriously some very senior faculty take citation analysis.

There are a number of statistics faculty take way too seriously.

I suspect that the root cause is that outsiders view that as a way of measuring the department's profile or quality. Almost certainly citation counts go into things like NRC rankings, and faculty are obsessed with NRC rankings. (Much as universities as a whole have become obsessed with US News & World Report rankings.) Indeed, the ideal faculty member isn't one who does good work, is a good teacher, and who mentors grad students well, but is one who looks to the world like he's doing all of that. Image is everything.

-Rob