Despite some (well-founded) criticism as to its usefulness, the h-index seems to be with us to stay. In a couple of posts I’ve articulated some of its advantages and disadvantages – see for example “What’s the point of the h-index?” and “How does a scientist’s h-index change over time?” – and it’s clear that more and more funding agencies are using it to evaluate the track record of applicants. Just this afternoon I finished the second of a couple of grant reviews in which the applicant was asked to state their h-index. What they were not asked was which h-index they should state, i.e. the source of the value, though I think that this is important information. Why? Because it varies so much depending on where it comes from. I’ll give you an example – here are my own h-index values taken from a few different sources:
Google Scholar: h = 39
ResearchGate: h = 36
ResearchGate (excluding self citations): h = 34
Web of Science (all databases): h = 34
Web of Science (Core Collection): h = 29
Scopus: h = 29
There’s a 10-point difference (about 26% of the largest value) between the largest and the smallest values. So which one should I cite in grant applications, on my CV, etc.? Well, the largest one, obviously! Right? Well maybe, but not necessarily. In fact none of these values is completely accurate, though some are more accurate than others.
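As a rough check on that spread, here is a minimal Python sketch that takes the six values listed above and works out the gap between the largest and the smallest. The names `h_by_source` and `spread` are made up for illustration and are not part of any citation service.

```python
# h-index values quoted in this post, keyed by source
h_by_source = {
    "Google Scholar": 39,
    "ResearchGate": 36,
    "ResearchGate (excluding self citations)": 34,
    "Web of Science (all databases)": 34,
    "Web of Science (Core Collection)": 29,
    "Scopus": 29,
}


def spread(values):
    """Return (absolute gap, gap as a fraction of the max, gap as a fraction of the min)."""
    values = list(values)
    lowest, highest = min(values), max(values)
    gap = highest - lowest
    return gap, gap / highest, gap / lowest


gap, frac_of_max, frac_of_min = spread(h_by_source.values())
print(f"difference: {gap} points")
print(f"relative to the largest value: {frac_of_max:.1%}")
print(f"relative to the smallest value: {frac_of_min:.1%}")
```

The percentage depends on which value is taken as the baseline, which is one more reason to state the source alongside the number.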
Google Scholar and Web of Science include papers and book chapters that don’t belong to me, and I can easily shave a couple of points off that top value. Some of these mis-attributions are chapters from a volume that I co-edited. Some are papers that I edited for PLoS ONE and which have been assigned to my record. Others are for the two or three other researchers named “J. Ollerton” who are out there. Then there are some which are just bizarre, such as “The social life of musical instruments” by Eliot Bates, which Google Scholar seems to think I wrote and has credited me with its 102 citations. I wonder how often similar mistakes with regard to citations are made?
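To see why mis-attributed records matter, it helps to recall the standard definition: the h-index is the largest h such that h of your publications each have at least h citations. The sketch below implements that definition in Python and shows how a couple of wrongly assigned items can lift the value. The function name `h_index` and all the citation counts are hypothetical, invented purely for illustration (only the 102 echoes the example above); they are not my real record.

```python
def h_index(citations):
    """Largest h such that h publications each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for a researcher's genuine publications
own_papers = [10, 8, 5, 3, 3, 1]

# Hypothetical mis-attributed records: one highly cited item, one minor one
misattributed = [102, 4]

print("corrected h-index:", h_index(own_papers))
print("uncorrected h-index:", h_index(own_papers + misattributed))
```

In this toy record the two stray items raise the h-index by one point; on a longer publication list, several such items could plausibly account for the couple of points mentioned above.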
Web of Science and Scopus don’t pick up as many citations in books or reports as Google Scholar does, which is a deficiency in my opinion. Being cited in a peer-reviewed journal is often thought of as the gold standard of citation, but frankly I’m very happy to be cited in government and NGO reports, policy documents, etc., which themselves may often be peer reviewed, just by a different type of peer.
Poised in the middle of this range, ResearchGate may be the most accurate, but it lacks transparency: as far as I can see there isn’t a way to look at all of your citation data per paper in one go; you have to look at each publication individually (and who has time for that, frankly?).
As far as calculating an accurate h-index is concerned, I don’t think we will ever come to an agreement as to what should be considered a publication or a citation. But systems like Google Scholar and Web of Science should at least try to be accurate when assigning publications to an individual’s record.
So which h-index should you use? In the interests of accuracy and honesty I think it’s best to state a range and/or add a proviso that you have corrected the value for mis-attribution of publications. In my case I’d say something like:
“Depending on source my h-index lies between 29 (Scopus) and 37 (Google Scholar), corrected for errors in attribution of publications”.
If the h-index is to have any value at all (and there are those who argue that it doesn’t and shouldn’t) then it requires us as scholars to at least try to make it as accurate as we can. Because frankly I don’t think it’s going to go away any time soon.