Saturday, January 6, 2018

How to Count Citations If You Must

That is the title of a paper in the American Economic Review by Motty Perry and Philip Reny. They present five axioms that they argue a good index of individual citation performance should conform to. They show that the only index that satisfies all five axioms is the Euclidean length of the list of citations to each of a researcher's publications – in other words, the square root of the sum of squares of the citations to each of their papers.* This index puts much more weight on highly cited papers and much less on little cited papers than simply adding up a researcher's total citations would. This is a result of their "depth relevance" axiom. A citation index that is depth relevant always increases when some of the citations of a researcher's less cited papers are instead transferred to some of the researcher's more cited papers. In the extreme, it rewards "one hit wonders" who have a single highly cited paper, over consistent performers who have a more extensive body of work with the same total number of citations.
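To make the contrast concrete, here is a minimal sketch comparing the Euclidean index with total citations. The two researchers and their citation counts are hypothetical, invented purely for illustration, and the function name `euclidean_index` is mine rather than Perry and Reny's.

```python
import math


def euclidean_index(citations):
    """Square root of the sum of squared per-paper citation counts."""
    return math.sqrt(sum(c * c for c in citations))


# Hypothetical researchers (illustrative data only), each with 100 citations in total.
one_hit_wonder = [100]      # a single paper with 100 citations
consistent = [10] * 10      # ten papers with 10 citations each

total_a = sum(one_hit_wonder)
total_b = sum(consistent)
index_a = euclidean_index(one_hit_wonder)
index_b = euclidean_index(consistent)

print("total citations:", total_a, total_b)
print("Euclidean index:", index_a, round(index_b, 2))
```

Total citations rate the two researchers equally, while the Euclidean index rates the single-paper researcher roughly three times higher.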

The Euclidean index is an example of what economists call constant elasticity of substitution, or CES, functions. Instead of squaring each citation number, we could raise it to a different power, such as 1.5, 0.5, or anything else. Perry and Reny show that the rank correlation between the National Research Council peer-reviewed ranks of the top 50 U.S. economics departments and the CES citation indices of the faculty employed in those departments is at a maximum for a power of 1.85:

This is close to 2 and suggests that the market for economists values citations in a similar way to the Euclidean index.

RePEc acted unusually quickly to add this index to their rankings. Richard Tol and I have a new working paper that discusses this new citation metric. We introduce an alternative axiom: "breadth relevance", which rewards consistent achievers. This axiom states that a citation index always increases when some citations from highly cited papers are shifted to less cited papers. We also reanalyze the dataset of economists at the top 50 U.S. departments that Perry and Reny looked at and a much larger dataset that we scraped from CitEc for economists at the 400 international universities ranked by QS. Unlike Perry and Reny, we take into account the fact that citations accumulate over a researcher's career and so junior researchers with few citations aren't necessarily weaker researchers than senior researchers with more citations. Instead, we need to compare citation performance within each cohort of researchers measured by the years since they got their PhD or published their first paper.

We show that a breadth relevant index that also satisfies Perry and Reny's other axioms is a CES function with an exponent of less than one. Our empirical analysis finds that the distribution of economists across departments is in fact explained best by the simple sum of their citations, which is equivalent to a CES function with an exponent of one and favors neither depth nor breadth. However, at lower ranked departments – departments ranked by QS from 51 to 400 – the Euclidean index does explain the distribution of economists better than does total citations.
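The role of the exponent can be illustrated with a short sketch. It assumes the CES index takes the form of the sum of citations raised to the exponent, with the sum then raised to one over the exponent; that outer normalization is my assumption and, being a monotonic transformation, does not change any ranking. The citation counts are hypothetical.

```python
def ces_index(citations, exponent):
    """CES citation index: (sum of c ** exponent) ** (1 / exponent).

    The outer power is an assumed normalization; it does not affect rankings.
    """
    return sum(c ** exponent for c in citations) ** (1.0 / exponent)


# Hypothetical two-paper researcher (illustrative data only).
before = [50, 10]
# Shift 10 citations from the more cited paper to the less cited one.
after_shift = [40, 20]

for exponent in (0.5, 1.0, 2.0):
    print(exponent, ces_index(before, exponent), ces_index(after_shift, exponent))
```

With an exponent below one the index rises after the shift (breadth relevance), with an exponent of one it is unchanged (total citations), and with an exponent above one it falls (depth relevance).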

In this graph, the full sample is the same dataset that Perry and Reny used in their graph. The peak correlation is for a lower exponent – tau or sigma** – simply because we take into account cohort effects by computing the correlation for a researcher's citation index relative to the cohort mean.*** While the distribution across the top 25 departments is similar to that for the full sample, with a peak at a slightly lower exponent that is very close to one, we don't find any correlation between citations and department rank for the next 25 departments. It seems that there aren't big differences between them.
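As a rough illustration of the cohort adjustment, the sketch below divides each researcher's index by the mean index of their cohort. Taking a ratio rather than, say, a difference is my assumption about what "relative to the cohort mean" means here, and the names, cohort years, and index values are hypothetical.

```python
from collections import defaultdict


def cohort_relative(index_by_researcher, cohort_by_researcher):
    """Divide each researcher's index by the mean index of their cohort.

    Assumes every cohort has a non-zero mean index.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, value in index_by_researcher.items():
        cohort = cohort_by_researcher[name]
        totals[cohort] += value
        counts[cohort] += 1

    relative = {}
    for name, value in index_by_researcher.items():
        cohort = cohort_by_researcher[name]
        cohort_mean = totals[cohort] / counts[cohort]
        relative[name] = value / cohort_mean
    return relative


# Hypothetical citation indices and PhD cohorts (illustrative data only).
index = {"senior_a": 400.0, "senior_b": 200.0, "junior_a": 60.0, "junior_b": 20.0}
cohort = {"senior_a": 1995, "senior_b": 1995, "junior_a": 2012, "junior_b": 2012}

rel = cohort_relative(index, cohort)
print(rel)
```

In this made-up example the stronger junior researcher ends up ahead of the stronger senior researcher once each is measured against their own cohort, even though the raw index is far lower.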

Here are the correlations for the larger dataset that uses CitEc citations for the 400 universities ranked by QS:

For the top 50 universities, the peak correlation is for an exponent of 1.39 but for the next 350 universities the peak correlation is for 2.22. The paper also includes parametric maximum likelihood estimates that come to similar conclusions.

Breadth per se does not explain the distribution of researchers in our sample, but the highest ranked universities appear to weight breadth and depth equally, while lower-ranked universities do focus on depth, giving more weight to a few highly cited papers.

A speculative explanation of behavior across the spectrum of universities could be as follows. The lowest-ranked universities, outside of the 400 universities ranked by QS, might simply care about publication without worrying about impact. Having more publications would be better than having fewer at these institutions, suggesting a breadth relevant citation index. Our exploratory analysis that includes universities outside of those ranked by QS supports this. We found that breadth was inversely correlated with average citations in the lower percentiles.

Middle-ranked universities, such as those ranked from 51 to 400 by QS, care about impact; having some high-impact publications is better than having none, and a depth-relevant index describes behavior in this interval. Finally, among the top-ranked universities such as the QS top 50 or NRC top 25, hiring and tenure committees wish to see high-impact research across all of a researcher's publications and the best-fit index moves back towards an exponent close to one. Here, adding lower-impact publications to a publication list that contains high-impact ones is seen as a negative.

* As monotonic transformations of the index also satisfy the same axioms, the simplest index that satisfies the axioms is simply the sum of squares.

** In the paper, we refer to an exponent of less than one as tau and an exponent greater than one as sigma.

*** The Ellison dataset that Perry and Reny use relies on Google Scholar data and truncates each researcher's publication list at 100 papers. With all working paper variants, it's not hard to exceed 100 items. This could bias the analysis in favor of depth rather than breadth. We think that the correlation computed only for researchers with 100 papers or fewer is a better way to test whether depth or breadth best explains the distribution of economists across departments. The correlation peaks at an exponent very close to one for this subsample.


  1. This is really dumb. For god sakes, if you want to evaluate a researcher, read the papers.

  2. You have 400-500 applicants for a job - are you going to read all their papers? Didn't think so. Yes, for the short-listed candidates, you should read their research - but how do you get to a short list? Actually, in our paper we are agnostic about how to evaluate people, we just ask whether Perry and Reny's claims are correct or make sense.

  3. also, economists (in particular) like to create equations for everything.