Patent citations reexamined

Date: 01 March 2020
Published date: 01 March 2020
DOI: http://doi.org/10.1111/1756-2171.12307
Authors: Jeffrey Kuhn, Kenneth Younge, Alan Marco
RAND Journal of Economics
Vol. 51, No. 1, Spring 2020
pp. 109–132
Jeffrey Kuhn∗, Kenneth Younge∗∗, and Alan Marco∗∗∗
Many studies rely on patent citations to measure intellectual heritage and impact. In this article,
we show that the nature of patent citations has changed dramatically in recent years. Today,
a small minority of patent applications are generating a large majority of patent citations,
and the mean technological similarity between citing and cited patents has fallen considerably.
We replicate several well-known studies in industrial organization and innovation economics
and demonstrate how generalized assumptions about the nature of patent citations have misled
the field.
1. Introduction
Scholars often use patenting as an empirical measure of innovation. Seminal work by
Griliches (1981) measured patent counts, and later work weighted such counts by the number
of forward citations received by each patent on the view that more important patents receive
more references (Trajtenberg, 1990a). Later studies used patent citations to measure private
patent value (Lanjouw and Schankerman, 2001; Harhoff et al., 1999), firm market value (Hall,
Jaffe, and Trajtenberg, 2005), cumulative innovation (Caballero and Jaffe, 1993; Trajtenberg,
Henderson, and Jaffe, 1997), geographic spillovers (Jaffe, Trajtenberg, and Henderson, 1993),
technology life-cycles (Mehta, Rysman, and Simcoe, 2010), social importance (Moser, Ohmstedt,
and Rhode, 2017), originality (Jung and Lee, 2016), and technological impact (Corredoira and
Banerjee, 2015). Overall, a search for “patent citations” on Google Scholar (conducted January
1, 2018) returns over 21,000 results.

∗ University of North Carolina at Chapel Hill; jeffrey_kuhn@kenan-flagler.unc.edu.
∗∗ École Polytechnique Fédérale de Lausanne; kenneth.younge@epfl.ch.
∗∗∗ Georgia Tech; alan.marco@pubpolicy.gatech.edu.
We thank Ian Cockburn, Rui de Figueiredo, Jeff Furman, Michelle Gittelman, Dietmar Harhoff, Robert Merges, Michael
Meurer, Atul Nerkar, Gaetan de Rassenfosse, Katja Seim, Tim Simcoe, Neil Thompson, Tony Tong, Andrew Toole, Brian
Wright, Noam Yuchtman, and anonymous reviewers for helpful comments. We also thank seminar participants at the
2016 Searle Center Conference on Innovation Economics, the 2017 Boston University Law School's Technology Policy
and Research Initiative Conference, the 2016 Munich Summer Institute, the United States Patent and Trademark Office
(2015), Skema Business School (2015), the 2016 DRUID Conference, and the 2016 Academy of Management Annual
Meeting. The authors thank the United States Patent and Trademark Office for providing access to data. The authors also
thank Google for a generous research grant of computing time on the Google Cloud.
© 2020, The RAND Corporation.

FIGURE 1
TOTAL CITATIONS MADE BY YEAR, DIVIDED BY NUMBER OF BACKWARD CITATIONS MADE BY CITING PATENT

Recent research, however, has begun to call into question a straightforward interpretation of
patent citations. Abrams, Akcigit, and Popadak (2018) find an inverted-U relationship between
patent citations and market value, suggesting that high citation counts may indicate strategic use
of the patent system (instead of impactful innovation). Evidence also has emerged that the search
for and disclosure of prior art varies between applicants in systematic ways (e.g., Sampat, 2010).
Citations may suffer from significant noise and measurement error (Gambardella, Harhoff, and
Verspagen, 2008; Roach and Cohen, 2013), and the comparison of patents between cohorts can be
problematic because citation counts have inflated substantially over time (Marco, 2007). Failing
to correct for time period, technology, and geographic region can introduce significant bias into
an analysis (Lerner and Seru, 2017). Correcting citation counts to return to the original goal of
devising a measure of innovation activity that is broadly comparable across contexts, however, is
problematic due to endogeneity in the pendency, citation lags, and filing years of a given sample
(Mehta, Rysman, and Simcoe, 2010).
In this article, we highlight an important change in the data generating process of patent
citations that alters the statistical nature of patent citation measures and the appropriate use
of patent citations in current and future research. Specifically, we observe a dramatic increase
in the number of citations generated per year and relate that change to a small proportion of
patents flooding the patent office with an overwhelming number of references. Figure 1 shows
the number of backward citations over time, segmented by the number of citations made by each
patent. In recent years, 46.8% of references derive from the fewer than 5% of patents with more than
100 backward citations.1
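
The concentration statistic reported above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in this article: the function name `citation_concentration` and the toy data are hypothetical, and the only input assumed is a list giving the number of backward citations made by each patent.

```python
def citation_concentration(backward_counts, threshold=100):
    """Return (share of patents, share of citations) attributable to
    patents that make more than `threshold` backward citations.

    `backward_counts` holds one integer per patent. The threshold of 100
    mirrors the cutoff discussed in the text; everything else here is
    an illustrative assumption.
    """
    total_patents = len(backward_counts)
    total_citations = sum(backward_counts)
    if total_patents == 0 or total_citations == 0:
        return 0.0, 0.0
    heavy = [c for c in backward_counts if c > threshold]
    return len(heavy) / total_patents, sum(heavy) / total_citations


# Hypothetical toy data: 96 patents citing 10 references each and
# 4 patents citing 250 references each.
counts = [10] * 96 + [250] * 4
patent_share, citation_share = citation_concentration(counts)
print(f"{patent_share:.1%} of patents make {citation_share:.1%} of citations")
```

In this toy sample, a handful of heavy citers account for roughly half of all references, which is the qualitative pattern the text describes; the actual figures depend on the underlying patent data.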
To examine how the aforementioned changes in citation patterns affect the information
content and quality of citation-based measures, we use a vector space model based on Younge
and Kuhn (2015) that compares the text of every patent to every other patent granted by the
1 Conversations with examiners suggest that reviewing more than 100 citations is extremely difficult, given time
constraints.
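
The idea of comparing patent texts in a vector space can be sketched in a few lines. The example below is a generic TF-IDF cosine-similarity calculation, offered as an assumption about what such a comparison might look like; it is not the model of Younge and Kuhn (2015), whose details are not given in this excerpt. The function names and the three toy documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector.

    Uses a smoothed inverse document frequency so that every term
    keeps a positive weight; this weighting is an illustrative choice.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy "patents": two on batteries, one on networking.
docs = [
    "A battery electrode comprising lithium oxide particles",
    "A lithium battery with a coated electrode",
    "A method for routing packets in a wireless network",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
print(sim_related > sim_unrelated)
```

Under this sketch, a citing and cited patent pair with little shared vocabulary receives a low similarity score. Comparing every patent with every other patent requires on the order of n² such comparisons, so an exhaustive comparison over a full patent corpus would need far more efficient machinery than this illustration.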