Policing by Numbers: Big Data and the Fourth Amendment

Publication year: 2021

Elizabeth E. Joh(fn*)

INTRODUCTION

The age of "big data" has come to policing. In Chicago, police officers are paying particular attention to members of a "heat list": those identified by a risk analysis as most likely to be involved in future violence.(fn1) In Charlotte, North Carolina, the police have compiled foreclosure data to generate a map of high-risk areas that are likely to be hit by crime.(fn2) In New York City, the N.Y.P.D. has partnered with Microsoft to employ a "Domain Awareness System" that collects and links information from sources like CCTVs, license plate readers, radiation sensors, and informational databases.(fn3) In Santa Cruz, California, the police have reported a dramatic reduction in burglaries after relying upon computer algorithms that predict where new burglaries are likely to occur.(fn4) The Department of Homeland Security has applied computer analytics to Twitter feeds to find words like "pipe bomb," "plume," and "listeria."(fn5)

Big data has begun to transform government in fields as diverse as public health, transportation management, and scientific research.(fn6) The analysis of what were once unimaginable quantities of digitized data is likely to introduce dramatic changes to a profession which, as late as 1900, involved little more than an able-bodied man who was given a hickory club, a whistle, and a key to a call box.(fn7) Real-time access to and analysis of vast quantities of information found in criminal records, police databases, and surveillance data may alter policing(fn8) in the same way that big data has revolutionized areas as diverse as presidential elections,(fn9) internet commerce,(fn10) and language translation.(fn11) Some have even heralded big data's potential to change our assumptions about social relationships, government, scientific study, and even knowledge itself.(fn12)

In the private sector, retailers have harnessed big data to produce some seemingly trivial but surprising changes to their practices.(fn13) A much-discussed example stems from Target's extensive use of data analytics to identify certain purchases, such as supplements commonly taken during pregnancy, and thereby infer whether a customer is pregnant without the woman disclosing the pregnancy herself.(fn14) For a retailer, pregnancy is a prime opportunity to target a consumer, a moment when shopping habits change and expand. An irate father allegedly complained to Target that his daughter was unfairly targeted as a pregnant woman with coupons only to discover, to his chagrin, that Target was better informed than he was.(fn15) Similarly, Walmart, through its computerized retail tracking, has discovered that Strawberry Pop-Tarts and beer sell as briskly as flashlights when hurricanes are forecast. These products were quickly shipped to Florida Walmart stores in the path of Hurricane Frances in 2004.(fn16)

Yet unlike the data crunching performed by Target, Walmart, or Amazon, the introduction of big data to police work raises new and significant challenges to the regulatory framework that governs conventional policing. From one perspective, the Fourth Amendment has proven remarkably flexible over time. Constitutional law has governed ordinary policing whether the crimes involved bootlegging,(fn17) numbers running,(fn18) marijuana farming,(fn19) or cell phones.(fn20) As the sophistication of criminals has increased, so too have the tools of the police. In the twentieth century, perhaps no two tools have been as revolutionary to modern policing as the two-way radio and the patrol car.(fn21)

In this century, big data, in a variety of forms, may bring the next dramatic change to police investigations. One researcher has concluded that it will soon be technologically possible and affordable for government to record everything anyone says or does.(fn22) How well will the Fourth Amendment's rules pertaining to unreasonable searches and seizures adapt to the uses of big data? Scholars have widely discussed the shortcomings of applying Fourth Amendment doctrines, once adequate for a world of electronic beepers, physical wiretaps, and binocular surveillance, to rapidly changing technologies.(fn23) But big data may magnify these concerns considerably.

This article identifies three uses of big data that hint at the future of policing and the questions these tools raise about conventional Fourth Amendment analysis. Two of these examples, predictive policing and mass surveillance systems, have already been adopted by a small number of police departments around the country. A third example, the potential use of DNA databank samples, presents an untapped source of big data analysis. Whether any of these three examples of big data policing will attract more widespread adoption by the police is as yet unknown, but it is likely that the prospect of being able to analyze large amounts of information quickly and cheaply will prove to be attractive. While seemingly quite distinct, these three uses of big data suggest the need to draw new Fourth Amendment lines now that the government has the capability and desire to collect and manipulate large amounts of digitized information.

I. THE RISE OF BIG DATA

What is big data? While not everyone agrees on a single definition of big data, most agree that the term refers to: (1) the application of artificial intelligence (2) to the vast amount of digitized data now available.(fn24) From this basic definition, a few observations emerge about what is distinct and significant about big data.(fn25)

First, big data alerts us to the sheer amount of information that is being produced rapidly every year in digital form.(fn26) The turn towards digitized information has been rapid and dramatic. As recently as the year 2000, only a quarter of the world's stored information was digital; the majority of it was on film, paper, magnetic tapes, and other similar non-digital media.(fn27) Today, the opposite is true; nearly all of the world's stored information is digital: about 2.7 zettabytes in 2012.(fn28)

Digital information continues to grow at a rapid pace. According to IBM, ninety percent of the world's data has been generated in the past two years.(fn29) The Executive Chairman of Google has claimed that we now create as much information in two days as we did from the beginning of human civilization to 2003.(fn30) Some have suggested that we may run out of ways to quantify numerically the amount of data generated.(fn31)

Nearly every piece of information today is capable of digitization and storage, including Internet searches, retail purchases, Facebook posts, cellphone calls, highway toll usage, and every last word in books.(fn32) Cheap, small, and sophisticated sensors and tracking devices have been built into every sort of product and object: smartphones, cars, toll transponders, library books, and internet use.(fn33) The city of Santander, Spain is a prototype of the coming "smart city," with 12,000 sensors buried underground that measure everything from air pollution to free parking spaces.(fn34) The resulting data doesn't disappear; it ends up in "data barns" that store the ever-growing amount of information.(fn35) Walmart handles more than a million customer transactions every hour, resulting in databases storing more than 2.5 petabytes of information.(fn36) In 2008, Facebook boasted storage of 40 billion photos.(fn37) The Library of Congress decided in 2010 to archive every public "tweet" generated on Twitter: about 170 billion tweets (and counting) as of January 2013.(fn38)

Second, because the term also refers to the artificial intelligence applied to these huge data sets, the big data phenomenon also suggests a change in the way we understand our world. If conventional scientific research begins with a hypothesis or question that then shapes the collection of the relevant data, the big data phenomenon turns such conventions upside down. Because data is being collected and stored all of the time, research questions do not have to shape or limit data collection at all.(fn39) Researchers need not limit themselves to data sampling, either. Big data permits the study of a phenomenon where the set is nearly everything that is possible to study (another way of stating that we are approaching n=all).(fn40) The existence of these massive data sets permits sifting and resifting of the information therein for multiple purposes.(fn41) Thus, the Library of Congress's continuous collection of "tweets" has interested researchers with questions as diverse as the role of public responses to smoking ads, changes in investor sentiments, and real-time hurricane analysis.(fn42)

Such massive quantities of information also suggest that the very kinds of questions posed by researchers will be different in the big data context. With so much data available, Viktor Mayer-Schönberger and Kenneth Cukier argue that two conventions of traditional research, a working hypothesis and the search for causality, are no longer necessary given the insights that can be derived from correlations found in big data.(fn43) The existence of huge amounts of data permits research into correlations that don't require an underlying hypothesis. For instance, Google's mathematical models have identified the forty-five search terms (e.g., "medicine for cough and fever") most strongly identified with historical flu data.(fn44) The resulting Google Flu Trends has proven to be remarkably...
