Busting the Black Box: Big Data, Employment and Privacy.

Author: Wilson, Rebecca J.

We live in an era of big data. Our increasing reliance on digital communication coupled with the technological ability to capture, collect, and analyze ever-growing volumes of data has led to the application of predictive analytics techniques to many of the most important facets of our lives, including healthcare, education, and employment. (1) The ubiquitous nature of big data raises questions about "the relationship between individuals and those who collect and use data about them." (2) In the seminal Harvard Law Review article "The Right to Privacy," Samuel D. Warren and Louis D. Brandeis wrote of the need for the law to adapt to address new intrusions on the right to privacy occasioned by social and technological change. The article opens with the following observation:

That the individual shall have full protection in person and in property is a principle as old as the common law; but it has been found necessary from time to time to define anew the exact nature and extent of such protection. Political, social, and economic changes entail the recognition of new rights, and the common law, in its eternal youth, grows to meet the demands of society. (3)

These prescient words apply as forcefully today in the age of big data as they did in 1890 when Warren and Brandeis first discussed "the right to be let alone." (4) Inherent in the traditional view of the right to privacy is the generally accepted principle that each individual has "the right of determining, ordinarily, to what extent his thoughts, sentiments, and emotions shall be communicated to others." (5) Yet in our wired world, individuals passively communicate information about themselves each day with little knowledge about or control over how the information is transmitted and the purposes for which it is used. Big data raises concerns about not only the individual right to privacy, but also whether it creates "such an opaque decision-making environment that individual autonomy is lost in an impenetrable set of algorithms." (6) Existing legal frameworks may prove insufficient to address novel privacy concerns raised by big data, and the time might yet again be upon us to consider the scope of the right to privacy and the legal mechanisms required to protect it.

This article focuses on the use of big data in the employment context. Big data can be used by employers in many positive ways, including eliminating irrational or even discriminatory biases in the hiring process; identifying unique and unexpected sources of talent; promoting employee wellness; reducing healthcare costs; and increasing worker efficiency. Critics of predictive analytics in the workplace decry the fact that data from digital activities can be used by employers to make assumptions about individuals' behavior that impact their livelihood without their even knowing it. (7) Employers and their advisors who seek to realize the potential of big data must navigate largely uncharted territory because big data does not fit neatly within existing legal frameworks that govern the employment relationship. Until employment laws are updated to more directly address big data, counsel advising employers on the use of big data in the workplace must consider how existing legal protections may apply. Many compliance issues can arise and will continue to arise as the technology evolves and new applications emerge. This article seeks to provide employers and their counsel with just a few examples of the impact that big data can have in the workplace and the related compliance concerns.

  1. Defining Big Data

    1. Characteristics of Big Data

      There are many different definitions of big data. In the privacy context, big data has been defined as "data about one or a group of individuals, or that might be analyzed to make inferences about individuals." (8) Perhaps the most commonly referenced characteristics that make data "big" are the so-called "three Vs": datasets of enormous volume, in an ever-increasing variety of formats, continuously collected at a rapid velocity. (9) This framework recognizes that, first, routine data collection is now deeply embedded in many aspects of our daily lives, and, second, these datasets are ripe for computer-assisted or automated analysis. (10) As scholar and big data ethicist Dr. Solon Barocas puts it, "the distinguishing feature of big data [is] the ability to detect useful patterns in datasets that can inform or automate future decision making...data is big when it can function as the grist for the analytics mill." (11)

      1. The "Three Vs"

    2. Volume

      It can be difficult to comprehend the volume of data created and shared in today's hyper-connected world. For example, it is estimated that in 2016 the amount of data transferred crossed the one-zettabyte threshold for the first time. (12) If you consider that a byte of information translates to one character of text, Tolstoy's War and Peace, which clocks in at 1,250 pages, would fit into a zettabyte 323 trillion times. (13) Another comparator: if every person in the United States took a digital photo every second of every day for over a month, all of those photos put together would equal roughly one zettabyte. (14) But for as much data as people create (for example, an average of 500 million photos per day and over 200 hours of video per minute shared in 2014), that volume is nothing compared with the amount of digital information created about them each day. (15)
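
      The two comparisons above can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the text: it takes a zettabyte as 10^21 bytes (the decimal definition), assumes a United States population of roughly 320 million, and reads "over a month" as 31 days. It works backward from the article's figures to the size each comparison implies for a single book and a single photo.

```python
ZETTABYTE = 10**21  # bytes, decimal (SI) definition; an assumption

# Figure from the text: War and Peace fits into a zettabyte 323 trillion times.
copies_per_zettabyte = 323 * 10**12
# Implied size of one copy, at one byte per character of text.
bytes_per_copy = ZETTABYTE / copies_per_zettabyte

# Assumed figures (not from the article): population and length of "over a month".
us_population = 320_000_000
seconds_in_31_days = 31 * 24 * 60 * 60
photos_taken = us_population * seconds_in_31_days
# Implied size of one photo if all of them together equal one zettabyte.
bytes_per_photo = ZETTABYTE / photos_taken

print(f"Implied size of one copy of the novel: {bytes_per_copy / 1e6:.1f} MB")
print(f"Implied size of one photo: {bytes_per_photo / 1e6:.2f} MB")
```

      Under these assumptions the novel works out to roughly three million characters and each photo to a little over one megabyte, which suggests the two comparisons are at least internally plausible.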

    3. Variety

      Big data is varied. It is generated in many different forms, and it is captured and transmitted via an array of applications. Big data sources can be divided into two basic categories: data that is "born digital," meaning that it is specifically created for use by a computer or data processing system, and data that is "born analog," meaning that it originates in the tangible, physical world but can be converted into digital data. (16) Examples of data that is "born digital" include data: contained in emails (including content, frequency, recipients, and read receipts); generated from web browsing; captured by items that make up the Internet of things ("smart" devices such as digital assistants like the Amazon Echo, wearable fitness monitors, or Internet-connected cars); collected through store loyalty programs which track your purchases online and in stores; about your location, gathered from GPS, cell tower triangulation, wireless network utilization, and card swipe security systems; generated and shared on social media; collected from mobile applications; and, in the context of employment, generated by performance on psychometric tests. (17) Some examples of data that is "born analog" but can then be digitized include sound waves in phone calls, content from video footage, and documents that are scanned and run through optical character recognition (OCR) software. (18)

    4. Velocity

      The "velocity" of big data refers to both the swift pace of data collection (19) and the continuity of the data stream. (20) For example, mobile mapping applications are useless unless they are constantly harvesting the most current data to show your location as you move. (21) The "continuous collection" aspect of big data has important consequences both for the technology needed to store the data and the ways that the data can be analyzed, implicating considerations of scale, timeliness, privacy, completeness, and accuracy. (22) In fact, velocity may be "perhaps the most challenging component of big data, the ability to manage, and make sense out of information that is continually being collected." (23)

      1. Predictive Analytics

      The almost incomprehensible volume of data that is rapidly generated from a variety of sources on a continuous basis can be harnessed by a process called predictive analytics. (24) In essence, big data has the capacity to reveal patterns and relationships that would not be visible in a smaller sample size. Predictive analytics uses a method known as data mining to identify trends, patterns, or relationships among data, which can in turn be used to develop a model for predicting behavior based on probabilities. (25) Data brokers compile information from multiple digital and analog sources, unbeknownst to their subjects. (26) Data mining algorithms can be trained to find patterns through the process of "supervised learning," in which an example of the pattern to be recognized is introduced to the algorithm, or "unsupervised learning," in which the algorithm attempts to identify related pieces of data. (27) Data mining "automates the process of discovering useful patterns, revealing regularities upon which subsequent decision making can rely." (28) "The accumulated set of discovered relationships is commonly called a 'model,' and these models can be employed to automate the process of classifying entities or activities of interest, estimating the value of unobserved variables, or predicting future outcomes." (29) While data mining can identify relationships between seemingly disparate pieces of information, these relationships do not always establish causality. (30)
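
      For readers who find the distinction between supervised and unsupervised learning abstract, the following minimal sketch may help. It is a toy illustration, not a description of any tool actually used by employers or data brokers: the data points are invented, and the two techniques shown (a nearest-centroid classifier and a simple k-means grouping) are stand-ins chosen for brevity. All function and variable names are hypothetical.

```python
# Invented toy data: (hours online per day, purchases per month) with a label.
labeled = [((1.0, 1.0), "low"), ((1.5, 2.0), "low"),
           ((8.0, 9.0), "high"), ((9.0, 8.5), "high")]

def centroid(points):
    """Average position of a list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_supervised(examples):
    """Supervised: labeled examples are supplied; learn one centroid per label."""
    by_label = {}
    for point, label in examples:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, point):
    """Classify a new point by its nearest learned centroid."""
    return min(model, key=lambda label: dist2(model[label], point))

def cluster_unsupervised(points, k=2, rounds=10):
    """Unsupervised: no labels; group related points with a simple k-means."""
    centers = list(points[:k])  # naive initialization: the first k points
    for _ in range(rounds):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(centers[i], p))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups

model = train_supervised(labeled)          # the "model" built from labeled data
prediction = predict(model, (7.5, 8.0))    # classify an unseen individual
groups = cluster_unsupervised([p for p, _ in labeled])  # labels withheld
print(prediction)
print(groups)
```

      In the supervised case the algorithm is told which examples are "low" and "high" and then classifies a new individual accordingly; in the unsupervised case it is given no labels and simply discovers that the points fall into two groups. In neither case does the resulting pattern say anything about why the groups differ.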

      Predictive analytics has many public and commercial applications. Federal, state, and local governments collect data that can then be used to help improve public services (31) or make the public aware of potential hazards such as consumer product recalls or workplace accidents. (32) However, the vast majority of big data is ultimately used for commercial purposes. After being collected from various sources, it is sold by data brokers to companies for marketing and other purposes. (33) "Data brokers gather not only consumers' spending and debt histories, but also much more intimate details of consumers' financial, social, and personal lives. They track where consumers shop, what they shop for, how they pay for purchases, and much more." (34) That information is then often used to predict consumer behavior, segment consumers into categories for marketing...
