Vehicles with autonomous capabilities could offer society convenience and mobility at an unrivaled scale. (1) As vehicles become more autonomous, or "smarter," they will also generate more data. (2) This data may be utilized in many societally beneficial ways: it could help businesses construct better products, (3) insurance companies better manage risk, (4) governments design better infrastructure, (5) and individuals ease their busy schedules. (6) One estimate suggests that the value of "car data and shared mobility could add up to more than $1.5 trillion by 2030." (7) In other words, the creation of smart car data will increase the size of the economic pie.
Although smart car data can create value, it can also create opportunities to lose value: identity theft from data breaches, the loss of autonomy through government or corporate surveillance, or annoyance from persistent targeted advertising. (8) Many risks of smart car data are negative externalities primarily borne by smart car consumers. (9) While consumers will bear most of the costs, who will receive most of the benefits? How will society share the newly minted value embodied in smart car data? According to the Coase Theorem, a principle conceived by the famous economist Ronald Coase, it depends on who has the right to control, or own, smart car data.
In his influential, Nobel Prize-winning paper, Coase theorized that when two parties can bargain without transaction costs, the same allocation of resources will result regardless of which party was initially allocated the property right. (10) To illustrate, imagine the landlord of an apartment building whose tenants suffer from the air pollution caused by the operation of the factory next door. The landlord wants less pollution, but the factory owner wants to continue operating. If the landlord holds the right to clean air, the two parties could strike a deal where the factory owner pays the landlord for every unit of air pollution produced. This will cause the factory owner to internalize the cost of the pollution, and she will be incentivized to decrease pollution, at least until her costs of doing so are higher than the costs of paying the landlord. In contrast, if the factory owner holds the right to pollute, the landlord would have to pay the factory owner to persuade her to produce less pollution. The landlord would figure out how much the air pollution is "costing" him (i.e., in decreased rent) and would offer the factory owner up to that amount in exchange for less pollution. The Coase Theorem predicts that the amount of pollution produced in either situation is the same. A corollary is that the landlord benefits from the transaction when he was allocated the initial right to clean air, while the factory owner benefits when she was allocated the initial right to pollute.
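The landlord and factory illustration can be made concrete with a short numerical sketch. Everything in it is an assumption made for illustration: the marginal gain and loss figures are hypothetical, bargaining is treated as costless, and the parties are assumed to split the surplus on each unit evenly. It is not drawn from Coase's paper. It shows only that the bargained level of pollution is the same under either allocation, while the direction of payment flips.

```python
# Hypothetical figures: the factory's profit from each successive unit of
# pollution, and the landlord's loss (e.g., decreased rent) from that unit.
FACTORY_MARGINAL_GAIN = [10, 8, 6, 4, 2]
LANDLORD_MARGINAL_LOSS = [1, 3, 5, 7, 9]


def bargain(factory_holds_right: bool) -> tuple[int, float]:
    """Return (units of pollution, net payment from factory to landlord).

    Assumes costless bargaining and an even split of the surplus on each
    unit (the price is the midpoint of gain and loss). A negative payment
    means the landlord pays the factory.
    """
    units = 0
    paid_to_landlord = 0.0
    for gain, loss in zip(FACTORY_MARGINAL_GAIN, LANDLORD_MARGINAL_LOSS):
        price = (gain + loss) / 2
        if gain > loss:
            # Polluting this unit is worth more to the factory than it costs
            # the landlord, so the unit is produced under either allocation.
            units += 1
            if not factory_holds_right:
                paid_to_landlord += price  # factory buys permission to pollute
        elif factory_holds_right:
            # The unit costs the landlord at least as much as it earns the
            # factory, so the landlord pays the factory to abate it.
            paid_to_landlord -= price
    return units, paid_to_landlord


landlord_right = bargain(factory_holds_right=False)
factory_right = bargain(factory_holds_right=True)
print(landlord_right, factory_right)
```

Under these assumed numbers, both calls yield the same number of polluting units; only the sign of the payment differs, which is the distributive corollary described above.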
Analogously, the Coase Theorem suggests that whoever holds initial property rights over smart car data will benefit from the value generated by that data. This concept is echoed by a recent McKinsey report asserting that "consumers will be the ultimate winners" regarding smart car data because consumers "own" the data about them and will be able to "leverage their personal data as currency." (11)
DEFINING A "SMART CAR" AND ITS DATA
Because data collection by cars is not limited to autonomous vehicles ("AVs"), the scope of this Note is also not limited to AVs. Instead, the discussion will encompass any "smart car," which, for the purposes of this Note, will specifically refer to any personal vehicle (12) that has connectivity to the Internet, other devices, or surrounding vehicles or infrastructure, and is equipped with external or internal sensors and a method of recording data. Smart cars may be able to integrate across platforms and applications, perhaps becoming another interface where consumers' digital profiles can be accessed. This definition of a "smart car" is extremely broad and encompasses many existing models of cars. For example, an estimated 86 percent of new cars shipped in 2018 will be equipped with Bluetooth, (13) and an estimated 96 percent of model year 2013 cars are equipped with "black boxes," (14) which record information about the car surrounding the time of a collision. (15) Thus, it is likely that nearly all relatively new cars qualify as "smart cars" under the broad definition given here. In 2015, there were about 36 million cars with an Internet connection on the road. (16) One study forecasts that this number will grow to 381 million by 2020 and that Internet-connected cars will generate revenue of $8.1 trillion between 2015 and 2020. (17) Smart cars are already here today en masse, and they will only increase in number. The undeniable emergence of smart cars emphasizes the mounting need to understand the applicable law of data ownership and to develop a proper legal regime.
Smart cars will generate and record many types of data. Table 1 presents a simplified way to organize the types of smart car data, their characteristics, and their potential uses.
Alone, most smart car data is not necessarily sensitive. However, smart car data may become sensitive because modern data science is often able to infer sensitive information from non-sensitive information. (18) For example, mega-retailer Target was able to predict whether a customer was pregnant, including which trimester, based only on her purchase history. (19) With the amount of information that smart cars are able to collect about the user's physical behavior (e.g., location data, driving behavior data), the user's digital behavior (e.g., application data), and the outside world (e.g., sensor data), modern data science will likely be able to infer a great deal of information, much of it sensitive, about a smart car user. Because smart car data collection is usually imperceptible and constant, (20) there is a heightened risk that more information about smart car users will be collected than they would like. For example, if location or sensor data shows that a smart car is frequently navigating to a drug treatment clinic, the data could be used to infer that the user is seeking drug treatment, which is more likely to be sensitive information.
Sometimes, the privacy impact of "inferred" information can be mitigated if data is anonymized, but true anonymization is not always achieved. (21) For example, one study was able to re-identify people based on a correlation of anonymous Netflix movie ratings and public IMDb movie ratings, which revealed their entire Netflix histories. (22) Location data may be especially sensitive in this respect because of the uniqueness of an individual's location. A study published in Science showed that, using just four anonymized spatiotemporal data points (i.e., a person's location at a given time), an anonymized database of credit card transactions could be used to uniquely re-identify 90 percent of the 1.1 million individuals in the database. (23) Moreover, knowing the price of a transaction increased the likelihood that someone could be re-identified by 22 percent. (24) Knowing just a few data points about someone's location and behavior can completely unravel anonymization.
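The intuition behind re-identification from a handful of spatiotemporal points can be sketched with a toy calculation. The sketch below is not the method of the studies cited above: the traces, the user labels, and the `unicity` function are hypothetical and were chosen only for illustration. It measures how often a set of k known points from one person's trace matches that person and no one else in an "anonymized" dataset.

```python
from itertools import combinations

# Hypothetical anonymized traces: each pseudonymous user maps to a set of
# (place, hour) points. None of this data comes from the cited studies.
TRACES = {
    "u1": {("cafe", 9), ("gym", 18), ("clinic", 12), ("home", 22)},
    "u2": {("cafe", 9), ("gym", 18), ("office", 12), ("home", 22)},
    "u3": {("cafe", 9), ("mall", 18), ("office", 12), ("bar", 22)},
}


def unicity(traces: dict, k: int) -> float:
    """Fraction of k-point subsets of a trace that match exactly one user."""
    unique = 0
    total = 0
    for user, points in traces.items():
        for subset in combinations(sorted(points), k):
            # Every user whose trace contains all k known points is a
            # candidate match for this subset.
            matches = [u for u, other in traces.items() if set(subset) <= other]
            total += 1
            if matches == [user]:
                unique += 1
    return unique / total if total else 0.0


for k in (1, 2, 4):
    print(k, unicity(TRACES, k))
```

Even in this small example, the share of point sets that single out one person rises as more points are known, which is the mechanism that makes location traces difficult to anonymize.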
There is a growing ability in modern data analysis to infer sensitive information from innocuous data, even from anonymous data. Although this characteristic of data may increase privacy risks to users of smart cars, it simultaneously adds value to smart car data. As discussed in Part I, that value will be captured by whichever entity is endowed with the initial ownership of the data, bringing us to the primary inquiry of this Note.
APPLICABLE LEGAL REGIMES
Raw data cannot be "owned" in the same legal sense that traditional intellectual property can be owned, so throughout this Note "ownership" of data will be used as a shorthand way to describe the rights or ability to access, assign, transfer, use, destroy, or exclude others from that data. This section will first discuss why data does not fall into any of the familiar intellectual property regimes. Then, this section will analyze some potential legal structures that could affect the property-like rights surrounding smart car data, including the anti-circumvention provision under U.S. copyright law, industry-specific regulations, and the contracts and privacy policies negotiated by the stakeholders themselves. Moreover, this section will analyze the problems or gaps that these structures may create.
Intellectual Property Law
Existing intellectual property regimes such as patent, trademark, and copyright do not apply well to the ownership of data. Patent law does not apply because data does not fall into the...