Goodbye PII: contextual regulations for online behavioral targeting.

Author: Chung, Yuen Yi

    Internet privacy has become a highly contentious issue. (1) One after another, prominent news media have exposed numerous tricks used to tap consumer data to create personalized advertisements. (2) Shortly after the 2012 elections, consumers grew increasingly concerned when it was revealed that even politicians used behavioral advertising for their campaigns. (3) Despite the Federal Trade Commission's ("FTC") best efforts to balance the potential benefits of behavioral advertising against privacy concerns by setting self-regulatory principles, there is currently no law in the United States that expressly addresses behavioral targeting. (4)

    Current privacy regulations center on Personally Identifiable Information ("PII"). (5) PII defines the scope and boundaries of many federal and state privacy laws. (6) PII serves as a jurisdictional trigger for these statutes and regulations; without PII, there is no privacy harm. (7) Research has, however, demonstrated that non-PII may turn into PII when additional information is made public or when data is aggregated. (8)
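The re-identification risk described above can be made concrete with a minimal Python sketch of the classic "linkage attack": an anonymized dataset that omits names is joined with a public record on shared attributes. All names, ZIP codes, and dates below are invented toy data for illustration only.

```python
# Toy illustration of re-identification by linkage: an "anonymized"
# dataset with no names is matched against a hypothetical public voter
# roll on shared quasi-identifiers (ZIP code and birth date).

anonymized = [
    {"zip": "02138", "birth_date": "1945-07-31", "diagnosis": "hypertension"},
    {"zip": "02139", "birth_date": "1962-01-15", "diagnosis": "asthma"},
]

# Hypothetical public record containing names alongside the same attributes.
voter_roll = [
    {"name": "J. Doe", "zip": "02138", "birth_date": "1945-07-31"},
]

# Join the two datasets wherever the quasi-identifiers coincide,
# attaching a name to a supposedly anonymous medical record.
reidentified = [
    {**record, "name": voter["name"]}
    for record in anonymized
    for voter in voter_roll
    if (record["zip"], record["birth_date"]) == (voter["zip"], voter["birth_date"])
]

print(reidentified)  # the hypertension record is now linked to a named person
```

The example shows why data that is "non-PII" in isolation can become identifying in combination: neither ZIP code nor birth date alone names anyone, but their intersection can be unique.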

    This note suggests that the Legislature should no longer measure privacy risks based on the distinction between PII and non-PII. Privacy laws should abandon the concept of PII and regulate behavioral targeting based on a contextual continuum of reasonable expectations. (9) Part II describes the evolution of the role of PII in privacy laws. (10) Part III provides a background of online behavioral targeting in relation to PII and illustrates the flawed concept of anonymity with three distinct cases. (11) Part III also introduces European privacy law on behavioral advertising as well as several proposals for regulatory reforms in the U.S. (12) Part IV proposes a new approach to contextual regulations for behavioral targeting and possible challenges. (13)


    In the last century, PII has evolved from an irrelevant concept, to an element of a recognized privacy tort, and then to one of the most fundamental aspects of the current privacy statutory schemes in the United States. (14) Given its importance, there is surprisingly no uniform definition of PII. (15)

    1. The Rise of Privacy Law

      In their much-celebrated 1890 law review article, Samuel Warren and Louis Brandeis advocated for a right of privacy. (16) Alarmed by tabloid journalism, Warren and Brandeis conceived the right of privacy as a "right of personality" and described privacy deprivation as a form of mental suffering. (17) After seventy years of privacy common law development, William Prosser categorized privacy law into four torts commonly recognized at common law: (1) intrusion upon the plaintiff's seclusion or solitude, or into his private affairs, (2) public disclosure of embarrassing private facts about the plaintiff, (3) publicity that places the plaintiff in a false light in the public eye, and (4) appropriation. (18) Prosser did not explore the idea of PII because his four distinct types of violations require actual injury to an identified person--as do all torts. (19)

    2. Harm Prevention and Significance of PII

      PII first became an issue when the advent of the mainframe computer changed how information could be collected and processed. (20) In the 1960s, public bureaucracies began to computerize citizen records. (21) The public became concerned about these practices because the compilations of data led to easily accessible, massive databases that offered little protection of sensitive information. (22) In response to the growing privacy concerns, the Secretary of Health, Education and Welfare introduced the Fair Information Principles (FIPS) in 1973. (23) FIPS is a data protection framework that requires, among other principles, notice and consent, access, data integrity, and enforcement and remedies. (24) More significantly, FIPS recognizes the "creation of risk that a person might be harmed in the future". (25) Not only does this include permanent harm such as identity theft, but it also covers potential embarrassment and damaged reputation from misuse of information. (26) FIPS has inspired legislation to shift from merely redressing past harm to avoiding privacy problems. (27)

      After 1970, Congress began to enact privacy laws that were preventive in nature. (28) This process required the legislature to first identify a problem and then categorize the types of information that might contribute to that risk. (29) This data-centric assessment of whether a particular data category constitutes "sufficient" harm to be regulated marked the beginning of the PII era. (30) To this day, Congress continues to develop various privacy laws around the concept of PII. (31)

    3. Three Approaches of Defining PII

      Despite its significant role in privacy law, there is surprisingly no consistent definition of PII. (32) While some laws and regulations view PII as a rule, others favor PII as a standard. (33) Paul Schwartz and David Solove have synthesized three approaches to defining PII in various privacy laws and regulations: (1) the tautological approach, (2) the non-public approach, and (3) the specific-types approach. (34)

      1. The Tautological Approach

        The tautological approach defines PII as any information that identifies a person. (35) Any information that identifies a person is PII and triggers protection of the right of privacy. (36) While this approach allows flexibility and evolution, like all standards, it fails to define PII because it is circular: it merely restates that identifying information is identifying. (37)

      2. The Non-Public Approach

        The non-public approach is a variant standard of the tautological approach. (38) Instead of defining what PII is, privacy standards under this approach outline what is not PII--information that is either publicly accessible or purely statistical. (39) The Gramm-Leach-Bliley Act, (40) for example, simply defines personally identifiable financial information as "nonpublic personal information." (41) This approach is problematic because it fails to take into account whether such information is identifiable and overlooks the possibility that other nonpublic information may readily be matched to this type of public information. (42)

      3. The Specific-Types Approach

        The specific-types approach exemplifies the qualities of a classic rule--if information falls into an enumerated category, it automatically triggers the privacy law or regulation. (43) The Children's Online Privacy Protection Act of 1998 (44) is an example of the specific-types approach to defining PII. (45) The federal statute defines PII as "individually identifiable information about an individual collected online", such as first and last names, address, social security number, telephone number and email address, and "any other identifier that the [FTC] determines permits the physical or online contacting of a specific individual." (46) Though clearer than the other approaches, the specific-types approach is very restrictive in its definition of PII, as it always carries the possibility of being underinclusive. (47) In addition, the list of identifiable information is not static because technology continues to evolve and non-PII always has the potential to become personally identifiable. (48)

        Despite its fundamental role in privacy regulations, there appears to be no uniform definition and application of PII. (49) All three of the current approaches are flawed and offer no concrete guidance as to what type of information belongs on the list of PII. (50) As a result, the fine line between PII and non-PII continues to fluctuate with context and ever-changing technology. (51)

    4. Using Anonymization in Balancing Internet Privacy

      In an effort to protect the privacy of individuals, data administrators or collectors anonymize data when storing or disclosing person-specific information. (52) Anonymization is the "process of removing or modifying the identifying variables in the microdata dataset". (53) Typical anonymization techniques include data reduction and data perturbation. (54) Data reduction hides unique or rare recognizable data by increasing the number of individuals in the sample sharing similar identifying characteristics or by selectively revealing such data. (55) Data reduction methods include removing variables, removing records, global recoding, top and bottom coding, and local suppression. (56) Data perturbation, on the other hand, modifies values of the identifying attributes using a randomized process. (57) Data perturbation techniques include micro-aggregation, data swapping, post randomization, adding noise, and resampling. (58) Latanya Sweeney, a well-known computer scientist in data privacy (59), suggests a privacy model called k-anonymity, which ensures that no data disclosure will allow a person to be distinguished from at least "k-1" other individuals, leaving the value of "k" up to policy makers. (60) A common challenge with all anonymization techniques is to identify the PII that allows an inference of identity and then to control that inference, a problem that will be addressed in Part III of this note. (61)
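The k-anonymity criterion described above can be checked mechanically: a release satisfies k-anonymity when every combination of quasi-identifier values is shared by at least k records. The following is a minimal Python sketch using an invented toy dataset in which ZIP code and birth year serve as the quasi-identifiers.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records of the dataset."""
    combos = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in combos.values())

# Toy dataset: ZIP code and birth year act as quasi-identifiers;
# the diagnosis is the sensitive attribute being protected.
people = [
    {"zip": "10001", "birth_year": 1980, "diagnosis": "A"},
    {"zip": "10001", "birth_year": 1980, "diagnosis": "B"},
    {"zip": "10002", "birth_year": 1975, "diagnosis": "C"},
]

# The lone 10002/1975 record breaks 2-anonymity.
print(is_k_anonymous(people, ["zip", "birth_year"], 2))
```

In practice, a dataset that fails the check would be repaired with the reduction techniques listed above, for example by recoding ZIP codes to a coarser region or suppressing the outlying record, until every equivalence class reaches size k.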

      Anonymization has encouraged the Legislature to discount what seemed to be a minimal risk of sharing de-identified data in favor of important values such as security, innovation, and the free flow of information. (62) Even if information falls within the scope of PII, Congress permits a more flexible regulatory regime as long as such data is anonymized. (63) Therefore, sensitive information may be traded publicly as long as the data administrator makes the PII unidentifiable. (64)

      With the help of anonymization, Congress has developed law around the concept of PII to avoid weighing the costs and benefits of privacy regulations. (65) At first glance, Congress seems to have developed an approach that evaluates the inherent privacy risk of data categories by assessing, with mathematical precision, whether a data category causes sufficient harm to be regulated. (66) In reality...
