Sustainable access to data for postmarketing medical product safety surveillance under the amended HIPAA Privacy Rule.

Author: Evans, Barbara J.


Pharmacoepidemiology explores "the use of and the effects of drugs in large numbers of people." (1) The Food and Drug Administration Amendments Act of 2007 (FDAAA) (2) authorized the U.S. Food and Drug Administration (FDA) to carry out a program of postmarketing drug safety surveillance that relies heavily on pharmacoepidemiological studies to assess safety risks with already-approved drugs. (3) To implement this program, the agency is developing the Sentinel system, (4) a very large-scale health information infrastructure, and its pilot phase, Mini-Sentinel (together, Sentinel). Mini-Sentinel already includes health data for ninety-nine million persons. (5) Large-scale data infrastructures for postmarketing drug safety surveillance also exist in Canada, (6) the European Union, (7) and Japan. (8)

Such systems present privacy and ethical issues that have been extensively discussed elsewhere. (9) This article takes as its starting assumptions that many members of the public support the goal of reducing medical product injuries and that postmarketing surveillance programs can advance that goal. The article explores the challenge of ensuring a sustainable supply of data for these purposes and asks whether recent amendments to the HIPAA Privacy Rule (10) have met that challenge. It concludes that the amendments move closer to, but ultimately fall short of, resolving this challenge.


    Raw patient health data--the records of a patient's encounters with the healthcare system--are not in themselves a very useful resource for postmarketing surveillance of drugs and other medical products. (11) To be useful, each individual's data must be longitudinally linked to create a chronological record that reflects diagnoses, treatments (including which medical products the patient used), and outcomes. (12) Pharmacoepidemiological studies typically need longitudinal records for large numbers of patients because the product-related injuries of interest often are very rare, and statistically significant associations between the use of specific medical products and specific injuries are difficult to detect unless the study population is large. Such studies sometimes require highly inclusive datasets that capture data for most or all of the people who have been exposed to the product. In some cases, even the small biases associated with letting patients "opt in" or "opt out" of having their data used may materially distort the study results. (13) Postmarketing medical product safety surveillance sometimes requires extremely large-scale information assets encompassing longitudinally linked records for tens or hundreds of millions of people. (14)
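
A rough worked example may help show why scale matters when injuries are rare. The numbers are hypothetical and are not drawn from any study cited here: assume an injury that strikes each exposed patient independently with probability p = 1/10,000, and compare a study of n = 3,000 patients with a database covering 1,000,000 exposed patients.

```latex
\begin{align*}
\Pr(\text{at least one case among } n \text{ patients}) &= 1 - (1 - p)^{n} \\
n = 3{,}000:\quad 1 - (1 - 10^{-4})^{3000} &\approx 1 - e^{-0.3} \approx 0.26 \\
\text{expected cases} = np:\quad 3{,}000 \times 10^{-4} &= 0.3, \qquad 1{,}000{,}000 \times 10^{-4} = 100
\end{align*}
```

Under these assumptions the smaller study would most likely observe no cases at all, while the larger population would be expected to yield about one hundred, which is closer to what a comparison of injury rates between users and non-users would need.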

    The various entities that hold raw health data, such as insurers and healthcare providers, do not maintain patients' health information in a standardized format. (15) Each dataset must be translated into a common format before data from different sources can be linked longitudinally (for individual patients) or meaningfully compared (across different patients). This process requires significant inputs of skilled labor and information infrastructure such as software systems. (16) Once the datasets are in a common format, they must be brought together so that they can be used to answer specific questions about medical product safety.
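
The translation and linking steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source layouts, the field names, and the `Event` common format are all hypothetical, and the sketch assumes that both data holders already use the same patient identifier, which real data holders generally do not.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """Hypothetical common format: one healthcare encounter."""
    patient_id: str
    when: date
    kind: str    # "diagnosis", "treatment", or "outcome"
    detail: str


def from_insurer(row: dict) -> Event:
    # Assumed insurer layout: a dict with an ISO-format service date.
    return Event(row["member"], date.fromisoformat(row["svc_date"]),
                 row["type"], row["code"])


def from_provider(row: tuple) -> Event:
    # Assumed provider layout: (patient, (year, month, day), kind, note).
    pid, (year, month, day), kind, note = row
    return Event(pid, date(year, month, day), kind, note)


def link_longitudinally(events):
    """Group events by patient and sort each history chronologically."""
    records = defaultdict(list)
    for event in events:
        records[event.patient_id].append(event)
    for history in records.values():
        history.sort(key=lambda e: e.when)
    return dict(records)


insurer_rows = [
    {"member": "p1", "svc_date": "2012-06-15", "type": "outcome", "code": "liver injury"},
    {"member": "p1", "svc_date": "2012-01-10", "type": "diagnosis", "code": "hypertension"},
]
provider_rows = [
    ("p1", (2012, 2, 1), "treatment", "drug X prescribed"),
    ("p2", (2012, 3, 5), "diagnosis", "asthma"),
]

events = [from_insurer(r) for r in insurer_rows] + [from_provider(r) for r in provider_rows]
records = link_longitudinally(events)
for event in records["p1"]:
    print(event.when, event.kind, event.detail)
```

Even in this toy version, each source needs its own hand-written converter, which hints at the skilled labor and software the paragraph above describes.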

    Various system architectures can bring large amounts of data together. (17) Perhaps the most obvious approach is to deposit the data in a large, central database. Centralization of the large amount of data required by medical product safety surveillance, however, has various practical and privacy disadvantages. (18) An alternative seen in many recent postmarketing drug safety surveillance systems is to adopt a distributed network architecture. (19) In a distributed network, individuals' health data stays at its original location, such as a healthcare provider's or insurer's database. These data holders link their datasets together virtually by converting their data into an agreed common format and responding to queries using their respective data assets; these piecemeal responses are then aggregated into an integrated response to the question at hand. (20) In a distributed data network, data-holding institutions are not just suppliers of data; they also supply services. (21) The services involve tasks such as: searching through their records to identify data relevant to a specific query; retrieving that information and converting it into the agreed common format; studying it to develop responses to the query; and transmitting the results. (22)
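
A minimal Python sketch of the distributed pattern just described follows. It is not a description of how Sentinel or any other real system is implemented: the `DataHolder` class, the record fields, and the product and injury names are hypothetical. The sketch also assumes that records are already in the agreed common format and that no patient appears at more than one data holder; otherwise the summed counts would double-count.

```python
from dataclasses import dataclass


@dataclass
class DataHolder:
    """Hypothetical data holder whose records never leave its own system."""
    name: str
    records: list  # local records, assumed to be in the common format

    def answer(self, product: str, injury: str) -> dict:
        # Search local records and return only aggregate counts.
        exposed = {r["patient"] for r in self.records
                   if r["product"] == product}
        injured = {r["patient"] for r in self.records
                   if r["product"] == product and r["injury"] == injury}
        return {"exposed": len(exposed), "injured": len(injured)}


def run_query(holders, product: str, injury: str) -> dict:
    """Send the same query to every holder and sum the piecemeal responses."""
    total = {"exposed": 0, "injured": 0}
    for holder in holders:
        part = holder.answer(product, injury)
        for key in total:
            total[key] += part[key]
    return total


insurer = DataHolder("insurer", [
    {"patient": "a", "product": "drug_x", "injury": "none"},
    {"patient": "b", "product": "drug_x", "injury": "liver_injury"},
    {"patient": "c", "product": "drug_y", "injury": "none"},
])
clinic = DataHolder("clinic", [
    {"patient": "d", "product": "drug_x", "injury": "none"},
])

result = run_query([insurer, clinic], "drug_x", "liver_injury")
print(result)
```

Only the counts travel to the coordinator; the searching, retrieving, and summarizing are performed by each data holder, which is the sense in which holders supply services as well as data.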

    Either option--creating centralized data assets or bringing data together virtually--requires significant investments of labor, information infrastructure, and, of course, money. It is overly simplistic to portray raw patient health data as valuable information assets in themselves. Raw data are made valuable, for purposes of postmarketing safety surveillance, only by investing labor and developing information infrastructure to facilitate the operations just described.

    A critical question is how to incentivize the necessary investments in information infrastructure so that the needed data resources will be available on a sustainable basis now and in the future. Compulsory approaches to data access are one way to procure data to use in postmarketing surveillance: simply force data holders to make their data available for these activities. This approach echoes Professor Marc Rodwin's proposal to enact legislation requiring data holders to report data in de-identified form for creation of a publicly owned national database to support various research and public health activities. (23) Unfortunately, collecting raw, de-identified health data in a public database would not by itself create a useful information asset for postmarketing surveillance. To make the data useful, data holders also would need to supply services and develop the infrastructure to convert their data into a common format before reporting the data. Compulsory provision of services--forcing civilians to do work for the government--is fraught with legal problems in the United States' system of law. (24) Moreover, if each data holder reports the raw data in de-identified form, the centralized database will not be able to link the data longitudinally because at least some identifying information must be shared in order to establish that data received from various sources relate to the same individual. (25) Compulsory sharing of health data in identified form, on the other hand, would raise serious privacy concerns.

    An additional critique of compulsory approaches is that they may not ensure sustainable access to data over the long term. Such approaches favor static efficiency over dynamic efficiency: static efficiency focuses on how best to allocate rights to data that already exist today, whereas dynamic efficiency focuses on ensuring adequate supplies of data for the future. (26) Entities that hold rich stores of health data often have invested substantial sums of private capital to develop their datasets and related infrastructures. Forcing them to donate their data may be statically efficient in the sense of meeting today's immediate needs for data, but it also may destroy incentives for them to invest in developing data for future uses.

    Expropriating private assets has not been an effective approach in other infrastructure industries where it has been tried. The U.S. is the only nation that kept its large energy and resource infrastructures, such as oil and gas pipelines, under private ownership throughout the twentieth century. (27) Many nations around the world placed such assets under governmental ownership during the middle decades of the twentieth century. (28) As that century ended, however, governments in many nations were backing away from public ownership of these assets in an effort to restore investment incentives and efficiencies lost through the earlier expropriations. (29) Forcing infrastructure investors to donate their information assets and services to the public understandably diminishes the incentives to develop future assets. For this reason, compulsory data access schemes seem unlikely to ensure sustainable supplies of data and information infrastructure to support postmarketing surveillance activities on a long-term basis. And medical product safety is, if anything, a long-term problem that requires ongoing study; it is not susceptible to a one-time solution.

    Because compulsory data access schemes have so many problems, there is ongoing interest in non-compulsory (voluntary) approaches. These can include both donative and market-oriented approaches. Donative approaches rely on the goodwill of data holders to make their data available gratis for use in postmarketing surveillance activities. Market-oriented approaches are still voluntary in the sense that data holders are not forced to supply data and related services, but these approaches harness economic incentives to help overcome the natural human reluctance to share. The Health Information Technology for Economic and Clinical Health (HITECH) Act, (30) enacted in 2009 as part of economic stimulus legislation, (31) called for changes to the HIPAA Privacy Rule. HITECH recognized that donative approaches to data access, by themselves, may be unable to meet the needs of important research and public health activities that require access to health data.

    The Office for Civil Rights (OCR) within the U.S. Department of Health and Human Services (HHS) initiated proceedings (32) in 2010 to modernize the HIPAA Privacy Rule...
