THE ETHICS IN SYNTHETICS: STATISTICS IN THE SERVICE OF ETHICS AND LAW IN HEALTH-RELATED RESEARCH IN BIG DATA FROM MULTIPLE SOURCES.

Author: Bassan, Sharon
PositionBASSAN AND HAREL, THE ETHICS IN SYNTHETICS
  I. INTRODUCTION
  II. A DELICATE EQUILIBRIUM
  III. DIFFERENT SCOPES OF PROTECTIVE REGULATIONS
  IV. AN AUTHORIZATION TO USE INFORMATION FOR HEALTH-RELATED RESEARCH
  V. POSSIBLE EXEMPTIONS FROM THE AUTHORIZATION REQUIREMENT
  VI. SYNTHETIC DATA AS A MEANS TO FULFILL ETHICAL REQUIREMENTS
  VII. A NEW RISK-BENEFIT BALANCE
  VIII. CONCLUSION

  1. INTRODUCTION

    An ethical advancement of scientific knowledge demands a delicate equilibrium between benefits and harms, particularly in health-related research. Under Article 4 of UNESCO's Universal Declaration on Bioethics and Human Rights, applying and advancing scientific knowledge or technologies in an ethically justifiable way requires maximizing direct and indirect benefits and minimizing possible harms. (1) The National Institutes of Health [NIH] Data Sharing Policy and Implementation Guidance similarly states that data necessary for drawing valid conclusions and advancing medical research should be made as widely and freely available as possible (in order to share the benefits), while safeguarding the privacy of participants from potentially harmful disclosure of sensitive information. (2) This paper discusses the challenges of maximizing research benefit and minimizing potential harms in the unique context of health-related research in Big Data from multiple sources, which are differently protected by the law.

    Part I frames the ethical dilemma by discussing potential benefits and harms. It shows the constant tension, in health-related research in Big Data from multiple sources, between the benefits of using confidential information for scientific purposes and the value of keeping that information confidential. In Part II, the paper addresses existing regulations, their nature, and their legal coverage, and highlights the challenges that arise when combining data from multiple sources that are differently protected by the law. Part III compares different requirements for consent or authorization to use persons' health information for research, focusing on how difficult it is for existing regulation to ensure those requirements are met when multiple sources of data are used. Part IV investigates whether exemptions from the authorization requirement could apply to information that falls outside the protection of HIPAA and the Protection of Human Subjects Regulations. In Part V, the paper proposes a solution of a statistical nature, using the method of synthetic data to balance conflicting considerations. Part VI shows how the use of synthetic data can overcome some of the ethical challenges.

  2. A DELICATE EQUILIBRIUM

    The term "Big Data" is defined differently by users and policy makers. It means dramatically different things to the media, business, health, and academic statistics communities, and to different regulatory bodies. (3) To our knowledge, there is no gold standard definition. Big Data is often described as data on a massive scale in terms of volume, intensity, and complexity that exceed the ability of standard software tools to manage and analyze. (4) It has also been characterized by capability rather than size: "It is less about data that is big than it is about a capacity to search, aggregate, and cross-reference large data sets." (5) In the Big Data analytics world, Laney coined the 3V definition: volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources). (6)

    As life is being recorded and quantified in ways hard to imagine a decade ago, there is great promise in Big Data research, in particular for health purposes. The literature often addresses medical records as the source for health-related research in Big Data. (7) For example, electronic records document multiple aspects of medical care: quantitative and qualitative patient data, imaging records, providers' documentation of health care delivery (medication and other services), narratives, and genetic information, all of which provide important information on a person's physical condition. (8) But as many details of our lives are documented and easily available for analysis, a variety of nontraditional or even unstructured data types contain different kinds of health-related information, which is combined with traditional medical databases. Medical or genetic data can be connected to data found in multiple sources: social media, surveillance videos, education, military service, exercise regimens, credit card payments for physician visit co-pays, visits to alternative practitioners, over-the-counter medications, home testing products, tobacco products, diet habits, or leisure time preferences. (9) Since a single database may not provide a complete picture of a patient's condition or health history, combining information from multiple sources is often necessary, and it enables kinds of research that were not previously possible with traditional methods applied to a narrow spectrum of samples. (10)

    On the one hand, health-related research combining multiple sources of Big Data offers the potential to explore hidden structures in the data and to extract important common features across data sets, in order to derive accurate results on complex questions in real time. (11) Research can find correlations between multiple contextual variables found in public databases without having to interview a single patient. It contributes to a better understanding of people's lifestyles by creating an observational and even dynamic analysis, even when there are significant individual variations. (12) Such research can open the door to the promising world of personalized medicine and bring each individual customized treatments based on evidence drawn from their own life. It can improve the efficiency of health care delivery and public health decisions through standardized care and advance medicine, while saving costs nationally and globally.

    On the other hand, there is a constant conflict between the benefit of using multiple sources of information and the value of preserving the confidentiality of medical information. The need to maintain patients' privacy is an ethical obligation inherent in the physician-patient relationship, believed to be essential to better medicine, from diagnosis to treatment. The rationale underlying the doctrine of confidentiality of medical information is to enable patients to benefit from free and open communication regarding their status. Since information is essential to treatment, it is no wonder that health service providers have access to all or portions of a patient's health records. Health care providers nonetheless have a duty to avoid disclosing the medical information they obtain. In legal terms, the patient has a right of privacy, which aims to restrict the disclosure of confidential information. (13)

    While the classical physician-patient model requires that information be kept confidential, the more health information is present in electronic databases, the harder it is to maintain the privacy of the individuals who are the subject of that information, and the greater the potential for misuse. Evidence of confidentiality breaches exists in State agencies in the US, in healthcare organizations, (14) and in private organizations. (15)

    Free communication is altered when patients fear that their sensitive health information might be electronically disclosed. Such fear may compromise the health care they seek. As studies show, for fear of disclosure to unauthorized persons, patients may withhold information, giving an incomplete or misleading description of their condition. (16) In recent surveys, a substantial number of people said they would withhold data from their physician due to privacy concerns related to technology. (17) Patients concerned with privacy violations are also less likely to seek care or return for follow-up treatment. They may seek care outside of their provider network, compromising the benefits of care coordination. (18)

    Achieving ideal privacy, or attempting to eliminate all possible breaches of confidentiality, would deprive society of the benefits inherent in research. However, researchers are often unaware of potential harms, especially given the large presence of health information in different database sources, some of which is voluntarily provided by users. (19) The next chapter reviews existing regulations, their nature, and their legal coverage. It highlights the challenges that arise when combining data from multiple sources that are differently protected by the law.

  3. DIFFERENT SCOPES OF PROTECTIVE REGULATIONS

    In the case of health-related research in Big Data, policies and regulations should secure two dimensions of ethics. The first, focused on harms, is the protection of the information itself and of the subjects whose information is used. The second, focused on benefit, is that the benefit from the research outweighs potential harms. The first dimension relates to the confidentiality of personal information held in databases, whether or not they are health-related, Big Data, or publicly available. The second dimension addresses the equilibrium between privacy concerns and the benefits of the research. Research in health-related data from multiple sources jeopardizes both dimensions.

    The legal frameworks addressing the release of data for health-related research purposes differ in their levels of protection, for example in the scope of the information covered and in the disclosures permitted: identified and de-identified information, information held by entities covered or not covered by the laws, different uses of information, etc. This paper focuses on the HHS Protection of Human Subjects Regulations and the Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA), which represent the regulatory framework of health-related research on the one hand, and of the operation of personal health information on the other. (20)

    The Protection of Human Subjects Regulations are the leading...
