Ghost in the network.

Author: Derek E. Bambauer
Position: III. The Known Unknowns through Conclusion, with footnotes, pp. 1050-1091

    Known security bugs are everywhere.

    Most hackers exploit known vulnerabilities. Examples are legion. The massive breach of Heartland Payment Systems occurred because the payment processor's website was vulnerable to an SQL injection attack--a widely known flaw. (283) Heartland is in good company: a study by Hewlett-Packard (HP) found that 86% of tested web applications were vulnerable to such injection attacks. (284) Sony's PlayStation Network was hacked because the company used an outdated, unpatched version of the Apache web server software. (285) Wyndham Hotels lost data on 500,000 accounts to hackers in Russia because its servers used default credentials and passwords and because the chain stored credit card data in plain text. (286) The SCADA systems that control many public utilities' operations often use default passwords (287)--in fact, one study found 7200 SCADA systems connected to the Internet with default passwords. (288) A significant fraction of SAP enterprise resource planning (ERP) servers are exposed to the Internet, even though the software is increasingly targeted by attacks that exploit known weaknesses. (289) The average website examined in a 2011 study contained 79 serious security vulnerabilities such as cross-site scripting weaknesses; this was actually a dramatic improvement over 2010, when the average number of flaws was 230. (290)
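To make concrete how well understood the Heartland-style flaw is, the following is a minimal, hypothetical Python sketch of an SQL injection and its standard remedy. The table, data, and function names are invented for illustration; the vulnerable version splices user input directly into the SQL string, while the safe version uses a parameterized query.

```python
import sqlite3

# Hypothetical one-table database standing in for a payment system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, card TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', '4111-1111-1111-1111')")

def lookup_vulnerable(name):
    # UNSAFE: user input is pasted directly into the SQL string.
    # The input "nobody' OR '1'='1" makes the WHERE clause always
    # true, dumping every row -- the classic injection pattern.
    query = "SELECT card FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()

def lookup_safe(name):
    # SAFE: a parameterized query treats the input as data, not as SQL.
    return conn.execute(
        "SELECT card FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "nobody' OR '1'='1"
print(len(lookup_vulnerable(malicious)))  # 1 -- attacker sees all cards
print(len(lookup_safe(malicious)))        # 0 -- no user has that name
```

The fix has been standard practice for well over a decade, which is the Article's point: the vulnerability persists by choice, not for lack of a known solution.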

    Remediation is slow. Companies often take nearly a year to apply patches supplied by software vendors, leaving them open to attacks. (291) Experts estimate that firms using SCADA software--including many in critical infrastructure areas such as energy utilities--apply only 10%-20% of available patches. (292) The availability of the patches indicates that solutions to these security problems are known and available to firms. Though implementing those solutions may be expensive, time-consuming, or complex, it is possible. To a significant degree, users of information technology remain vulnerable to known unknowns--attacks against previously described and patched vulnerabilities--by choice. As described above, this choice may be rational for each actor in purely economic terms, but it results in unacceptable levels of insecurity for society.

    Building on prior work in information-based cybersecurity, (293) and applying insights from normal accident theory, this Article proposes a pair of design principles to guide regulation of the known unknowns: disaggregation and heterogeneity (or, more colloquially, "divide and differ"). The regulatory goal is for organizations to implement these principles in their computer systems. Disaggregation splits information into multiple, separated data stores. The loss of any single store, or perhaps several of them, does not confer all of an organization's information upon an attacker. Heterogeneity requires that organizations use multiple types of hardware and software: UNIX with Windows Server for operating systems, MySQL with IBM DB2 for databases, or Juniper equipment with Cisco for routers. A successful attack on any variant of hardware or software will not compromise the entire infrastructure (and hence information) for an organization. (294) In concert, these principles seek to reduce the effect of a cybersecurity failure rather than to prevent it. In normal accident theory's terms, they seek to make the system less tightly coupled. (295) They operate in parallel with efforts to reduce the likelihood of such a failure by making systems less vulnerable and by detecting and interdicting attacks when they occur. This aspect of cybersecurity--resilience--is significantly underaddressed by both scholars and policymakers.

    A. Resilience

    Disaggregation and heterogeneity increase resilience. (296) Spreading information across multiple data stores and using multiple versions of operating systems, hardware, and applications does little to prevent cyberattacks. Indeed, it may provide hackers with a greater attack surface: there are more locations for weaknesses such as misconfiguration to occur, and there are more lines of code to comb for exploits. (297) For cybersecurity scholars and policymakers fixated on stopping attacks, such an approach is anathema.

    This risk-spreading approach, however, is perfectly in line with normal accident theory. (298) Inevitably, software bugs occur, and hackers discover how to exploit them. (299) There will be attacks--successful ones--against information technology systems. The key is to reduce the effects of these attacks. This lessens the harm they cause and makes hacking less attractive by reducing its payoff. Charles Perrow cites marine shipping as a similar instance where the configuration of incentives (as with cybersecurity's externalities problems) leads inexorably to accidents. (300) The approach this Article proposes, by analogy, is not to increase the vigilance of captains or the accuracy of nautical maps. It is to ensure that ships are built with watertight compartments so that an accident does not sink the vessel. (301)

    The divide-and-differ strategy reduces the effects of successful attacks or breaches. The two pillars of the strategy also reinforce one another. Partitioning information into multiple repositories forces an attacker to compromise several systems to gain access to all of an organization's data. (302) This approach is particularly effective if related files--such as the plans for the Joint Strike Fighter or elements of a bank customer record--are dispersed across the data stores. (303) Then, an attacker must break into more than one system to gain access to one set of files. Employing different elements within an IT system, such as multiple operating systems, makes any single, successful attack less effective. (304) If the files are dispersed across multiple locations, each with a different code base, the attacker will need multiple exploits to gain access to or alter the data.

    In normal accident theory terms, using heterogeneous systems (variegated code and hardware) makes those systems less tightly coupled. (305) A failure in one part of the system affects fewer of the other parts than if the system were homogeneous. (306) This limits the spread, and therefore the effects, of such a failure. While introducing more components into the system could make it more interactively complex, which increases the risk of error, this possibility is mitigated by disaggregation. (307) Under disaggregation, IT systems would be effectively partitioned, reducing the linkage between components in the different data stores. An exploit compromising a Windows-based computer in one data warehouse is less likely to have a spillover effect on a UNIX-based computer in a second warehouse when those two repositories are separated. (308) Thus, heterogeneity decreases system coupling and increases the complexity of system interactivity. Nonetheless, the boost in interactivity is likely offset by the disaggregation part of the known-unknowns strategy.

    Splitting an organization's information into multiple parts, storing them separately, and hosting them on different software and hardware may increase the risk of a successful attack, but it greatly decreases that attack's payoff.

    1. Disaggregation: Divide and Conquer

      Disaggregation reduces the harm that accrues from a successful attack against a system by dividing data stores--information--into multiple, separate components. This reduces the payoff from a breach: a hacker gains only a fraction of the organization's information from breaking into one data store. Disaggregation also increases the cost of the attack: the hacker must compromise multiple systems to access the entirety of the data.

      Though data storage is increasingly spread across multiple computers, such as in cloud computing, these systems are designed such that the information appears to be in one location. (309) For example, even if a database is spread across several servers in Amazon's (310) or Google's (311) cloud computing platforms, it looks like a seamless whole to users. Each part of the database is linked to, and accessible by, every other part. (312) Disaggregation breaks those links. Parts of the data store are isolated from one another, logically and perhaps physically. (313) Gaining access to one part does not confer access to any other part.
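The mechanics of breaking those links can be sketched in a few lines of Python. This is an illustrative toy, not a production design: the three dictionaries stand in for independent, separately secured databases, and all names and fields are hypothetical. A single record is split so that no one store holds a complete picture.

```python
import uuid

# Three logically separate stores (stand-ins for independent databases).
identity_store = {}   # who the customer is
account_store = {}    # what the customer holds
payment_store = {}    # how the customer pays

def store_record(name, balance, card):
    # An opaque key links the fragments without revealing anything itself.
    rid = str(uuid.uuid4())
    identity_store[rid] = {"name": name}
    account_store[rid] = {"balance": balance}
    payment_store[rid] = {"card": card}
    return rid

def full_record(rid):
    # Authorized use must re-join the fragments -- the cost of the design.
    return {**identity_store[rid], **account_store[rid],
            **payment_store[rid]}

rid = store_record("alice", 1200, "4111-1111-1111-1111")

# A breach of the payment store alone yields card numbers with no names:
# the attacker's payoff is a fraction of the organization's data.
print(payment_store[rid])
```

Note the trade-off the sketch makes visible: legitimate queries must reassemble the fragments, while an attacker who compromises one store gains only an unlabeled slice of the data.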

      There are good examples of disaggregated systems currently in use, such as Mini-Sentinel. The Food and Drug Administration (FDA) must perform surveillance of prescription drugs after they are approved for marketing in the United States. (314) The enabling legislation requires the FDA to avoid revealing individually identifiable health information when queries are made on the resulting data. (315) To comply, the FDA built Mini-Sentinel. (316) Data sources, such as pharmaceutical companies, retain the postapproval data and structure it in standardized form. (317) Both the FDA and the data sources use Mini-Sentinel's software. (318) Queries on the data are submitted from the Mini-Sentinel Center to the data sources, which return summary responses to queries. (319) Mini-Sentinel was designed to respond to privacy concerns, but it also provides a compelling model for security worries. (320) A breach of any single data source compromises only that entity's information. And, an attack against the querying computer at the Mini-Sentinel Center can only obtain summary information, rather than individually identifiable data.
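The Mini-Sentinel query pattern can be sketched as follows. This is a simplified, hypothetical model of a federated summary query, with invented data and field names: each source runs the query locally and returns only aggregate counts, so the central node never holds individually identifiable records.

```python
# Each data source keeps its own patient-level records locally.
# The records below are entirely fictitious, for illustration only.
source_a = [{"patient": "p1", "drug": "X", "adverse_event": True},
            {"patient": "p2", "drug": "X", "adverse_event": False}]
source_b = [{"patient": "p9", "drug": "X", "adverse_event": True}]

def summary_query(records, drug):
    # Runs at the data source; only an aggregate summary leaves the site.
    matching = [r for r in records if r["drug"] == drug]
    return {"n": len(matching),
            "events": sum(r["adverse_event"] for r in matching)}

def central_query(sources, drug):
    # The center sees only per-source summaries, never patient rows,
    # so a breach of the central node yields no identifiable data.
    totals = [summary_query(s, drug) for s in sources]
    return {"n": sum(t["n"] for t in totals),
            "events": sum(t["events"] for t in totals)}

print(central_query([source_a, source_b], "X"))  # {'n': 3, 'events': 2}
```

As the Article observes, the same structure that protects privacy also limits the payoff of an attack: compromising any single source exposes only that source's records, and compromising the center exposes only summaries.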

      The benefit of disaggregation--reducing the payoff of an attack and forcing an attacker to gain control over more systems--comes at two costs. First, using the data (accessing it, altering it, or both) for authorized purposes requires that the isolated parts of the information be joined, at least in part or intermittently. (321) Performing analysis on large datasets thus becomes computationally more expensive and depends on reliable connectivity between the subsets. (322)

      Analysis of the...
