Author: Bryant Walker Smith

CONTENTS

I. Introduction
II. DALL-E, Machine Learning, and Image Generation
III. Nondeterministic Systems
IV. Palsgraf v. Long Island Railroad Co.
V. Cardozo's Facts
VI. Andrews's Facts
VII. Some Human-Generated Thoughts
VIII. Conclusion: Cardozo Versus Andrews
IX. Postscript: ChatGPT

I. INTRODUCTION

What happens when we ask a leading artificial intelligence (AI) tool for image generation to illustrate the facts of a leading law school case? This article does just that. I first introduce this tool specifically and machine learning generally. I then summarize the seminal case of Palsgraf v. Long Island Railroad. For the main event, I show the images that the tool created based on the facts as the majority and dissent recount them. Finally, I translate this exercise into lessons for how lawyers and the law should think about AI.

II. DALL-E, MACHINE LEARNING, AND IMAGE GENERATION

    DALL-E is a computer tool that generates photorealistic images based on text supplied by the user. (1) For example, in response to the phrase "a fancy law school classroom with a cat professor," DALL-E created these four original images:

    [Images not reproduced in this version.]

    DALL-E is developed and maintained by the OpenAI organization, which seeks "to ensure that artificial general intelligence (AGI)--by which we mean highly autonomous systems that outperform humans at most economically valuable work--benefits all of humanity." (2) DALL-E itself is an example of specific artificial intelligence rather than AGI, which does not yet exist. (3) When this article was written, access to DALL-E was by invitation only and conditioned on adherence to OpenAI's content policy. (4) It is now available publicly. Similar tools are also available. (5)

    In general, tools for recognizing or generating images initially learn by processing huge numbers of images that are each linked in some way with descriptive text. (6) Through this training, these artificial neural networks develop relationships between various visual and textual elements to create what is in effect a much more complex and dynamic version of a thesaurus. (7)

    There is a range of approaches to training these neural networks. In a traditional model of supervised learning, humans manually label each image in the training dataset. One popular dataset, ImageNet, "required over 25,000 workers to annotate 14 million images for 22,000 object categories." (8) Amazon's Mechanical Turk is an example of a platform that connects developers with workers who are paid for each image that they "tag" by "writ[ing] three words or short phrases that summarize its contents." (9)

    Alternatives to this formal labeling make use of potential relational information already available on the web, courtesy of its ordinary human users. (10) Billions of internet photos are connected in some way with text--descriptive captions, hashtags, alternative labels for accessibility, filenames, and metadata--that can offer clues to their content and meaning. (11) Developers of a DALL-E building block, for example, "constructed a new dataset of 400 million (image, text) pairs collected from a variety of publicly available sources on the internet." (12) This is known as "scraping." In what is often called semisupervised or unsupervised learning, (13) a neural network can then develop its own understanding of these scraped data.

    Once a neural network has begun to develop the requisite associations, it can apply its training to data outside its original training dataset. It may be directed to classify additional images, and its performance may be compared to previous models or evaluated by humans. For example, ordinary internet users who prove they are human by completing Google's reCAPTCHA prompts (e.g., "Select all images with crosswalks") confirm or challenge labels provisionally assigned to the reCAPTCHA images. (14) Neural networks that incorporate this feedback are engaging in what is called reinforcement learning--analogous to how one might train a dog to play fetch. (15)

    Image generation tools build from these associations between concepts and visual elements. DALL-E "uses a process called 'diffusion,' which starts with a pattern of random dots and gradually alters that pattern towards an image when it recognizes specific aspects of that image." (16) To offer a very rough analogy, this is like an enormous game of Battleship in which initially wild guesses are iteratively refined based on feedback. This randomness also means that DALL-E will generate different images every time it runs--even in response to the exact same prompt.
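    The guess-and-refine idea behind that Battleship analogy can be sketched in a few lines of code. The following toy program is an illustration of iterative refinement under feedback, not DALL-E's actual diffusion algorithm: it starts from random "noise," proposes small random changes, and keeps only the changes that move the guess closer to a target. All names and numbers here are hypothetical.

```python
import random

def refine_toward(target, steps=2000, seed=0):
    # Start from random "noise" and repeatedly propose small random changes,
    # keeping each change only if it moves the guess closer to the target.
    rng = random.Random(seed)
    guess = [rng.random() for _ in target]
    for _ in range(steps):
        i = rng.randrange(len(target))
        proposal = guess[i] + rng.uniform(-0.1, 0.1)
        if abs(proposal - target[i]) < abs(guess[i] - target[i]):
            guess[i] = proposal  # feedback says the guess improved, so keep it
    return guess

target = [0.2, 0.8, 0.5]
print(refine_toward(target))  # values close to [0.2, 0.8, 0.5]
```

    Because the starting noise and the proposed changes are random, running this sketch with a different seed follows a different path to the target, just as DALL-E produces different images on each run.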

III. NONDETERMINISTIC SYSTEMS

    Such a system, in which identical inputs can produce varying outputs, is called nondeterministic. (17) In contrast, a calculator is deterministic: Entering "1" and then "+" and then "1" will always return "2." Due to the randomness inherent in its operation, a nondeterministic system can change its "best guess" with each run. When such a system is explicitly probabilistic, it may also be able to express a degree of confidence in its guess--roughly analogous to how a meteorologist might predict a "90% chance of rain." Even if the probabilistic prediction ("90% chance of rain") is correct, a binary prediction derived from it ("it will rain") will occasionally be incorrect.
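    The calculator contrast can be made concrete with two toy functions, one deterministic and one nondeterministic. Both functions below are hypothetical illustrations, not models of any real system.

```python
import random

def add(a, b):
    # Deterministic: identical inputs always produce the identical output.
    return a + b

def generate(prompt):
    # Nondeterministic (illustrative only): a random draw influences the
    # output, so the same prompt can yield different results on each run.
    return prompt + " -> variant " + random.choice("ABCD")

assert add(1, 1) == 2             # holds on every run
print(generate("cat professor"))  # may differ between runs
```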

    These concepts are familiar to law itself--a nondeterministic system that often pretends otherwise. The "preponderance of the evidence" or "more likely than not" standard common in civil litigation implies confidence greater than 50%. The "beyond a reasonable doubt" standard common in criminal law, while not expressed to juries as a probability, is nonetheless described academically as something like 95% confidence. (18) In both instances, this means that a fact finder will at least occasionally be wrong--hence Blackstone's famous adage that "the law holds that it is better that 10 guilty persons escape, than that 1 innocent suffer." (19)
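    A short simulation makes the point that even a perfectly calibrated fact finder errs. Suppose, purely for illustration, that every verdict is reached at exactly 95% confidence and that this confidence is accurate; roughly one verdict in twenty will still be mistaken.

```python
import random

# Simulate verdicts reached at exactly 95% confidence. Even when that
# confidence is perfectly calibrated, about 5% of verdicts are mistaken.
random.seed(0)
trials = 100_000
confidence = 0.95
mistakes = sum(random.random() >= confidence for _ in range(trials))
print(mistakes / trials)  # roughly 0.05
```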

    This adage also helps to illustrate two concepts important to the evaluation of any system. A false positive is the assertion that something is present when in fact it is absent: declaring the guilt of a person who is actually innocent, diagnosing a disease that a patient does not actually have, or perceiving a "phantom" child in the road who is not actually there. (20) In contrast, a false negative is the assertion that something is absent when in fact it is present: declaring the innocence of a person who is actually guilty, failing to diagnose a disease that a patient actually has, or failing to perceive a real child who is actually in the road. When the ground truth is known or assumed, (21) a system's performance can be described in terms of its false positives and false negatives. (22)
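    When the ground truth is known, counting false positives and false negatives is mechanical. The sketch below uses invented verdict data (True meaning "guilty," or equally "disease present" or "child in the road") to show how a system's predictions are scored against that truth.

```python
def evaluate(predictions, ground_truth):
    # A false positive asserts presence where the truth is absence;
    # a false negative asserts absence where the truth is presence.
    fp = sum(1 for p, t in zip(predictions, ground_truth) if p and not t)
    fn = sum(1 for p, t in zip(predictions, ground_truth) if not p and t)
    return fp, fn

truth = [True, False, True, False, True]   # hypothetical ground truth
preds = [True, True, False, False, True]   # hypothetical system output
print(evaluate(preds, truth))  # (1, 1): one false positive, one false negative
```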

    Importantly, reducing false negatives may mean increasing false positives--and vice versa. (23) Replacing the "beyond a reasonable doubt" standard with a "preponderance of the evidence" standard in criminal trials would decrease the number of guilty defendants who are acquitted (false negatives) while increasing the number of innocent defendants who are convicted (false positives). Increasing the sensitivity of a Covid-19 test would reduce the share of results that are falsely negative while increasing the share of results that are falsely positive. An automated emergency braking system that detects every real object in the road might also stop suddenly for "phantom" objects. (24) A railroad on Long Island that seeks to avoid assisting anyone carrying dynamite might fail to assist others who are not.
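    The tradeoff described above can be shown directly by moving a decision threshold over the same evidence. The "evidence strength" scores below are invented for illustration; the only claim is structural: lowering the threshold from roughly "beyond a reasonable doubt" to roughly "preponderance" trades false negatives for false positives.

```python
def classify(scores, threshold):
    # Convict (predict True) whenever the evidence score meets the threshold.
    return [s >= threshold for s in scores]

truth  = [True, True, True, False, False, False]   # hypothetical guilt
scores = [0.97, 0.80, 0.55, 0.60, 0.40, 0.20]      # hypothetical evidence

for threshold, label in [(0.95, "beyond a reasonable doubt"),
                         (0.50, "preponderance of the evidence")]:
    preds = classify(scores, threshold)
    fp = sum(p and not t for p, t in zip(preds, truth))
    fn = sum(t and not p for p, t in zip(preds, truth))
    print(f"{label}: {fp} false positives, {fn} false negatives")
```

    On this invented data, the stricter threshold produces no false positives but two false negatives, while the looser threshold eliminates the false negatives at the cost of a false positive.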

IV. PALSGRAF V. LONG ISLAND RAILROAD CO.

    Palsgraf v. Long Island Railroad Company is one of the most famous court cases in all of law school. (25) It is the subject of numerous academic articles, (26) videos, (27) cartoons, (28) and songs. (29) Many of these focus on the case's somewhat bizarre facts, the recitation of which is surely unnecessary for many lawyers.

    Nonetheless, here are those facts (maybe). (30) While helping a would-be passenger board a moving train, a Long Island Railroad employee knocked a package to the ground. (31) The fireworks concealed in the package exploded, and either this explosion or the panic of the crowd toppled a platform scale, which seriously injured Helen Palsgraf. (32)

    When the case eventually reached New York's highest court, its famous chief judge, Benjamin Cardozo, and one of his colleagues, William Andrews, disagreed about whether the defendant railroad owed a duty to the plaintiff. (33) Writing for the majority, Cardozo declared that the railroad could not be liable to Palsgraf because it had done nothing wrong to her: "The conduct of the defendant's guard, if a wrong in its relation to the holder of the package, was not a wrong in its relation to the plaintiff standing far away. Relatively to her it was not negligence at all." (34) In dissent, Andrews articulated a more expansive vision of duty. "Every one owes to the world at large the duty of refraining from those acts that may unreasonably threaten the safety of others." (35) The railroad had breached this duty by its employee's careless dislodging of the package, and the jury had reasonably concluded that this breach was a proximate cause of Palsgraf's injuries. (36)

    While Cardozo's view prevailed in the case (to the detriment of Palsgraf herself), Andrews has largely prevailed in modern common law. (37) Many state courts as well as the authors of the Restatement (Third) of Torts have adopted his view of duty. (38) Moreover, his characterization of the limits of liability is itself classic: "[B]ecause of convenience, of public policy, of a rough sense of justice, the law arbitrarily declines to trace a series of events beyond a certain point. This is not logic. It is practical politics." (39) At the same time, the broader issues at the heart of these two opinions remain contested: What are the appropriate limits on the liability of a defendant for their unreasonable conduct? Should these...
