From Warhol to War on Hal: Copyright Infringement and Fair Use as Applied to Artificial Intelligence After the Supreme Court's Warhol Decision

Jurisdiction: United States, Federal
Citation: Vol. 6 No. 6
Publication year: 2023

[Page 409]

Richard A. Crudo, Ivy Clarice Estoesta, and William H. Milliken *


This article describes copyright infringement and fair use issues surrounding generative artificial intelligence and, in particular, the recent lawsuits against Stability AI's art generator.

Generative artificial intelligence (AI) has entered the mainstream. The technology harnesses the power of machine learning by training software to study patterns in data and using the acquired knowledge to generate new content in response to human input. The technology has led to exciting possibilities for enhancing creativity, from generating logos and social media postings to producing new software, poetry, musical compositions, and artwork. But it has not done so without controversy.

Earlier this year, stock-image aggregator Getty Images and a trio of artists separately sued London start-up Stability AI for copyright infringement based on its "Stable Diffusion" technology. 1 Stable Diffusion is an open-source AI art generator that creates photorealistic images in response to users' text prompts. 2 It does so by "training" itself to identify relationships between objects and text using billions of digital images scraped from the web. In a sense, this process is similar to a seasoned artist drawing on decades of experience to create customized paintings. The difference is that Stable Diffusion's training is carried out far more quickly and on a far larger scale.

The problem, according to the plaintiffs, is that Stability AI does not license any of the images it uses to train the AI, many of which are protected by copyright. The plaintiff artists insist that some of the generated images are too similar in appearance to the copyrighted images on which the AI is trained.

[Page 410]

Stability AI (and its co-defendants) have sought to dismiss the lawsuits. But, if the cases are allowed to proceed, courts will have to address novel questions of copyright law, including whether the enigmatic doctrine of "fair use" applies to AI technology. They will have to do so, moreover, against a recently altered legal landscape, thanks to the Supreme Court's Warhol ruling issued earlier this year. There, the Court held that the commercial licensing of Andy Warhol's Orange Prince silkscreen portrait for use on a magazine cover did not constitute transformative use of a copyrighted photo of music artist Prince. The Court reached this conclusion even though the portrait imbued the original photo with new expression, purportedly to convey the dehumanizing nature of celebrity. In so holding, the Court appeared to cabin transformative uses to those that do not serve substantially the same purpose as the originals. 3

As explained below, Warhol may have a significant effect on copyright liability in the generative-AI context, and more immediately on the copyright suits involving Stability AI.

The Stable Diffusion Technology: A "21st-century Collage Tool" or a Sophisticated System for Creating New Works of Art from Scratch?

The plaintiff artists describe Stable Diffusion as a "21st-century collage tool" analogous to an Internet search engine that "looks up" a user's query in a "massive database" and patches together bits and pieces of digital images that hit on the query. 4 But that analogy has drawn criticism as overly simplistic and inaccurate. In reality, Stable Diffusion uses machine learning to create images from scratch.

At a high level, the image-generation process boils down to three steps. 5 The first is the "input step" in which the system amasses as much source data as possible to train the AI. Stable Diffusion gets its training data from a collection of five billion publicly available images (and their accompanying captions, annotations, and metadata) scraped from the web and indexed by German nonprofit Large-Scale Artificial Intelligence Open Network (LAION). The plaintiffs allege that these training image-text pairings are ingested by Stable Diffusion and loaded into computer memory, but it is unclear whether copies of the images are made at this stage (and, if so, where they are saved). What is clear, though, is that the training images are not stored in the Stable Diffusion software itself, [Page 411] which is too small to contain copies (even compressed copies) of such images.
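The size argument can be checked with rough arithmetic. The sketch below is illustrative only: the image count is the article's five billion, but the model size (about 4 GB) and the size of a compressed copy (10 KB) are assumptions chosen for the example, not figures from the article or the court filings.

```python
# Back-of-the-envelope check: could a model file hold copies of its training images?
# NUM_TRAINING_IMAGES is the article's figure (about five billion).
# MODEL_SIZE_BYTES and COMPRESSED_COPY_BYTES are illustrative assumptions.

NUM_TRAINING_IMAGES = 5_000_000_000
MODEL_SIZE_BYTES = 4 * 10**9        # assumed: a model file of roughly 4 GB
COMPRESSED_COPY_BYTES = 10_000      # assumed: a heavily compressed 10 KB image


def bytes_per_image(model_size_bytes: int, num_images: int) -> float:
    """Average storage available per training image if every image were stored."""
    return model_size_bytes / num_images


def max_images_storable(model_size_bytes: int, bytes_per_copy: int) -> int:
    """How many compressed copies would fit if the file held nothing else."""
    return model_size_bytes // bytes_per_copy


budget = bytes_per_image(MODEL_SIZE_BYTES, NUM_TRAINING_IMAGES)
capacity = max_images_storable(MODEL_SIZE_BYTES, COMPRESSED_COPY_BYTES)
share = capacity / NUM_TRAINING_IMAGES

print(f"Storage budget per training image: {budget} bytes")
print(f"Compressed copies that would fit: {capacity:,}")
print(f"Share of the training set: {share:.5%}")
```

Under these assumed numbers, the budget comes to less than one byte per training image, and even a file made up of nothing but 10 KB copies would cover only a tiny fraction of the training set. Different assumptions change the figures but not the order of magnitude.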

The second step is the "training step." The system first encodes the training images and accompanying text by converting them into mathematical representations called "latent" representations. One can think of these representations as giant arrays of numbers, with the values and positions of the numbers defining various aspects of the images. These encoded images take up less space than the originals and result in faster processing. Then, having encoded the image-text pairs, the system iteratively tweaks the data by gradually adding "noise" until the images become distorted and are no longer recognizable (a process known as "diffusion"). 6 Once a set of diffused images is created, the system then reverses this process, iteratively learning to remove the noise until the original images are reconstructed and matched to their corresponding text. At first, the system does a poor job of this. But by repeating the process many times on billions of images, the AI learns how to identify specific objects, lines, colors, shades, and other qualities from random noise—or, stated differently, to create an image containing such attributes.
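The noise-adding half of the training step can be illustrated with a toy example. This is a minimal sketch, not Stability AI's code: the four-number "latent," the noise level `beta`, and the function names are all hypothetical, and the update rule is a simplified form of the kind commonly used in diffusion models. It shows only how repeated small doses of random noise swamp the original values; the learned denoising network is not modeled here.

```python
import math
import random


def add_noise(latent, beta, rng):
    """One forward-diffusion step: shrink the signal slightly and mix in Gaussian noise."""
    keep = math.sqrt(1.0 - beta)   # fraction of the current values that survives
    mix = math.sqrt(beta)          # weight given to fresh random noise
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noised = [keep * x + mix * n for x, n in zip(latent, noise)]
    return noised, noise


def diffuse(latent, steps, beta, rng):
    """Apply the noising step repeatedly, as in the 'diffusion' described above."""
    current = list(latent)
    for _ in range(steps):
        current, _noise = add_noise(current, beta, rng)
    return current


def remaining_signal(beta, steps):
    """Weight of the original values left after `steps` noising steps."""
    return (1.0 - beta) ** (steps / 2)


toy_latent = [0.9, -0.4, 0.1, 0.7]   # hypothetical stand-in for an encoded image
noised = diffuse(toy_latent, steps=500, beta=0.02, rng=random.Random(0))

print("Original latent:", toy_latent)
print("After diffusion:", [round(x, 3) for x in noised])
print("Weight of original signal remaining:", remaining_signal(0.02, 500))
```

In training, a model would be shown noised values like these and rewarded for estimating the noise that `add_noise` mixed in, which is what allows it to run the process in reverse.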

The final step is the "output step." When a user enters a text prompt, the system converts the text into a latent representation. Then, beginning with a latent representation of a random diffused image, the system uses its training to determine how to successively remove noise so as to reveal an image "conditioned" by the user's prompt; that is, containing the attributes referenced in the prompt (much like a sculptor chiseling away at a block of marble to create a statue matching a customer's request). The resulting latent representation is then converted to a visual image that is displayed to the user, within seconds of the user inputting her prompt. For example, the text prompt "water color painting of contemplative dog wearing a hat and eating ice cream in the rain" produced the images in Figure 1.
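The output step can be sketched in the same toy setting. Everything here is a hypothetical stand-in: `encode_prompt` is a crude word-counting scheme rather than a real text encoder, and `toy_denoiser` simply measures the gap between the current values and the prompt vector, whereas a real system uses a trained neural network and then decodes the result into pixels. The sketch shows only the shape of the loop: start from random noise and repeatedly remove the "noise" that the prompt-conditioned model predicts.

```python
import random

DIM = 8  # hypothetical size of the toy latent representation


def encode_prompt(prompt, dim=DIM):
    """Toy text encoder: bucket each word by its character codes and count."""
    vec = [0.0] * dim
    words = prompt.lower().split()
    for word in words:
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return [v / max(len(words), 1) for v in vec]


def toy_denoiser(latent, prompt_vec):
    """Stand-in for the trained network: treat whatever separates the
    latent from the prompt vector as the noise to be removed."""
    return [x - p for x, p in zip(latent, prompt_vec)]


def generate(prompt, steps=50, step_size=0.2, seed=0):
    """Start from random noise and strip away predicted noise step by step."""
    rng = random.Random(seed)
    prompt_vec = encode_prompt(prompt)
    latent = [rng.gauss(0.0, 1.0) for _ in prompt_vec]
    for _ in range(steps):
        predicted_noise = toy_denoiser(latent, prompt_vec)
        latent = [x - step_size * n for x, n in zip(latent, predicted_noise)]
    return latent


target = encode_prompt("dog wearing a hat")
image = generate("dog wearing a hat")

print("Prompt vector: ", [round(v, 3) for v in target])
print("Generated toy latent:", [round(v, 3) for v in image])
```

Because the toy denoiser is so simple, the result collapses onto the prompt vector itself. A trained model instead produces a new latent that merely satisfies the prompt, which is why different random starting points yield different images for the same text.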

Importantly, the system is designed so that no particular training image has an undue influence on the AI. In one study, researchers intentionally tried to recreate particular images and were successful just 0.03% of the time. 7 Nevertheless, users can ask Stable Diffusion to generate an image "in the style" of a particular artist or work, which will cause the system to emulate features derived from that artist's work(s). 8 Examples of this type of output are shown in Figure 2.

[Page 412]

Figure 1 Stable Diffusion Output Using the Prompt "water color painting of contemplative dog wearing a hat and eating ice cream in the rain"

Figure 2 Stable Diffusion Output Using a Prompt Similar to That Shown in Figure 1 "in the style" of Different Artists and Works

[Page 413]

Copyright Infringement and Fair Use as Applied to Stable Diffusion

The purpose of copyright law is to encourage creators to create so as to "enrich[] the general public through access to creative works." 9 The law achieves this purpose by striking a balance between "rewarding authors' creations while also enabling others to build on that work." 10

To that end, copyright law allows creators to block others from reproducing and preparing variations of their copyrighted works (called "derivative works") that are "substantially similar." But it also limits creators' rights to control their works. For example, general ideas such as motifs, genres, stereotypes, and tropes are not protectable; only specific expressions and arrangements of those ideas can be copyrighted. 11 Further, the doctrine of "fair use" excuses certain types of copying that would otherwise constitute infringement. In effect, the doctrine provides "breathing space" for certain otherwise infringing uses such as criticism, commentary, news reporting, teaching, scholarship, and the like. 12 To determine whether a particular use qualifies as fair use, courts assess the following list of nonexhaustive factors:


1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
2. the nature of the copyrighted work;
3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
4. the effect of the use upon the potential market for or value of the copyrighted work. 13

In weighing these factors, courts "must keep in mind the public policy" underlying copyright law and determine whether permitting the use to continue is likely to "stimulate artistic creativity for the general public good." 14

Does Stable Diffusion Infringe Copyrighted Training Images?

While much attention has been given to whether Stable Diffusion's use of training images qualifies as fair use, the antecedent [Page 414] question is whether such use infringes in the first place. There are three potential acts of infringement that could be relevant:


1. Ingesting copyrighted images for subsequent use in training the AI (input step);
2. Making intermediate copies of the images during training (training step); and
3. Generating new images based on the training and users' text prompts (output step).

With regard to the input step, training images appear to be reproduced when they are ingested by Stable Diffusion so that latent representations can be created. And presumably, those copies are retained by the system to permit it to assess the efficacy of the model. If reproductions are in fact made at this stage, the copying would likely qualify as an act of infringement. But the details surrounding the copies, such as what they look like and where and how long they are stored, remain a mystery. These details will likely come to light during discovery (assuming the...
