The Legal Challenges of Generative AI-Part 1

Publication year: 2023
Pages: 40
Vol. 52, No. 6 [Page 40]
Kansas Bar Journal
July, 2023

Skynet and HAL Walk Into a Courtroom

BY COLIN E. MORIARTY

This is the first in a multi-article series discussing the legal implications of using computer programs that mimic human creativity. This article describes how the current generative AI technology works, examines potential legal challenges under the Copyright Act, and introduces questions to consider as this technology develops.

Before lawyers or businesses attempt to reliably make use of generative AI, they should know how it works, at least at a high level, and have some foundation in the law that addresses how generative AI operates. This article provides a high-level overview of the current technology, walks through how the current law might apply to the way generative AI is built, and poses questions about whether and how the law may adapt to accommodate this rapidly changing technology. Future articles in this series will delve into the potential limitations, risks, and legal issues specific to end users, including attorneys. These issues are being presented in multiple segments partly due to the size of the topic, but also because technology is changing so rapidly that it is likely new developments will require further comment.

Introduction to Generative Artificial Intelligence

Intelligent robots in popular culture have been portrayed as rigid and emotionless. Although computers showed superhuman skills in mathematical tasks, authors blithely proclaimed that humans still had the edge in creativity and that emotion and creative invention were beyond the reach of mere computer code. Yet, humans have been using the circuit board as a canvas for decades. We have created digital art, used word processors to create novels and legal briefs, used computer graphics for our movies, and used software to touch up our photos. We have even created entirely new forms of art, like video games, that depend entirely on computers for their existence. Our species has been using complicated arrangements of silicon for creative purposes practically since the moment we first shot lightning through them. In retrospect, then, it should not be all that surprising that computers are more capable of creative expression than supposed.

Programmers have not determined the precise algorithm for creative thought. To date, no one has been able to program the explicit instructions for a computer program that generates Expressionist paintings on command or writes a legal brief at the level of a human. Strangely, we may have created it anyway. Largely out of the public eye, researchers have sidestepped the design problem by inventing ways to train software to solve problems without needing to figure out the solution in advance. Prior thoughts about the limits of artificial intelligence may have really been about a human being's inability to articulate instructions and not actual limits on what computers can really do.

The public only saw glimmers of the progress being made in the last few decades when a program was able to play chess or win at Jeopardy.[1] As of late 2022, however, artificial intelligence that can generate creative works that appear to have been created by a human (generative AI) has clearly seized the public's attention. Stable Diffusion, an open-source generative AI program that can turn text inputs into art or photographs, was released in August.[2] ChatGPT, a type of AI that can respond to text prompts in an uncannily human manner, was released to the public in November.[3] Both have been the subject of intense interest.

One jaw after another has dropped as people realize how far the technology has come.[4] Despite the limitations of the current software and warnings from major tech CEOs, politicians, and researchers alike about how disruptive and potentially dangerous artificial intelligence is becoming,[5] the business world is now in a race to develop a technology that promises to change how white-collar work is done.[6] Surveys report that some businesses are already replacing workers with AI despite warnings that ChatGPT in particular is unreliable and shouldn't be trusted for "anything important."[7] Perhaps unlike other recently hyped technologies like virtual reality, NFTs, or blockchain, generative AI appears to be game-changing.

Any task that requires producing written or other creative work is potentially affected by generative AI. Google and Microsoft are now racing to implement the technology in their office suites.[8] New companies are selling or using AI in tools such as virtual personal assistants, writing guides, candidate screening, or customer service.[9] But, as with any disruptive new technology or automation, some will be harmed by the resulting changes in supply and demand.[10] Some artists, programmers, writers, and yes, lawyers, may be worried by the availability of technology that can do at least a superficially good job of replicating their work product at a lower price.

The law now must grapple with the questions raised by widespread use of software that can produce human-seeming creative works on demand. Copyright law, the major area of law that protects creative works in the United States, does not currently have clear answers for how generative AI may be trained or used, whether works created using generative AI have copyright protection, or many other questions. Copyright is a flexible area of law and can evolve with new technology. Still, it remains to be seen whether and how the law can adapt to the challenges of generative AI and what legal framework will best integrate this new technology into society.

Lawyers have a special interest in generative AI because it seems capable of performing or assisting with many of the mechanical aspects of law practice, such as document review, legal research, legal writing, and blogging.

LexisNexis, one of the leading legal research services, has announced generative AI functionality allowing lawyers to ask for simple legal briefs, letters, and other written material with citations.[11] According to Chief Product Officer Jeff Pfeifer, Lexis anticipates a commercial release of its AI-powered search, writing, and research capabilities within LexisNexis in late summer 2023.[12] In the meantime, he noted that the "[t]hing that scares me most is the amount of experimentation I'm seeing without understanding the technical infrastructure of the model setup."[13] Even if the tools are not powerful enough to replace associate attorneys just yet, some law firms are already experimenting with incorporating generative AI into their practices.[14]

Overview of the Technical Details of Generative AI

Understanding the capabilities and limitations of generative AI starts with knowing how the technology works, at least at a very high level. Generative AI programs are generally trained by presenting a neural network[15] with a large body of preexisting data and engaging in some form of repetitive machine learning to encourage the network to develop relationships between text input and particular output that fits the training data. There are different techniques within the umbrella of machine learning, but in general this entails seeing how well the existing model does and then adjusting the neural network in a direction that would have produced a better result. This process is then iterated a mind-bogglingly large number of times until the network gradually produces results that closely match the training data.[16] The resulting trained network is called a "model."
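For readers who find a concrete example helpful, the measure-then-adjust cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any commercial model is actually built: the "network" here is a single adjustable number, the training data is a made-up set of four examples, and the function names, learning rate, and step count are all hypothetical choices made for this sketch.

```python
# Toy illustration only: a one-parameter "model" y = w * x, trained by
# repeatedly measuring its error and nudging w toward a better result.
# Real generative AI models adjust billions of parameters, not one.


def mean_squared_error(w, examples):
    """Average squared gap between the model's guesses and the training data."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)


def train(examples, learning_rate=0.01, steps=2000):
    """Iteratively adjust w so the model's output fits the examples."""
    w = 0.0  # start from an uninformed guess
    for _ in range(steps):
        # See how the current model does: slope of the error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        # Adjust in the direction that would have produced a better result.
        w -= learning_rate * gradient
    return w


# Hypothetical training data that follows the hidden rule y = 3x.
EXAMPLES = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]

if __name__ == "__main__":
    learned = train(EXAMPLES)
    print(f"learned weight: {learned:.3f}")
    print(f"remaining error: {mean_squared_error(learned, EXAMPLES):.6f}")
```

No one tells the program that the rule is "multiply by three." It arrives there only because each small adjustment reduces the gap between its output and the training data, which is the sense in which the design problem is sidestepped.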

Current generative AI models require massive dumps of training data, largely collected from public sources on the Internet, in order to use this training process. The kind of data used depends on the purpose of the model being trained. Generative AI models that read and write human language are called large language models (LLMs) and are generally trained on large text databases curated to ensure the program gets a good sample of the kind of writing it is intended to simulate. ChatGPT, an LLM that seeks to emulate a virtual assistant, was trained on 570 gigabytes of text obtained from "books, webtexts, Wikipedia, articles, and other pieces of writing on the internet."[17] GitHub Copilot, a computer code model, was trained on "natural language text and source code from publicly available sources, including code in public repositories on GitHub."[18] The training data used for other LLMs are being kept secret by their creators, but probably also consist of enormous text dumps of similar writings.[19] Generative AI programs dealing with images, video, or music are being trained on large datasets that are paired with text descriptions. Image-generating programs are trained using large sets of images together with text descriptions, such as LAION-5B or CIFAR-10, which draw on images scraped from the World Wide Web.[20] Video-generating programs are trained by using sets of narrated videos, again matching video data with text.[21] Music-generating programs can be trained on large datasets of music files.[22]

After initial training, models are often put through a period of fine-tuning by using adversarial training or other techniques to warp the model in the direction of the desired final behavior. An LLM, for example, is first trained to accurately predict the next word[23] by testing it based on its training data. Then, it may be taken through a period of human-assisted reinforcement learning involving a human subject ranking their satisfaction with the output of the LLM[24] or further training on a specific kind of writing to encourage the model to generate text of that kind.[25] The LLM ends up with a remarkable ability to predict the next word based on what it saw in its training data and what the human-based feedback preferred.[26] Similar kinds of fine-tuning can be done for other kinds of...
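The idea of predicting the next word from training data can be illustrated with a deliberately simplified sketch. An LLM does this with a neural network trained on an enormous body of text; the Python below substitutes a much cruder technique, simply counting which word most often follows another in a single made-up sentence. The function names and the sample sentence are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical miniature "training data"; real models see vastly more text.
TRAINING_TEXT = "the court granted the motion and the court denied the appeal"

if __name__ == "__main__":
    model = train_bigram_model(TRAINING_TEXT)
    print(predict_next(model, "the"))
```

Even this crude counter shows why the content of the training data matters: the program can only continue a phrase in ways it has already seen, and it has nothing to offer for a word that never appeared with a follower in its training text.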
