Let's Talk, ChatGPT: What Will the Judiciary's Future Look Like?

Author: Klingensmith, Mark W.

In a 2016 article for The Florida Bar Journal, (1) I offered the following question for discussion: Could computers eventually diminish, or even eliminate, the need for human judges? At the time, I foresaw no danger that trial court judges would become obsolete but predicted that technological progress in artificial intelligence (AI), such as IBM's Watson program, posed the greatest danger to the existence of appellate court judges.

Appellate issues are presented to the courts by written submission, usually through briefs or motions, identifying the specific issues on appeal. These issues are phrased in a manner to allow appellate judges to analyze them according to an established body of law. The relevant underlying facts have been "found" by the lower tribunal; the appellate court considers those established facts according to the applicable law. Or, the court is asked to interpret the meaning of words or phrases in a law to properly apply to a given set of facts. Under either scenario, a computer program like [IBM's] Watson could be programmed to provide answers to such questions submitted to it.

Well, move over Watson. Make room for ChatGPT, an AI language model produced by OpenAI that was released in late 2022 and is free for users. (2) A judge in Colombia even used ChatGPT to assist in drafting a court ruling in what was apparently the first public acknowledgement by a judge anywhere that an AI text generator had helped craft a legal decision. (3) The judge included the chatbot's full responses in the decision, along with his own insights into the applicable legal precedents, using AI only to "extend the arguments of the adopted decision." (4) Although the judge indicated the AI was "mostly used to speed up drafting the decision and that its responses were fact-checked, it's likely a sign that more legal and judicial uses of AI are on the way." (5)

The New York Times called ChatGPT "quite simply, the best artificial intelligence chatbot ever released to the general public." (6) The program, also known as Generative Pre-trained Transformer 3 (GPT-3), (7) generates sophisticated, human-like responses based on requests from users and derived from mountains of data, including a staggering amount of online published text. (8) To understand how it works, consider an example: given the phrase "I drove to--" and allowed to complete the sentence, the model might predict that the next words are "the store" with 5% probability, or "work" with 4% probability, etc. The model can then repeatedly predict subsequent words to add to those that have already been predicted (such as inserting an "and") to compose indefinitely long bodies of text, each sentence building upon the previous one. In March 2023, OpenAI released a new and improved version of the technology that powers its online chatbot, based on the GPT-4 platform. (9)
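For readers who want to see the word-by-word prediction loop in concrete form, the following is a minimal sketch in Python, not a description of how ChatGPT is actually built. Only the "I drove to" prompt and the 5% and 4% figures come from the example above; the table `TOY_MODEL`, the functions `predict_next` and `generate`, and every other entry are hypothetical and invented for illustration. A real language model also weighs all of the preceding text rather than only the most recent piece, which this toy version simplifies away.

```python
import random

# Hypothetical toy "model": maps the most recent piece of text to candidate
# continuations with relative weights. Only the 0.05 and 0.04 figures come
# from the article's example; everything else is invented for illustration.
TOY_MODEL = {
    "I drove to": [("the store", 0.05), ("work", 0.04)],
    "the store": [("and", 1.0)],
    "work": [("and", 1.0)],
    "and": [("then went home", 1.0)],
}


def predict_next(last_piece, rng):
    """Pick one continuation at random, in proportion to its weight."""
    candidates = TOY_MODEL.get(last_piece)
    if not candidates:
        return None  # the toy table has nothing to say; stop generating
    pieces = [piece for piece, _ in candidates]
    weights = [weight for _, weight in candidates]
    # random.choices treats the weights as relative, so they need not sum to 1.
    return rng.choices(pieces, weights=weights, k=1)[0]


def generate(prompt, max_steps=5, seed=None):
    """Repeatedly append a predicted continuation to the text so far."""
    rng = random.Random(seed)
    text = [prompt]
    last = prompt
    for _ in range(max_steps):
        nxt = predict_next(last, rng)
        if nxt is None:
            break
        text.append(nxt)
        last = nxt
    return " ".join(text)


if __name__ == "__main__":
    print(generate("I drove to", seed=0))
```

Each pass through the loop adds one predicted piece to the text already produced, which is the sense in which each part of the output builds on what came before. Because the choice is random, running the sketch with different seeds can complete the sentence with either "the store" or "work".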

ChatGPT was programmed and trained on a general-purpose body of language using a dataset of text called the Common Crawl corpus, which is a collection of web pages and documents from across the internet. (10) It was optimized for general-purpose dialogue, not to provide answers in specific technical areas. It can complete a wide range of tasks, ranging from trivial matters like joke creation to writing essays about the symbolism found in literary works. (11) Since it was made publicly available in November 2022, ChatGPT has been used to generate not only original essays, but also stories and song lyrics in response to user requests.

However, ChatGPT has been shown to perform surprisingly well in responding to questions posed in certain technical areas such as computer programming, (12) data manipulation, (13) and medical diagnosis. (14) It has drafted research paper abstracts that have fooled some scientists. (15) It also performed "comfortably within the passing range" of the U.S. medical licensing exam. (16) Corporate CEOs have even used it to write emails or do accounting work. (17) At least one law firm is said to be putting ChatGPT to work in drafting its press releases. (18)

But how does it perform in non-technical fields like law and business? The answer may surprise you.

According to a study at the University of Pennsylvania, ChatGPT fared well on a business management exam at the Wharton Business School where it earned a B to B- grade. (19) In a paper detailing its performance, Prof. Christian Terwiesch said ChatGPT did "an amazing job" at answering basic operations management and process-analysis questions. (20) Unfortunately, it struggled with more advanced questions and made "surprising mistakes" with basic math that "can be massive in magnitude." (21)

Elsewhere, at the University of Minnesota Law School, professors used ChatGPT to assess its performance on law exams in four courses, then graded the tests blindly. (22) After completing 95 multiple choice questions and 12 essay questions, the bot scored "on average at the level of a C+ student, achieving a low but passing grade in all four courses." (23) The bot's performance may not be the sort one would be overly proud of, but it was nonetheless sufficient to earn college credit. (24) For someone who doesn't bother to attend class or read even a single chapter in the textbook, a C+ grade might be entirely acceptable. With some degree of human input and editing to the final product, it is not inconceivable that this C+ could possibly have been bumped up to an even higher grade--perhaps a B, or even a B+.

While ChatGPT using the GPT-3 technology proved to be adept at answering law school essay questions, it came up short on the multi-state multiple-choice portion of the bar exam, (25) but did pass both the evidence and tort sections. (26) However, according to OpenAI, the new GPT-4 is able to score a 1,300 (out of 1,600) on the SAT and can now perform among the top 10% of students on the Uniform Bar Examination. (27) In a demonstration of its capabilities handling legal questions, OpenAI's President and Co-Founder Greg Brockman gave the new bot a bar exam question containing several paragraphs to analyze, and the initial answer it provided was correct but "filled with legalese." (28) When asked to explain its answer in plain English so a layperson could understand, the bot was able to do that as well. (29)

One adjunct law professor in Texas recently tested ChatGPT's abilities in the legal academic realm by giving it exercises and quizzes he designed for his law and computer science graduate students. (30) For one question, he asked ChatGPT about the differences between "electronic discovery and computer forensics" and compared the response to his own. (31) What he got back from ChatGPT was a surprisingly good answer that not only adequately addressed the topic but even mentioned its application in another field:

My response in class focused on the relative accessibility and intelligibility of the [electronically stored information] we deal with in e-discovery versus digital forensics, and I didn't tie forensics to criminal investigations because so much of my work in the field has concentrated on civil cases. But I can't say I did any better than the AI. (32)

So, if ChatGPT can perform well enough to be an average law school student or an average law school professor, could it also perform as well as an "average" lawyer by providing accurate legal answers to questions at only a fraction of the cost? Not quite yet.

In another test, ChatGPT was asked to write a blog post for lawyers about undue influence claims in California trust and estate...
