Prescriptions and Descriptions: AI and the Language of Law

There's Something Afoot in the 11th Circuit...

Sep 25, 2024

The American public largely holds wild misconceptions about lawyers. I blame popular culture.1

Suits' on Netflix': Why this old show is so popular now — “Okay, fine, I’m not really a lawyer. But I did stay at a Holiday Inn Express last night.”

Consider my mother as an example. She was a fan of Suits (2011-2019), a television show set in a fictional corporate law firm in New York City. Since I also run a corporate law firm—though of a different sort—it’s reasonable that she would (and still does?) ask me questions about my career through the prism of the show.

Our conversations would typically unfold like this:

Mom: Be honest—have you ever [insert whatever outrageous thing the lawyers did on Suits that particular week]?

Me: No, absolutely not. In fact, that sounds unethical, possibly illegal. Did they really do that?!

Mom: Oh, you know what I’m talking about.

Me: I really don’t. You know that I pretty much just read, write, and talk on the phone all day, right?

Mom: [awkward pause] So, how are my grandchildren?

And so, our conversations often fall flat. And yet, for the many inaccuracies perpetuated by pop culture, a few of the lawyer stereotypes ring true—particularly when it comes to our obsession with words.

Words Get In The Way2

Steve Martin put it best: "Some people have a way with words, and other people...oh, uh, not have way."

Lawyers have that way.

“Those French have a different word for everything!”

Lawyers love words like ~~Michelangelo loved marble~~ ~~influencers love selfies~~ ~~my therapist loves uncovering childhood trauma~~… You get the idea. Super into words, lawyers are. Bigly. Consider this Oliver Wendell Holmes quote:

“A word is not a crystal, transparent and unchanged; it is the skin of a living thought and may vary greatly in color and content according to the circumstances and the time in which it is used.”

I mean, come on. Who but a jurist could write something like that?

And yet (of course), he’s not wrong.

Oliver Wendell Holmes Jr. - Quotes, Common Law & Education — They don’t teach mustaches like that in law school

Early on in my legal career I litigated a case that hinged on the plain meaning of the phrase “based on.” The amount of debate over those two simple words was staggering. Both plaintiffs and defendants cited dictionaries, films, literature—even television shows—to argue their respective interpretations.

This was not some academic exercise. Our goal was not to demonstrate how well-read we were, but rather to shape meaning. And for that sort of exercise, dictionaries3 are often just a starting point. There can be a chasm between a dictionary definition and how folks use language in common parlance. And when debating plain meaning, that gap is everything.

One of these things is not like the other

Not sold yet? Consider the word “sigma.” According to Webster’s, it is the 18th letter of the Greek alphabet. However, my 11-year-old son tells me it also means “a cool dude,” a definition he likely picked up from a friend, who probably heard his older sibling use the phrase, which they in turn almost certainly heard on a TikTok or Reel, and so on.4

Language is fluid. Words change.

Ordinary Meaning Through Extraordinary Means

There’s something tremendously exciting happening in the 11th Circuit, and not enough people are talking about it.5

There’s a judge: Judge Newsom.6 Back in May, in a concurring opinion in Snell v. United Specialty Insurance Company, Judge Newsom analyzed whether the term "landscaping" could encompass installing a trampoline for insurance purposes. As the term wasn’t defined in the policy, the case hinged on the "common everyday meaning of the word." Judge Newsom offered a provocative suggestion: Could large language models (LLMs) like OpenAI’s ChatGPT inform legal interpretation?

Is Trampoline Job Landscape Work? AI Says Yes. 11th Circuit Says It's Not Covered — The trampoline at issue. Feels kinda landscape-y, right?!

In his opinion, Judge Newsom carefully analyzed dictionary definitions but found them lacking. He then turned to ChatGPT, whose response he found surprisingly sensible and aligned with his own experience of how people might use the word “landscaping” more broadly than traditional dictionaries suggested. Judge Newsom even cross-checked the results with Google Bard, yielding similar findings.

He went further, proposing that LLMs, with their vast language training, could assist in understanding how ordinary people use language in context—perhaps even better than traditional dictionaries. LLMs, Judge Newsom argued, are accessible, capable of understanding context, and could democratize legal interpretation by providing a widely available, inexpensive research tool.

While he acknowledged the risks—hallucinations, potential manipulation, and concerns about dystopia—he concluded that it’s no longer far-fetched to think that LLMs could offer valuable insights into the common everyday meaning of words. “At the very least,” he wrote, “it no longer strikes me as ridiculous to think that an LLM like ChatGPT might have something useful to say about the common, everyday meaning of the words and phrases used in legal texts.”

Into the Glass Elevator

Great Glass Elevator | Charlie and the Chocolate Factory Wiki | Fandom — A photo of Judge Newsom at work

Turns out that Judge Newsom was just warming up. Just a few weeks ago, in another concurrence in USA v. DeLeon, he explored a question that arose from his use of LLMs: What should we make of the fact that LLMs sometimes give subtly different answers to the same question?

Initially spooked by this variation, Judge Newsom came to appreciate it, viewing the slight differences as a reflection of real-world language usage. He likened it to how real people might respond if asked the meaning of a phrase like "physically restrained." Responses would likely differ around the margins but reveal a common core. In Judge Newsom’s view, this commonality reflects ordinary meaning, with the variation capturing the organic and sometimes messy nature of language.

Judge Newsom argued that the fact LLMs produce slightly different responses underscores their potential value in legal interpretation. They reflect how real people use language in real contexts—an important tool for judges seeking to discern ordinary meaning. And, just like in Snell, he goes out with a bang. Dare I say, a bigger bang:

“Remember, our aim is to discern ‘ordinary meaning.’ Presumably, the ideal gauge of a word's or phrase's ordinary meaning would be a broad-based survey of every living speaker of American English—totally unrealistic, but great if you could pull it off. Imagine how that experiment would go: If you walked out onto the street and asked all umpteen million subjects, ‘What is the ordinary meaning of “physically restrained”?,’ I think I can confidently guarantee that you would not get the exact same answer spit back at you verbatim over and over and over. Instead, you'd likely get a variety of responses that differed around the margins but that, when considered en masse, revealed a common core. And that common core, to my way of thinking, is the ordinary meaning.”

Think that was tasty? It gets even better:

“The fact is, language is an organic thing, and like most organic things, it can be a little messy. So too, unsurprisingly, are our efforts to capture its ordinary meaning. Because LLMs are trained on actual individuals’ uses of language in the real world, it makes sense that their outputs would likewise be less than perfectly determinate—in my experience, a little (but just a little) fuzzy around the edges. What's important, though—and I think encouraging—is that amidst the peripheral uncertainty, the LLMs’ responses to my repeated queries reliably revealed what I've called a common core.
Now, I'm not at all sure that we've yet come up with the perfect method for tapping into the models’ seeming ability to quantify and communicate that core. I'm not even sure that existing LLMs (warts and all) can realistically accomplish that goal. But it does strike me that some marginal uncertainty is inherent in the assessment of ordinary meaning, and I (now) view the fact that LLMs appear to reflect that uncertainty as a virtue rather than a vice.”
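For the tinkerers in the audience, here is one very rough way to picture the "common core" idea in code. This is a minimal sketch and emphatically not Judge Newsom's method: the three sample answers are placeholders I invented (they are not real LLM output), the stopword list is arbitrary, and the helper name `common_core` and its `min_share` parameter are my own assumptions. The sketch simply treats the words that show up in every answer as the core and everything else as the margins.

```python
import re
from collections import Counter

# Invented placeholder answers, NOT actual LLM output. In a real experiment
# these would come from asking a model the same question several times.
SAMPLE_RESPONSES = [
    "Being physically restrained means your movement is limited by force "
    "or by a physical barrier.",
    "A person is physically restrained when their movement is restricted "
    "through bodily force or binding.",
    "Physically restrained describes someone whose freedom of movement is "
    "limited by force, such as being tied or held.",
]

# A tiny, arbitrary stopword list (an assumption; swap in your own).
STOPWORDS = {"a", "an", "the", "is", "by", "or", "of", "as", "such",
             "when", "your", "their"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def common_core(responses: list[str], min_share: float = 1.0) -> set[str]:
    """Return the words used in at least `min_share` of the responses."""
    counts = Counter()
    for response in responses:
        # Each response contributes at most one count per word.
        counts.update(tokenize(response))
    cutoff = min_share * len(responses)
    return {word for word, n in counts.items() if n >= cutoff}


if __name__ == "__main__":
    core = common_core(SAMPLE_RESPONSES)
    print(sorted(core))  # ['force', 'movement', 'physically', 'restrained']
```

Lowering `min_share` (say, to 0.66) lets words that appear in most but not all answers into the core, which is one crude way to decide how fuzzy the edges are allowed to be. Counting words is obviously a far blunter instrument than reading the answers, so treat this as a picture of the idea and nothing more.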

The Democratization of Language

This brings us to the crossroads of this piece. A kind of intellectual fork in the road, if you will, where you (yes, you, the reader) decide whether you’re in or out.

What Did Robert Johnson Encounter at the Crossroads? (Part 1 of 2) — This is you!

Let’s agree that:

dictionaries are a kind of snapshot of a word at a point in time; and
LLMs are a kind of mirror reflecting a word’s real-time usage.

Dictionary definitions v. LLM definitions

The dictionary’s competitive advantage must be that it is based upon a sounder ideology than the sort of fluid democracy inherent in LLMs, right? Sure, Newsom’s point with respect to LLM common cores and their inherent variability is intellectually appealing and all. But we believe in dictionaries the way that we believe in ~~our political leaders~~ ~~network news anchors~~ ~~The New York Times~~… You know what I mean. Dictionaries have people manning the shop in the way that LLMs do not, and we like people manning shops. Right? Right?!?!

David Foster Wallace, Mary Karr, & the Tyranny of Genius - The Atlantic — This gentleman is about to question the shopkeeper

As it turns out, the answer to that question is something like: Kind of.

In the April ’01 issue of Harper’s, David Foster Wallace (DFW) wrote a (terrifyingly prescient) article entitled “Tense Present: Democracy, English, and the Wars over Usage.” The article is about… well, it’s about a lot of things (it’s DFW, after all). But its thematic center is DFW’s analysis of Bryan A. Garner’s oft-cited A Dictionary of Modern American Usage (ADMAU).

Garner wrote ADMAU himself. No editorial staff. No nothing. All Garner. And that’s really, really important:

“We regular citizens tend to go to The Dictionary for authoritative guidance. Rarely however, do we ask ourselves who decides what gets in The Dictionary or what words or spellings or pronunciations get deemed ‘substandard’ or ‘incorrect.’ Whence the authority of dictionary-makers to decide what’s OK and what isn’t? Nobody elected them, after all. And simply appealing to precedent or tradition won’t work, because what’s considered correct changes over time.”

He then echoes Holmes (or Newsom, depending on how you’re tracking all of this):

“English itself changes over time; if it didn’t, we’d all still be talking like Chaucer. Who’s to say which changes are natural and which are corruptions? And when Bryan Garner or E. Ward Gilman do in fact presume to say, why should we believe them?”

Cue the sizzle reel:

“[W]hat Garner’s authority comes down to – what it really, truly means – is something like, ‘Trust me.’ It’s the boldest, most ambitious, and also most distinctively American of rhetorical Appeals, because it requires the rhetor to convince us not just of his intellectual acuity or technical competence but of his basic decency and fairness and sensitivity to the audience’s own hopes and fears.”7

Bryan A. Garner – Audio Books, Best Sellers, Author Bio | Audible.com — Do you trust this man?

What DFW’s reasoning suggests is that the error in Judge Newsom’s “LLM concurrences8,” if any, is his humility. That is, he need not bend a knee to what are fundamentally antidemocratic (autocratic?) texts. Rather, I would argue, if the fluid common core is (as I suspect) a function of the democratic approach to lexicography that is inherent in the LLMs themselves, that is a good thing. Maybe a great thing.

Now, where the h*** does that leave us?

The Reasonable Person in One Click

Before you get too worked up, remember: there are good reasons that we found ourselves in this linguistic pickle. Let’s say that you are clerking for Oliver Wendell Holmes in 1903 and he asks you to parse a phrase. Your options are probably something like:

Look up the phrase in a dictionary; and
Analyze other cases that have parsed the phrase; or
You’re fired.

That is to say, dictionaries have long served (and, to some extent, still serve) a useful purpose. I’m not suggesting that they are simply good doorstops.

A large hardcover dictionary being used as a doorstop to hold open a wooden door. The scene shows the corner of a room with a door slightly ajar, and the dictionary placed sideways on the floor, wedged between the door and the doorframe. The dictionary is old with a faded cover, and the door is painted white with some light wear and tear. Natural light spills into the room from outside. — Image courtesy of GPT-4

However, the advent of information technology in the ’90s effectively blew up (or perhaps should have blown up) that type of simplistic analytical framework. Google analytics data would probably give us a more accurate descriptive definition of a word than a dictionary would even in the early ’00s. But it was nascent technology that was difficult to parse and just too [new/different/spooky] for your everyday lawyer or jurist to change their ways. And, who knows, maybe they were right.

The anti-technology movement — Technology scares people

But as the LLM concurrences illustrate, we are probably “there” now. Or at least really close.

And that’s not even the most interesting bit…

Let’s take this a step further. Flash forward a couple of years. The LLMs have gotten really, really good. They have ingested all written human language and are pulling in all recordable data. Maybe we’re even at AGI already, who knows.

Let’s say that there’s a business dispute about the meaning of a particular phrase. Let’s further assume that the stakes are really, really high – it’s effectively make or break litigation for the defendant. Let’s play a game and assume that both sides get to agree on who shall decide the dispute/define the meaning:

(1) The presiding judge; or

(2) A jury of 12 citizens; or

(3) An LLM that ingests all of the evidence from the case.

We know that the LLM is more democratic and more accurate. Who knows, by then we may be able to create an almost identical proxy of the specific person who was reading the specific language in the specific time and place.

If we’re interested in truth, this is great. We are there. We’ve done it! Great job, everybody.

But it’s not only about truth, right? There are state and federal constitutional guardrails that would, of course, make this kind of exercise impossible. And there’s the broader issue of judicial adherence to precedent and, moreover, analytical orthodoxy. All the sorts of stuff you learn about in your first year of law school.

But man, all of that great process and procedure seems to miss the forest for the trees, right? Procedure is good. Due process is great. But, if we can get to the truth, or at least the closest proxy to truth that mankind has ever created, wouldn’t that be more ethical? More “just”?

Judge Newsom gets this. He sees where this is going. Shouldn’t we all get on board? Or are we simply not ready to handle the truth?

The best television show about lawyers is Law & Order (I’m talking early 1990s, apex Jerry Orbach, and not any of the silly spinoffs), although apex L.A. Law is damn good. The best movie about lawyers is My Cousin Vinny (cue the inevitable: “That’s ridiculous! It’s no [The Verdict, 12 Angry Men, A Few Good Men, etc.].” I get it, but Marisa Tomei is not in those films, which is grounds for immediate disqualification). The best book about lawyers is probably To Kill a Mockingbird, but the best “contemporary” legal fiction is John Grisham’s The Pelican Brief (decent film, brilliant book). I’m just here to report the facts.

I am befuddled that Gloria Estefan never achieved consensus critical acclaim in the United States. It befuddles me. She has sold over 100 million records worldwide, and for good reason. Her innovative musical style blended Latin rhythms with pop, dance, and ballads. And the ballads, my goodness, forget about it. The official power ranking of the top 5 Gloria Estefan songs is: (1) Words Get In The Way (1986); (2) Rhythm Is Gonna Get You (1987); (3) 1-2-3 (1988); (4) Conga (1985); and (5) Get On Your Feet (1989) (shout out to my Parks & Rec heads).

Ellen P. Aprill of Loyola Law School wrote a fabulous piece on dictionaries and the Supreme Court back in ’98 entitled “The Law of the Word: Dictionary Shopping in the Supreme Court” (Arizona State Law Journal, Vol. 30, p. 277, 1998). The headline: “dictionaries are not as authoritative, precise, or scholarly as we and the Justices often assume.” Think about that before you click back up on this footnote and get into the head-spinning AI section of this article.

Turtles all the way down.

Shout out to Cecilia Zinti for first bringing this line of cases to my attention. She gets it.

Judge Newsom’s nickname is “The Great Concurrer.” Solid nickname right there. David Lat recently sat down for an interview with Judge Newsom, available here. He gets it.

I would probably give up a small body part to write a passage this insightful and scintillating just once in my life. Strike that, if Robert Johnson sold his soul to the devil for mastery of the guitar, I would definitely trade a toe if I could bang out a paragraph this good, say, maybe, twice a year.

I believe I just coined this phrase right now. Please feel free to amplify, we could make this a “thing.”

Ben’s Substack
