In a recent opinion, Judge Posner wanted to know more about the meaning of "harboring," and he didn't find what he wanted in a dictionary. "Dictionary definitions are acontextual," he wrote, "whereas the meaning of sentences depends critically on context, including all sorts of background understandings."
I wish he had cited The Best Student Comment Ever, aka The Dictionary Is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. If he had, he would have discovered corpus linguistics, and that might have led him to the Corpus of Contemporary American English (COCA).
As it was, he simply turned to Google. While it's true that Google is a corpus -- check out the Web as Corpus Community -- it's not a very good way to do what Posner was trying to do, which was to find the ordinary meaning of "harboring."
Judge Posner should have just called Utah Supreme Court Justice Tom Lee for some tips. Justice Lee has used corpus linguistics in two opinions (see here for the first one), with the help of Stephen Mouritsen (author of the aforementioned comment).
Have I mentioned that I think corpus linguistics is a big deal that is going to transform legal scholarship? If not, I just wanted to get that on the record because I have been telling everyone I know.
We are talking about word frequencies today in Corpus Linguistics, and as my in-class experiment, I created a Wordle cloud from my most recent article, Private Ordering with Shareholder Bylaws:
If you don't know what the article is about, you can read the abstract, but reading the word cloud gives you a pretty good idea at a glance. Could word frequencies help us find the scholarship that most interests us?
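For anyone curious about what sits underneath a word cloud, here is a minimal sketch of the counting step in Python. It is not how Wordle works internally. The sample text is a made-up stand-in rather than the article itself, and the stopword list and the `word_frequencies` helper are my own illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "that",
             "with", "for", "on", "by", "can"}

def word_frequencies(text, top_n=5):
    """Return the top_n most common non-stopword words as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Made-up sample text, standing in for the body of an article.
sample = (
    "Shareholder bylaws allow shareholders to order corporate affairs. "
    "Bylaws adopted by shareholders can limit the board, and the board "
    "can amend bylaws."
)

top_words = word_frequencies(sample, top_n=3)
print(top_words)
```

A word cloud simply draws the words with higher counts in larger type. Note that this sketch treats "shareholder" and "shareholders" as different words; grouping them would take a further step such as lemmatization.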
Tomorrow is my first class in Corpus Linguistics, the data-driven study of language. I am lucky that Mark Davies teaches at BYU, and grateful that he has agreed to let me sit in on his course. I have blogged about corpus linguistics here, here, and here, but, until now, I have had no formal training in the field.
Today I was reading tomorrow's assignment out of Susan Hunston's Corpora in Applied Linguistics, and I found this sentence in chapter 1:
The main argument in favour of using a corpus is that it is a more reliable guide to language use than native speaker intuition is.
Just two pages later, however, Hunston offers this on the role of intuition:
[Intuition] is an essential tool for extrapolating important generalizations from a mass of specific information in a corpus.
In addition to reading the first chapter of Hunston, we were assigned to read the last chapter (no cliffhangers in this course), which tells us that corpus linguistics can make life simpler ... and more complex. The source of complexity?
New ideas about language emerge and the old ones need re-evaluation.
This is going to be fun.
I have blogged twice about corpus linguistics (here and here), and both posts have generated substantial interest, so in an effort to give additional credit where credit is due -- and to facilitate exploration of the application of corpus linguistics to law -- I am linking to Neal Goldfarb's briefs, including the AT&T brief that attracted so much attention recently. Goldfarb is not trained as a linguist, but he has elevated the profile of linguistics through his work as a practicing lawyer. You can read about him here. His blog is called LAWnLinguistics.
For other pioneers in this area, readers may be interested in Clark D. Cunningham, Judith N. Levi, Georgia M. Green & Jeffrey P. Kaplan, Plain Meaning and Hard Cases, 103 Yale L.J. 1561 (1993) and Clark D. Cunningham & Charles J. Fillmore, Using Common Sense: A Linguistic Perspective on Judicial Interpretations of “Use a Firearm,” 73 Wash. U. L. Q. 1159 (1995).
By the way, Mark Liberman, Director of the Linguistic Data Consortium at the University of Pennsylvania, is astonished that it has taken lawyers so long to get around to linguistics. Yes, I agree. I am astonished, too.
Last month I blogged about the "best student comment ever," the first law review article to rely on corpus linguistics as the basis for analysis. [See below for update.] As I have worked with corpus linguistics (through the comment's author, Stephen Mouritsen) over the past few months, I have come to conclude that it will revolutionize the study of law, at least insofar as we are attempting to understand word usages.
Today, my former colleague and current Utah Supreme Court Justice Tom Lee used corpus linguistics in a lengthy concurring opinion (the relevant section starts at page 34). In this opinion, Justice Lee is interpreting the word "custody," and he brings corpus linguistics to the fight. Of course, it's no accident that Stephen Mouritsen is Justice Lee's law clerk, but the bigger point here is that Justice Lee was persuaded -- as I am -- of the value of corpus linguistics to shed light on this interpretive question. Justice Lee's colleagues are not enamored with the approach, but you can read the opinions for yourself and see who gets the better of the argument.
This seems to be the first judicial opinion anywhere using corpus linguistics, but it will surely not be the last. If you are as intrigued by corpus linguistics as I am, you might be interested in this paper by Mark Davies, a BYU Professor of Corpus Linguistics who is a leader in this field, on how one might use the Corpus of Contemporary American English. I am told that a similar paper on the Corpus of Historical American English is forthcoming.
UPDATE: As noted by Neal Goldfarb, the first law review article to use a linguistic corpus was written by Charles Fillmore and Clark Cunningham, Using Common Sense: A Linguistic Perspective on Judicial Interpretations of 'Use a Firearm', 73 Wash. U. L.Q. 1159 (1995). Indeed, Mouritsen cites the article in his comment.
Mouritsen’s comment differs from the Fillmore and Cunningham article both in its method and its claim. Fillmore and Cunningham use corpus linguistics to examine the word "use" in an attempt to understand what it might mean to "use a firearm." They use the British National Corpus to examine the range of possible meanings of that statutory term in much the same way that a lexicographer might rely on a citation file to find usage examples.
Rather than explore the range of possible uses of a statutory term, Mouritsen relies exclusively on corpus-based data to attempt to demonstrate the “ordinary meaning” of a statutory term in a particular context. His article is the first to do this. Thanks to Neal for raising the issue, causing me to make a more precise statement about the contribution of the Mouritsen piece.
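To give a rough sense of what pulling usage examples from a corpus looks like, here is a small keyword-in-context sketch in Python. It is only an illustration, not the method used in either article and not the interface of the British National Corpus or COCA. The `concordance` helper and the two sample sentences are my own inventions.

```python
import re

def concordance(text, keyword, width=3):
    """Return keyword-in-context lines: up to `width` words on each side of every hit."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            # Context windows ignore sentence boundaries in this simple sketch.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Made-up sentences standing in for a corpus.
sample = "He did not use a firearm. They use the money to buy a firearm."

hits = concordance(sample, "use")
for line in hits:
    print(line)
```

Reading down a list of such lines is how one starts to see which senses of a word actually occur, and how often, in a given context.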
The best student comment I have ever read was published by the BYU Law Review last year. The Dictionary Is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning was written by Stephen Mouritsen, who is now clerking with Justice Tom Lee on the Utah Supreme Court and will start at Cravath this fall. Stephen's comment was cited by the NYT today, though that citation did not do the piece justice. By the way, Stephen and I are writing an article this summer using corpus linguistics. Very cool stuff.