July 01, 2006
Variation in English Words and Phrases
Posted by Gordon Smith

BYU linguist Mark Davies has developed a very interesting website called VIEW. Here is a description of it from the BYU press release:

The "VIEW" in view.byu.edu stands for Variation In English Words and phrases, and the site uses as its database the 100-million-word British National Corpus. Davies is among a rare breed who loves to gather millions of words of written and spoken communication and catalog them into a collection called a "corpus." In addition to building an interface for the material provided by the University of Oxford authors of the British corpus, Davies has built his own corpus for Spanish and is putting the finishing touches on his Portuguese version. Those two projects were funded by grants from the National Endowment for the Humanities totaling more than $300,000.

It's not simply a matter of dumping words into a database. "You want the corpus to represent the range of types of usage, so you need to first determine that you want a certain percentage from newspapers, a certain portion from books, another portion from speeches, and so on," Davies said. "And then within books, you balance that between fiction and nonfiction, and then within those, between westerns and romance and engineering and history, for example."

The entries then must be tagged as particular parts of speech and organized in an architecture and interface that allows them to be accessed easily. That's Davies' specialty and the reason that he was given access to the British National Corpus. He's already on tap to build the interface for the first American National Corpus, currently under construction. And he's building himself the largest historical corpus of English (the British entries are all post-1970), which will include a quarter-billion words produced from 1500-1900. That project will enable study of how usage and meaning of words has changed over time.
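The balancing step Davies describes can be pictured as splitting a word budget by share, first across sources and then within each source. Here is a rough Python sketch of that idea. The function name `allocate_words` and every percentage are made up for illustration; the press release gives no actual proportions, and this is not how VIEW or the British National Corpus was built.

```python
def allocate_words(total_words, shares):
    """Split a word budget across categories according to fractional shares."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    # Rounding each category separately can leave the sum a word or two off
    # the total when the shares do not divide evenly.
    return {name: round(total_words * share) for name, share in shares.items()}


# Hypothetical top-level split of a 100-million-word corpus.
top_level = allocate_words(
    100_000_000,
    {"newspapers": 0.3, "books": 0.5, "speeches": 0.2},
)

# Hypothetical second-level split within books.
books = allocate_words(top_level["books"], {"fiction": 0.5, "nonfiction": 0.5})
```

The same function can be applied again to `books["fiction"]` to divide it among genres such as westerns and romance.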

The interface is fairly off-putting, but take the three-minute tour, and you will be hooked.

