Tuesday, April 12, 2011

The Corpus of Historical American English

Technique Tuesday

Using language well is one of the hallmarks of a writer. (After all, if no one ever learn'dja ta-talk good, why'dja think you-kin tell stories?)

One of the reasons it's critical for writers to read widely is that they must develop a sense of the differences--sometimes subtle--in how people have used our language in different times and places. That sense can make the difference between sounding false and achieving verisimilitude (or, for those of you more comfortable with the colloquial, "the difference between screwing it up and getting it right").

That's why you should run, not walk, to add http://corpus.byu.edu/coha/ to your bookmarks. The link will take you to the Corpus of Historical American English (COHA). The site shows you how frequently a given word appeared in published works in each decade of the last two hundred years.

For example, the word 'airship' comes out of nowhere with 30 instances in the 1860s, has none in the 1870s, ramps up to a huge spike (195) in the 1910s, falls to only 4 instances in the 1950s, and then climbs back to a respectable 38 in the 1980s. Notice the linguistic shadow of the rise and fall of the technology?

Nothing can replace being well read, but COHA is a great way to spot-check potentially anachronistic words and phrases.

Addendum: A reader kindly pointed out that the 30 airship instances in the 1860s all come from Tom Swift and His Airship, which somehow got mixed in with period-appropriate text. So the resource isn't perfect, and as with everything else on the Internet, one should always check the sources.


Image: luigi diamanti / FreeDigitalPhotos.net