Friday, January 27, 2012

A Very Cool Tool

Want to see how often a word is used in books by year?  Books.google.com has something called an Ngram Viewer that lets you see how often a particular word or phrase is used across the centuries.  There are some surprises: "rifle" appears surprisingly often in the seventeenth century, then falls off, rising in frequency again in the nineteenth century.  Of course, remember the limitations of those OCRed books: there are a lot of words that do not appear because the print was hard to OCR.

The other limitation is that the starting date of many series is used as the publication date, even if the issue in question was published centuries later (and sometimes the date is simply wrong).  For example, the Ngram for the word "feminism" had matches in the early seventeenth century but then nothing until the twentieth century.  What are those early matches?  One is an OCR error on the Latin word "feminis" in the Malleus maleficarum de Lamiis (and for all I know, it might even mean the same thing, since I think this is a Latin work on witchcraft) and the other is from Chemical Abstracts, vol. 87.  I am sorry, but I do not believe that the Chemical Abstracts series really starts in 1620!

1 comment:

Rich Rostrom said...

I've been playing with Ngrams for quite a while now. I didn't realize it went back before 1800.

Incidentally, it will search for phrases as well as words. This allows one to compare the use of a term in a common phrase with its total use, which reveals how often it is used outside that phrase.

It's fun to look up fairly common words and phrases, and discover how they came and went over the years. There are common phrases now that were unknown in the 1800s, and phrases and terms that were far more common then than now.