The Corpus of Historical American English (COHA)

Dataset Details

Data Summary

The Corpus of Historical American English (COHA), created by Mark Davies, is the largest structured corpus of historical English. It contains more than 475 million words of text from the 1820s through the 2010s, which makes it 50-100 times as large as other comparable historical corpora of English, and it is balanced by genre decade by decade. The corpus was created with a grant from the National Endowment for the Humanities (NEH) that ran from 2008 to 2010. For more information, see https://www.corpusdata.org/corpora.asp.

Temporal Extent

1820s-2010s

Institutional Partners

UW Libraries