Search by Language

Search by Directory

Search by Corpus Name

Just Starting

A good starting point is the NLTK data set, since it includes an example corpus for almost any task (the NLTK site provides a summary).

/Volumes/Data/Corpora/nltk-data

Additional NLTK corpora are in the directory /Volumes/Data/Corpora/nltk-data-0.2. (Everything not listed here is also in the nltk-data directory above.)
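To use these directories from NLTK, one option is to point the NLTK_DATA environment variable at them before importing nltk. The following is a minimal sketch, not a tested setup for this drive: it assumes the two paths above are mounted on the machine, and the helper names existing_data_dirs and nltk_data_env are made up for illustration.

```python
import os
from pathlib import Path

# The two NLTK data directories named on this page.
NLTK_DATA_DIRS = [
    "/Volumes/Data/Corpora/nltk-data",
    "/Volumes/Data/Corpora/nltk-data-0.2",
]


def existing_data_dirs(candidates):
    """Return only the candidate directories that exist on this machine."""
    return [d for d in candidates if Path(d).is_dir()]


def nltk_data_env(candidates):
    """Build a value for NLTK_DATA: existing directories joined by os.pathsep.

    Returns an empty string when none of the candidates exist
    (for example, when the drive is not mounted).
    """
    return os.pathsep.join(existing_data_dirs(candidates))
```

A possible use is to set os.environ["NLTK_DATA"] = nltk_data_env(NLTK_DATA_DIRS) before the first import nltk, and only when the returned value is non-empty. The nltk-data directory is listed first so that it takes precedence over nltk-data-0.2 when both contain the same corpus.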

By directories

Most top-level directories are also language codes:
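Because the contents of the drive change over time, the current top-level directories can also be enumerated directly. The sketch below rests on two assumptions: that the corpora root is /Volumes/Data/Corpora (inferred from the NLTK paths above), and that a language-code directory has a name of two or three lowercase letters. The function name split_by_language_code is hypothetical.

```python
import re
from pathlib import Path

# Assumed corpora root, inferred from the nltk-data paths on this page.
CORPORA_ROOT = Path("/Volumes/Data/Corpora")

# Assumption: a language-code directory name is 2 or 3 lowercase ASCII letters.
LANG_CODE = re.compile(r"[a-z]{2,3}")


def split_by_language_code(root):
    """Split the directories directly under root into two sorted name lists.

    Returns (language_dirs, other_dirs). Plain files are ignored.
    """
    names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    language_dirs = [n for n in names if LANG_CODE.fullmatch(n)]
    other_dirs = [n for n in names if not LANG_CODE.fullmatch(n)]
    return language_dirs, other_dirs
```

Calling split_by_language_code(CORPORA_ROOT) only works where the volume is mounted. A short directory name that is not actually a language code would be misclassified by this pattern, so the result is a starting point for browsing rather than an authoritative list.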

By language

By corpus

JonesCorpora (last edited 2009-06-12 20:46:47 by ScottLedbetter)