Linguistics: Corpus Linguistics

Corpora Available at WFU

LDC Corpora

LDC Catalog
WFU has been a member of the Linguistic Data Consortium (LDC) since the 2012 membership year. You can request access to any corpus from 2012 or later by emailing Carol.

WFU has already downloaded the following corpora:

You need to be on campus or connected to the VPN to download the files. Contact Carol for more information on how to get access to these data sets.

Old English Corpus

Dictionary of Old English Web Corpus
Contains at least one copy of every surviving Old English text. More than one copy is included where a text is significant because of its dialect or date. In total, the DOE Web Corpus represents over three million words of Old English and fewer than a million words of Latin.

Corpora with Special Restrictions

Access to the full-text version is available only for faculty and graduate student thesis projects. Contact Carol if you need access.


Articles about Corpus Linguistics

It can be tricky to formulate a search that separates articles discussing corpus linguistics as a methodology from articles that apply it. LLBA provides some subject headings that can help with this.

First, make sure you're using the Advanced Search screen.

Choose "Subject Heading (all) SU" in the first right-side drop-down menu.

In the search box type:

  • "corpus linguistics" if you're interested in methodology
  • "corpus analysis" if you're interested in applications

Make sure you include the quotation marks.

In addition, a keyword search for corpus approach (no quotes) tends to return applied articles, and corpus methodology (no quotes) tends to return methodological articles.
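
The search choices above can be summarized as a small lookup. This is only a sketch in Python for readers who like to keep their search strategy in a script: the function `search_text` and the two dictionaries are hypothetical names made up for illustration, and nothing here talks to LLBA itself. It simply returns the text you would type into the search box.

```python
# Hypothetical helper: these names are illustrative and not part of LLBA.

# Subject-heading terms (used with "Subject Heading (all) SU").
SUBJECT_TERMS = {
    "methodology": "corpus linguistics",
    "applications": "corpus analysis",
}

# Keyword terms that tend to return the same two kinds of articles.
KEYWORD_TERMS = {
    "methodology": "corpus methodology",
    "applications": "corpus approach",
}


def search_text(focus: str, use_subject_heading: bool = True) -> str:
    """Return the text to type into the search box for the given focus."""
    if focus not in SUBJECT_TERMS:
        raise ValueError("focus must be 'methodology' or 'applications'")
    if use_subject_heading:
        # Subject-heading searches must include the quotation marks.
        return f'"{SUBJECT_TERMS[focus]}"'
    # Keyword searches are entered without quotes.
    return KEYWORD_TERMS[focus]


if __name__ == "__main__":
    print(search_text("methodology"))
    print(search_text("applications", use_subject_heading=False))
```

Keeping the two term sets in separate dictionaries mirrors the guide: the subject-heading route is the more precise one, and the keyword route is a fallback.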
