It is important to preserve language documentation for future generations, particularly when documenting endangered languages. Every linguist who collects data must be a kind of archivist, making sure that the language data collected will transcend both time and technology.
This area of the school presents information on producing archival-quality language documentation materials and preparing existing field corpora for storage in long-term archives, using practices that allow the materials to endure.
A corpus is any set of linguistic materials relating to a language; it might consist of audio recordings, video recordings, transcriptions, field notes, a grammar, a lexicon, grammatical analyses, or ethnographic information. Any linguist who works in the field creates a corpus of language data. When creating an archive-ready corpus, it is important to ensure that the materials are in good shape and clearly labeled at every stage in their production. Language documentation is a long-term process, and there is always the chance that someone else may have to complete another linguist's work.
E-MELD strongly recommends that linguists place their corpora in a reputable archive, since an archive has the resources to keep up with changes in technology and make sure language documentation remains accessible over time. However, even if a corpus is not deposited in an archive, it is important to take steps to make materials durable and lasting.
Creating a Corpus
How to Find an Archive
How to Establish an Archive