Linguist Start Page
- What ethical considerations need to be made?
- What kinds of tools should I use?
- What should I collect when I collect data?
- How have other linguists done it?
Which tools are available?
For information on the software and hardware tools available, visit the Toolroom.
What software should I use for specific kinds of data?
- For advice about software considerations and options specific to audio data, please visit the Classroom's Digitization of Audio Files page.
- To learn about software options and considerations for images, please visit the Classroom's Digitization of Existing Images page.
- Information about video software can be found in the Classroom's Digitization of Video Files page.
- To learn about the standard for text input, visit the Classroom's Unicode page. To learn how to store data in an enduring format, visit the Classroom's XML pages. To learn how to use stylesheets to organize the layout of data, visit the Classroom's Stylesheet page.
Do I need to collect anything besides language data?
Along with data, it is imperative to collect metadata. To learn about metadata and how to compile it, visit the Classroom's Metadata page. To learn about and use the OLAC Repository Editor (ORE), a tool for metadata creation, visit the Workroom's Metadata page.
How do I build a corpus?
A linguistic corpus is also necessary. To learn how to collect material and build a corpus, visit the Classroom's Archives page. To learn how to record this corpus in an enduring format, visit the Classroom's XML pages. To learn how to create a lexicon, visit the Workroom's Lexical Analysis and Output page.
Should I add anything to the data?
Annotating a corpus assigns meaning to the data and enables future researchers to access your insights. To learn how to annotate a corpus, visit the Classroom's Annotation page. When annotating a corpus, pay attention to the terminology you use. To learn about terminological mapping, visit the Workroom's Terminology page. To view the General Ontology for Linguistic Description (GOLD), visit the Ontology Tree.
To read examples of data conversion from legacy formats to best-practice formats, visit the Case Studies. To learn about other documentation projects in light of best practices, visit the Documentation Projects section of the Case Studies and the UNESCO Register of Good Practices in Language Preservation.