This page addresses the technologies involved in interlinearizing texts. XML serves well as an archival format, especially when terms are related to a generally accepted ontology of linguistic terms. XSL stylesheets can be used to convert an XML file to a presentation format.


XML is a human-readable archival format that can be converted to a presentation format with XSL stylesheets. It provides a portable storage method for interlinear glossed texts (IGTs) when standard tags are used and linked to GOLD (General Ontology for Linguistic Description). GOLD is a standard ontology of linguistic terms that states the relationships between possible linguistic features. If a term used in annotation is related to a feature (or set of features) in the ontology, the choice of annotation term no longer matters: a machine can always determine whether it is the same as or different from another term. This allows material from disparate languages to be compared and searched.

XML uses descriptive tags to mark up data, which makes it more portable than markup languages whose tags specify formatting. Using these tags, linguists can describe the data in a hierarchical manner. For example, the tags <word> and </word> can surround a word in an IGT, and the tags <phrase> and </phrase> can surround a phrase. The data therefore remains human-readable and, if the tags are consistent with generally accepted standards, can be understood by anyone familiar with those standards.

Bow, Hughes and Bird (2003) supply the following model for using XML to represent a four-level IGT:

  <item type="title">The Title</item>
      <item type="gls">A phrasal translation</item>
        <item type="txt">Word</item>
              <item type="txt">Morph</item>
              <item type="gls">Gloss</item>
              <item type="txt">Morph</item>
              <item type="gls">Gloss</item>
              <item type="txt">Morph</item>
              <item type="gls">Gloss</item>
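
The indentation in the listing above suggests container elements for each level that are not shown. A well-formed version might look like the following sketch. The element names interlinear-text, phrase, word and morph are assumptions made for illustration (phrase and word echo the tags mentioned earlier) and are not confirmed by the listing itself; only the item elements and their type values come from the model above.

  <?xml version="1.0" encoding="UTF-8"?>
  <!-- Container element names are hypothetical; item/type come from the model. -->
  <interlinear-text>
    <item type="title">The Title</item>
    <phrase>
      <item type="gls">A phrasal translation</item>
      <word>
        <item type="txt">Word</item>
        <!-- One morph element per morpheme, pairing its form with its gloss. -->
        <morph>
          <item type="txt">Morph</item>
          <item type="gls">Gloss</item>
        </morph>
        <morph>
          <item type="txt">Morph</item>
          <item type="gls">Gloss</item>
        </morph>
        <morph>
          <item type="txt">Morph</item>
          <item type="gls">Gloss</item>
        </morph>
      </word>
    </phrase>
  </interlinear-text>

Nesting of this kind makes the four levels (text, phrase, word, morph) explicit, so a program does not have to infer from position which gloss belongs to which form.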

XSL stylesheets can be used to convert an archival XML document to a presentation format. This is useful for websites and other versions of the data intended for a wider audience that is less willing to read raw XML to find information.

An XSL stylesheet contains a set of rules that specify how each element in an XML document is to be presented. Applying a stylesheet does not change the original file, so the archival copy remains intact. This is a key asset, as the integrity of the archival copy should be the utmost priority in language documentation.
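
As a minimal sketch of such rules, the XSLT 1.0 stylesheet below turns item elements into simple HTML. It assumes only the item element and the type values title, txt and gls used in the model above; the choice of HTML output and of the h1, b and i elements is an illustrative assumption, not a recommended layout for interlinear text.

  <?xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" indent="yes"/>
    <!-- Ignore whitespace-only text between elements in the source. -->
    <xsl:strip-space elements="*"/>
    <!-- Wrap the whole document in a minimal HTML page. -->
    <xsl:template match="/">
      <html>
        <body>
          <xsl:apply-templates/>
        </body>
      </html>
    </xsl:template>
    <!-- The title item becomes a heading. -->
    <xsl:template match="item[@type='title']">
      <h1><xsl:value-of select="."/></h1>
    </xsl:template>
    <!-- Text items (words and morphs) are shown in bold. -->
    <xsl:template match="item[@type='txt']">
      <b><xsl:value-of select="."/></b>
      <xsl:text> </xsl:text>
    </xsl:template>
    <!-- Gloss items are shown in italics. -->
    <xsl:template match="item[@type='gls']">
      <i><xsl:value-of select="."/></i>
      <xsl:text> </xsl:text>
    </xsl:template>
  </xsl:stylesheet>

Each xsl:template is one rule: its match attribute selects elements in the XML document, and its body states what to emit for them. An XSLT processor reads the archival file and writes the result as a separate output, so the source document is never modified.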

