Gary F. Simons, SIL International,
Brian Fitzsimons, U of Arizona,
D. Terence Langendoen, U of Arizona,
William D. Lewis, California State U, Fresno,
Scott O. Farrar, University of Bremen,
Alexis Lanham, U of Arizona,
Ruby Basham, U of Arizona,
Hector Gonzalez, California State U, Fresno

A Model for Interoperability: XML Documents as an RDF Database

We propose a model for a Resource Description Framework (RDF) database for interlinear glossed text (IGT) created from documents encoded in the Extensible Markup Language (XML) using markup metaschemas. A metaschema, constructed using the Semantic Interpretation Language (SIL) (Simons 2004), maps XML-encoded documents to a common, semantically rich RDF database. The RDF database can in turn be searched using RDF search engines, which provide the key functionality of a database management system (DBMS). Simons et al. (2004) give a proof of concept of the model by mapping differently encoded XML lexicons to a common RDF form. Search capability is provided across these data using SeRQL, a SQL-like query language built around the Sesame RDF database program. In this paper, we extend these results to corpora of interlinear glossed text obtained from various sources, including some from the Web following Lewis (2003). Each corpus is combined with a language profile for its language variety, which provides basic grammatical information about that variety.
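The general idea of mapping differently encoded XML documents to one common set of triples, and then searching across them, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two XML encodings, the `ex:` property names, the dictionary-based mapping tables (which only stand in for a metaschema and are not SIL syntax), and the in-memory pattern matcher (which only stands in for an RDF search engine such as one queried with SeRQL).

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML encodings of the same kind of lexical entry.
DOC_A = "<lexicon><entry id='e1'><form>perro</form><gloss>dog</gloss></entry></lexicon>"
DOC_B = "<dict><lx key='e2'><hw>gato</hw><def>cat</def></lx></dict>"

# Hypothetical stand-ins for metaschemas: each maps one encoding's
# element and attribute names to a shared vocabulary (ex:form, ex:gloss).
META_A = {"entry": "entry", "id_attr": "id",
          "props": {"form": "ex:form", "gloss": "ex:gloss"}}
META_B = {"entry": "lx", "id_attr": "key",
          "props": {"hw": "ex:form", "def": "ex:gloss"}}


def to_triples(xml_text, meta):
    """Map one XML document to (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for entry in root.iter(meta["entry"]):
        subject = "ex:" + entry.get(meta["id_attr"])
        triples.append((subject, "rdf:type", "ex:LexicalEntry"))
        for child in entry:
            prop = meta["props"].get(child.tag)
            if prop is not None:  # skip elements the mapping does not cover
                triples.append((subject, prop, child.text))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Both documents land in one common store despite their different markup.
store = to_triples(DOC_A, META_A) + to_triples(DOC_B, META_B)
forms = sorted(o for _, _, o in match(store, p="ex:form"))
print(forms)
```

A single pattern over `ex:form` retrieves forms from both sources, even though one document marks them as `form` and the other as `hw`. That is the kind of interoperability the model aims for, here reduced to a toy scale.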