Dena'ina Case Study: From Cassette to Easy-Access Software

The content of this page was developed from the data of Dr. James Kari. The conversion work was undertaken as part of the DATA (Dena'ina Archiving, Training, and Access) project, a 3-year project organized by LINGUIST List in collaboration with the Alaska Native Language Center and funded by the National Science Foundation, under grant OPP-0326805.

The initial audio digitization was funded by the University of Alaska President's Special Projects Fund.

Introduction: Easy-Access Software

This case study shows the steps taken to convert cassette and open-reel recordings of traditional stories in Dena'ina Athabascan into a sentence-by-sentence, user-friendly HTML display of the stories and their translations. "Easy-access" principles were key in the development of the project: the final products are intended to be simple to use and based on archival-quality files. Specifically, the developers of the project, Dr. Gary Holton, Dr. James Kari, and graduate researchers Andrea Berez, Sadie Williams, and Olga Müller, kept the following points in mind:

The Dena'ina Audio Collection

Dena'ina is a fairly well-documented language of the Cook Inlet region of Alaska. From the 1960s through the 1990s, several linguists collected recordings of spoken Dena'ina on audio cassette and open-reel tape. These recordings included traditional stories, prayers, songs, wordlists, and ethnographic information. Many of these cassettes resided in the Alaska Native Heritage Center Archives, but some were in other locations, such as private homes and various Alaskan libraries. Linguist James Kari has been working to gather the audio together to form the Dena'ina Audio Collection, a comprehensive assemblage of linguistic recordings from the past forty years.

In 2003 and 2004, graduate researchers at the University of Alaska Fairbanks converted the tapes to digital format. The Dena'ina Audio Collection currently includes over 200 audio CDs, and the collection is growing as more Dena'ina audio recordings surface.

A fraction of the audio has been transcribed and translated into English over the years, and some has been published in various books on the Dena'ina people. Dr. Kari worked with Dena'ina speakers to correct previous translations and typed them into WordPerfect. Dena'ina orthography has only one non-Roman character, ł, and its capital, Ł. In the past, this character was typed as a backslash (\) when the ł character was not available, as on typewriters or in pre-Unicode days. Dr. Kari's texts contained both versions. (Note: hatted-h and hatted-y were used in older Dena'ina texts, but not in Dr. Kari's.)
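The replacement of the legacy stand-in character can be sketched in a few lines of Python. This is only an illustration, not the procedure the project used: the function name `convert_barred_l` is hypothetical, the stand-in is assumed to be the backslash, and the sample strings are placeholders rather than real Dena'ina words. Because a single stand-in cannot show case, the sketch always produces lowercase ł and leaves capital Ł to manual review.

```python
import unicodedata

BARRED_L = "\u0142"  # ł (LATIN SMALL LETTER L WITH STROKE)


def convert_barred_l(text: str, standin: str = "\\") -> str:
    """Replace the legacy stand-in character with ł and normalize to NFC.

    Assumption: the stand-in never occurs with any other meaning in the text.
    Capital Ł cannot be recovered automatically and is left for manual review.
    """
    converted = text.replace(standin, BARRED_L)
    # NFC keeps any combining sequences elsewhere in the text in composed form.
    return unicodedata.normalize("NFC", converted)


if __name__ == "__main__":
    # Placeholder strings, not real Dena'ina words.
    print(convert_barred_l("a\\a"))    # stand-in replaced
    print(convert_barred_l("ła\\"))    # existing ł is left untouched
```

Text that already contains ł passes through unchanged, so files that mix both versions can be run through the same function.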

Digitization of the Dena'ina audio cassettes

Initially, the Alaska Native Language Center in Fairbanks received a grant from the University of Alaska President's Special Projects Fund to support converting the Dena'ina audio collection to digital form. Graduate researchers used a dual tape deck to feed the audio signal through an Edirol and into a PC. They used the Peak software to record the digital files, which they then burned onto CDs. Once all two hundred cassettes had been digitized, the collection was backed up both to a free-standing hard drive and to the Arctic Region Supercomputing Center (http://www.arsc.edu).

More on digitization of audio files

Aligning with ELAN

Dr. Kari provided both the WordPerfect file and the digital audio file of some 20 traditional Dena'ina stories to graduate researcher Andrea Berez. After converting the files to Unicode, she used the ELAN software, created by the Max Planck Institute for Psycholinguistics, to create a two-tier alignment of the audio selections. ELAN also produces an XML file, which can be used for archiving purposes and, more importantly to this project, can be converted to a user-friendly HTML display with XSL (eXtensible Stylesheet Language).
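To give a sense of what the ELAN XML holds, the sketch below reads a tiny hand-written file in the style of ELAN's annotation format and pairs each sentence with its translation by time span. It is an illustration under stated assumptions, not the project's code: the tier names `text` and `translation`, the sample annotations, and the function name `read_aligned_sentences` are all hypothetical, and both tiers are assumed to be time-aligned with matching boundaries.

```python
import xml.etree.ElementTree as ET

# Minimal hand-written sample; real ELAN files carry more metadata.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="5200"/>
  </TIME_ORDER>
  <TIER TIER_ID="text">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>(first sentence)</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>(second sentence)</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="translation">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>(first translation)</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a4" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>(second translation)</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_aligned_sentences(eaf_xml, text_tier="text", translation_tier="translation"):
    """Return one dict per sentence with start/end times (ms), text, and translation."""
    root = ET.fromstring(eaf_xml)
    # Map each time slot ID to its offset in milliseconds.
    times = {s.get("TIME_SLOT_ID"): int(s.get("TIME_VALUE")) for s in root.iter("TIME_SLOT")}

    def tier_annotations(tier_id):
        for tier in root.iter("TIER"):
            if tier.get("TIER_ID") == tier_id:
                return [
                    (times[a.get("TIME_SLOT_REF1")],
                     times[a.get("TIME_SLOT_REF2")],
                     a.findtext("ANNOTATION_VALUE", default=""))
                    for a in tier.iter("ALIGNABLE_ANNOTATION")
                ]
        raise KeyError(f"tier not found: {tier_id}")

    # Pair the two tiers by identical (start, end) spans.
    translations = {(start, end): value for start, end, value in tier_annotations(translation_tier)}
    return [
        {"start_ms": start, "end_ms": end, "text": value,
         "translation": translations.get((start, end), "")}
        for start, end, value in tier_annotations(text_tier)
    ]


if __name__ == "__main__":
    for row in read_aligned_sentences(SAMPLE_EAF):
        print(row)
```

Each resulting record carries what a sentence-by-sentence display needs: the span of audio to play, the Dena'ina sentence, and its English translation.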

See the detailed steps for aligning with ELAN

More on character conversion

More on interlinearized text

Using XSL to transform to HTML

After the alignments were complete, an XSL stylesheet was developed to render the XML files produced by ELAN into HTML. A few pieces of needed information were added to each XML file (for example, the story titles in Dena'ina and English), and then each file was converted to HTML. The stylesheet allowed for the addition of graphics, fonts, colors, and QuickTime plug-ins to play each audio selection. QuickTime was used as the media player because of the ability to use the <starttime> and <endtime> elements to play small portions of the audio, eliminating the need to cut the audio file into many small files.
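The idea of playing one sentence out of a single long audio file can be illustrated outside XSL as well. The project itself used an XSL stylesheet; the Python sketch below only mimics the kind of HTML such a stylesheet might emit. It assumes that the start and end times are passed to the QuickTime plug-in as `starttime` and `endtime` attributes of an `embed` tag, and that times are written as hours:minutes:seconds:thirtieths of a second. The function names, the CSS class names, and the file name `story.wav` are hypothetical.

```python
from html import escape


def ms_to_quicktime(ms: int) -> str:
    """Format milliseconds as HH:MM:SS:FF, with FF assumed to be 30ths of a second."""
    total_seconds, rem_ms = divmod(ms, 1000)
    frames = rem_ms * 30 // 1000
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def sentence_html(audio_src: str, start_ms: int, end_ms: int,
                  text: str, translation: str) -> str:
    """Build the HTML for one sentence: a player limited to its span, plus both texts."""
    return (
        '<div class="sentence">\n'
        f'  <embed src="{escape(audio_src, quote=True)}"'
        f' starttime="{ms_to_quicktime(start_ms)}"'
        f' endtime="{ms_to_quicktime(end_ms)}"'
        ' autoplay="false" controller="true" height="16" width="200">\n'
        f'  <p class="text">{escape(text)}</p>\n'
        f'  <p class="translation">{escape(translation)}</p>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(sentence_html("story.wav", 0, 2500, "(first sentence)", "(first translation)"))
```

Because every sentence points at the same source file with a different time span, the recording never has to be split into separate clips.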

See the detailed steps for making and using the XSL stylesheet

More on stylesheets and XSL

The Finished Product

Below is a screenshot of the final product, a sixteen-story compilation on CD-ROM. This project unites Dena'ina audio and text with an English translation for the first time.

Finished Product

Follow the path of the Dena'ina Data

  1. Get started: Summary of the Dena'ina conversion
  2. Digitize audio data: Audio pages (Classroom)
  3. Convert characters to Unicode: Conversion page (Classroom)
  4. Align text: Interlinearized glossed text pages (Classroom)
  5. Store data: XML pages (Classroom)
  6. Render data: Stylesheets pages (Classroom)
