Transcription tools from LDC
Three open-source tools from LDC
One such tool is XTrans, a next-generation transcription tool designed to support transcription tasks in multiple languages and on multiple platforms. Its waveform display and playback component can load multiple audio files with different file formats and sampling rates at the same time. XTrans also supports virtual channels, which provide a natural way to transcribe overlapping speech. A virtual channel represents an audio source, rather than a physical channel, that is identified and transcribed in a given audio recording. A single-channel audio file can contain many audio sources. For instance, a round-table talk show with five speakers contains five audio sources in a single-channel audio recording. With XTrans, that file is modeled as an audio file with five virtual channels, and each virtual channel is transcribed independently. Additionally, if a recording consists of audio files with different sampling rates, XTrans automatically resamples them to the same rate. The LDC has used XTrans for many varied projects, and the tool has proven quick to learn and easy to master. We are currently working through licensing issues with the organizations that provided libraries for XTrans. Once those issues are resolved, we will make XTrans generally available.
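The virtual channel idea described above can be illustrated with a small data model. This is only a sketch under stated assumptions, not XTrans code or its file format: the class names `Segment`, `VirtualChannel` and `Recording`, the `overlaps` helper, and the file name `roundtable.wav` are all hypothetical. It shows one single-channel file carrying several independently transcribed sources, which makes overlapping speech easy to represent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Segment:
    start: float  # seconds from the start of the recording
    end: float
    text: str


@dataclass
class VirtualChannel:
    """One audio source (e.g. one speaker), not a physical channel."""
    source: str
    segments: List[Segment] = field(default_factory=list)

    def add(self, start: float, end: float, text: str) -> None:
        if end <= start:
            raise ValueError("segment must end after it starts")
        self.segments.append(Segment(start, end, text))


@dataclass
class Recording:
    """A single audio file modeled as a set of virtual channels."""
    audio_file: str  # hypothetical file name; never opened in this sketch
    channels: Dict[str, VirtualChannel] = field(default_factory=dict)

    def channel(self, source: str) -> VirtualChannel:
        # Create the virtual channel on first use, then reuse it.
        return self.channels.setdefault(source, VirtualChannel(source))

    def overlaps(self) -> List[Tuple[str, str, float, float]]:
        """List time spans where two different sources speak at once."""
        tagged = [(ch.source, seg)
                  for ch in self.channels.values()
                  for seg in ch.segments]
        found = []
        for i, (src_a, a) in enumerate(tagged):
            for src_b, b in tagged[i + 1:]:
                if src_a != src_b and a.start < b.end and b.start < a.end:
                    found.append((src_a, src_b,
                                  max(a.start, b.start), min(a.end, b.end)))
        return found


show = Recording("roundtable.wav")
show.channel("host").add(0.0, 4.0, "Welcome back to the show.")
show.channel("guest").add(3.0, 6.0, "Thanks for having me.")
print(show.overlaps())  # [('host', 'guest', 3.0, 4.0)]
```

Because each source has its own segment list, the two speakers' transcripts stay separate even where their speech overlaps in time.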
Two other general-use tools developed by the LDC, the Annotation Graph Toolkit (AGTK) and the Champollion Tool Kit (CTK), are available on SourceForge.net. Like XTrans, these tools represent creative solutions to difficult problems:
- The Annotation Graph Toolkit (AGTK) is a primary resource for annotation tool development at LDC. AGTK is a suite of software components for building tools that annotate linguistic signals, that is, time-series data documenting any kind of linguistic behavior (e.g., audio or video). Rather than designing and implementing data structures and user interfaces from scratch for each new task, as in the traditional approach, developers can use AGTK to quickly prototype tools and define data formats. The flexible nature of the AG model means that data representations can be rapidly modified in response to evolving annotation task definitions. AGTK allows for rapid deployment of highly specialized, task-specific tools that maximize user interface ergonomics and improve the speed and accuracy of annotation.
- The Champollion Tool Kit (CTK) was developed to address issues in aligning parallel text that involves remote language pairs and a significant amount of noise. To achieve high precision and recall on manually aligned text, CTK assumes a noisy input; that is, it assumes that a sizable percentage of alignments will not be one-to-one and that the number of deletions and insertions will be significant. Furthermore, CTK differs from other lexicon-based approaches in assigning greater weight to less frequent translation pairs. CTK was first evaluated using Chinese-English parallel text but is designed to be used on as many language pairs as possible.
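The idea of giving less frequent translation pairs more weight can be sketched as follows. The text does not give the formula CTK uses, so the logarithmic inverse-frequency weight below is an assumed stand-in. The function names `pair_weights` and `match_score` and the toy romanized tokens are likewise hypothetical, not part of CTK.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source-language token, target-language token)


def pair_weights(observed: Iterable[Pair]) -> Dict[Pair, float]:
    """Weight each translation pair inversely to how often it occurs.

    Assumed formula (not taken from CTK): log(1 + total / count).
    """
    counts = Counter(observed)
    total = sum(counts.values())
    return {pair: math.log(1.0 + total / n) for pair, n in counts.items()}


def match_score(src: List[str], tgt: List[str],
                weights: Dict[Pair, float]) -> float:
    """Sum the weights of lexicon pairs present in a candidate sentence pair."""
    src_set, tgt_set = set(src), set(tgt)
    return sum((w for (s, t), w in weights.items()
                if s in src_set and t in tgt_set), 0.0)


# Toy data: a function-word pair seen 8 times, a named-entity pair seen twice.
observed = [("de", "of")] * 8 + [("beijing", "Beijing")] * 2
weights = pair_weights(observed)

common = match_score(["de"], ["of"], weights)
rare = match_score(["beijing"], ["Beijing"], weights)
print(rare > common)  # True: the rarer pair is stronger alignment evidence
```

Under this weighting, a match on a rare pair such as a proper name counts for more than a match on a pair that appears in nearly every sentence, which is the intuition the CTK description points to.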