Interspeech 2007 Session WeC.O2: Language resources and tools
Type: oral
Date: Wednesday, August 29, 2007
Time: 13:30 – 15:30
Room: Darwin
Chair: Khalid Choukri (ELRA/ELDA)
WeC.O2‑1
13:30
The Buckeye Corpus of Speech: Updates and Enhancements
Eric Fosler-Lussier, The Ohio State University
Laura Dilley, Bowling Green State University
Na'im Tyson, The Ohio State University
Mark Pitt, The Ohio State University
This paper describes recent progress in the development of the Buckeye Corpus of Speech, a phonetically labeled corpus of conversational American English speech, first described in Pitt et al. (2005). With the publication of the second phase of transcription, the corpus has nearly doubled in size since the first release. We briefly give an overview of the corpus, report on additional studies of inter-labeler agreement, and describe a new GUI designed to facilitate searching the annotated speech files.
WeC.O2‑2
13:50
Development of Multimodal Resources for Multilingual Information Retrieval in the Basque Context
Nora Barroso, University of the Basque Country
Aitzol Ezeiza, University of the Basque Country
Nagore Gilisagasti, University of the Basque Country
Karmele López de Ipiña, University of the Basque Country
Alicia López, University of the Basque Country
Jose Manuel López, University of the Basque Country
The development of Automatic Indexing Systems requires appropriate Multimodal Resources (MR) for the design of all the system's components. The project the authors are involved in implements a baseline Multimodal Indexing System for users in the Basque Country, so it is essential to cover all the languages spoken there: Basque, Spanish, and French. Since the specific goal of this work is the development of a system for searching information in audio files, this paper summarizes ongoing efforts to develop resources for a Multilingual Continuous Speech Recognition system that will be made available to researchers interested in this field.
WeC.O2‑3
14:10
Construction of a Phonotactic Dialect Corpus using Semiautomatic Annotation
Reva Schwartz, USSS
Wade Shen, MIT Lincoln Laboratory
Joseph Campbell, MIT Lincoln Laboratory
Shelley Paget, Appen Pty Ltd
Julie Vonwiller, Appen Pty Ltd
Dominique Estival, Appen Pty Ltd
Christopher Cieri, Linguistic Data Consortium
In this paper, we discuss rapid, semiautomatic techniques for annotating detailed phonological phenomena in large corpora. We describe the use of these techniques in the development of a corpus of American English dialects. The resulting annotations and corpora will support both large-scale linguistic dialect analysis and automatic dialect identification. We delineate the semiautomatic annotation process we are currently employing and a set of experiments we ran to validate it. From these experiments, we learned that the use of ASR techniques can significantly increase the throughput and consistency of human annotators.
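The abstract does not specify how the validation experiments were scored. One simple way to quantify the benefit of ASR pre-annotation is the retention rate: the fraction of automatically proposed labels that human annotators keep unchanged. The sketch below is a minimal illustration under that assumption; the phone labels and the one-to-one alignment are hypothetical, not taken from the paper.

    def retention_rate(proposed, corrected):
        """Fraction of ASR-proposed labels the human annotator kept unchanged.

        Assumes the two sequences are already aligned one-to-one; a real
        pipeline would first align them, e.g. with edit-distance alignment.
        """
        assert len(proposed) == len(corrected)
        kept = sum(p == c for p, c in zip(proposed, corrected))
        return kept / len(proposed)

    # Hypothetical ARPAbet-style phone labels for one utterance: the ASR
    # pass proposes labels, and the annotator corrects one of them.
    asr_pass   = ["dh", "ah", "k", "ae", "t", "s", "ae", "t"]
    human_pass = ["dh", "ax", "k", "ae", "t", "s", "ae", "t"]

    print(f"retention = {retention_rate(asr_pass, human_pass):.0%}")  # 88%

A high retention rate means annotators mostly confirm rather than re-enter labels, which is where a throughput gain of the kind reported above would come from.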
WeC.O2‑4
14:30
BECAM Tool - A Semi-automatic Tool for Bootstrapping Emotion Corpora Annotation and Management
Slim Abdennadher, Department of Computer Science, German University in Cairo, Egypt
Mohamed Aly, Department of Computer Science, German University in Cairo, Egypt
Dirk Buehler, Institute of Information Technology, University of Ulm, Germany
Wolfgang Minker, Institute of Information Technology, University of Ulm, Germany
Johannes Pittermann, Institute of Information Technology, University of Ulm, Germany
Corpus annotation is an important step in speech applications where stochastic models must be trained and evaluated, and it extends to multimodal corpora as well. In particular, it is an essential phase in the construction of emotion recognition engines. Large corpora, although essential for building representative knowledge bases, pose a problem for annotators: labeling them is very time-consuming, and managing them becomes increasingly arduous and tedious. In this paper, we propose a semi-automatic tool, called the BECAM tool, that helps corpus annotators manage and annotate large emotion corpora.
WeC.O2‑5
14:50
Resources for New Research Directions in Speaker Recognition: The Mixer 3, 4 and 5 Corpora
Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
Linda Corson, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
David Graff, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
Kevin Walker, Linguistic Data Consortium, University of Pennsylvania, Philadelphia, USA
This paper describes new resources designed to support research in speaker recognition. It begins with a brief overview of collection protocols, motivates the shift from the Switchboard protocol to the Mixer protocol, summarizes yields from the earliest phase of Mixer collection, and then describes the more recent phases, their actual and expected yields, and the lessons learned.
WeC.O2‑6
15:10
Intercoder Reliability in Annotating Complex Disfluencies
Peter Heeman, CSLU, OGI School of Science & Engineering, OHSU
Andy McMillin, Hearing & Speech Institute
J. Scott Yaruss, University of Pittsburgh
In previous work, we presented an annotation scheme that can describe complex disfluencies. In this paper, we first show the prevalence of complex disfluencies and illustrate the types of distinctions that our scheme allows. Second, we present an annotation tool that allows the scheme to be applied easily. Third, we present the results of a reliability study on annotating complex disfluencies with the annotation tool. We find that subjects, even with a minimal amount of training, achieve high intercoder agreement. This work will help pave the way for speech recognizers to model the structure of disfluencies precisely, both for understanding the conversational speech of non-stutterers and for assessing stuttering severity.
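The abstract does not name the agreement statistic used. A standard chance-corrected measure in annotation reliability studies is Cohen's kappa; the sketch below is a minimal pure-Python version, with made-up disfluency category labels for illustration.

    from collections import Counter

    def cohens_kappa(coder_a, coder_b):
        """Chance-corrected agreement between two coders over the same items."""
        assert len(coder_a) == len(coder_b)
        n = len(coder_a)
        observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
        freq_a, freq_b = Counter(coder_a), Counter(coder_b)
        # Agreement expected if both coders labeled independently at their own rates.
        expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
        return (observed - expected) / (1 - expected)

    # Hypothetical labels from two coders for ten disfluency regions.
    coder1 = ["repetition", "revision", "repetition", "filled-pause", "repetition",
              "revision", "filled-pause", "repetition", "revision", "repetition"]
    coder2 = ["repetition", "revision", "revision", "filled-pause", "repetition",
              "revision", "filled-pause", "repetition", "repetition", "repetition"]

    print(f"kappa = {cohens_kappa(coder1, coder2):.2f}")  # kappa = 0.68

Values near 1 indicate agreement well above chance, while 0 indicates chance-level agreement.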