Interspeech 2007
August 27-31, 2007

Antwerp, Belgium

Interspeech 2007 Session FrB.O1: Systems for spoken language translation I

Type oral
Date Friday, August 31, 2007
Time 10:00 – 12:00
Room Elisabeth
Chair Ralf Schlueter (RWTH Aachen University, Lehrstuhl Informatik 6 - Computer Science Department)

Improved Machine Translation of Speech-to-Text Outputs
Daniel Déchelotte, LIMSI/CNRS
Holger Schwenk, LIMSI/CNRS
Gilles Adda, LIMSI/CNRS
Jean-Luc Gauvain, LIMSI/CNRS

Combining automatic speech recognition and machine translation is a frequent theme in current research programs. This paper first presents several pre-processing steps that limit the performance degradation observed when translating an automatic transcription (as opposed to a manual transcription). Indeed, automatically transcribed speech often differs significantly from the machine translation system's training material with respect to casing, punctuation and word normalization. The proposed system outperforms the best system at the 2007 TC-STAR evaluation by almost 2 BLEU points. The paper then attempts to determine a criterion characterizing how well an STT system's output can be translated, but the current experiments could only confirm that lower word error rates lead to better translations.
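The kind of pre-processing the abstract describes can be pictured with a toy sketch; the rules and the function name below are illustrative assumptions, not the paper's actual pipeline:

```python
import re

def normalize_asr_output(hypothesis):
    """Toy normalization of an ASR hypothesis before translation.

    Mimics two of the mismatches named in the abstract: casing (the MT
    training data is assumed lowercased here) and punctuation tokenization.
    """
    text = hypothesis.lower()
    # split punctuation off as standalone tokens, as MT tokenizers do
    text = re.sub(r"([.,!?;:])", r" \1 ", text)
    # collapse any whitespace runs introduced above
    return re.sub(r"\s+", " ", text).strip()

print(normalize_asr_output("Hello, World!"))  # hello , world !
```

A real system would also restore punctuation that ASR output lacks and expand or normalize numbers; this sketch only shows the casing/tokenization side of the mismatch.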
Improvements in Machine Translation for English/Iraqi Speech Translation
Shirin Saleem, BBN Technologies
Krishna Subramanian, BBN Technologies
Rohit Prasad, BBN Technologies
David Stallard, BBN Technologies
Chia-lin Kao, BBN Technologies
Prem Natarajan, BBN Technologies
Raid Suleiman, BBN Technologies

In this paper, we describe techniques for improving machine translation quality in the context of speech-to-speech translation for significantly different language pairs. Specifically, we explore three broad approaches for improving translation from English to Iraqi and vice versa. First, we investigate normalization techniques that address the differences between the spoken and written forms of both languages. Second, we incorporate additional knowledge sources into the translation process, such as a bilingual lexicon and named entity detection. Third, we exploit the rich morphological structure of Iraqi Arabic using two different approaches: the first decomposes words in Iraqi Arabic, whereas the second, novel approach inflects English by combining key phrases into words using the minimum description length criterion. Significant gains in accuracy are observed when translating both from text and from speech recognition output.
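The word-decomposition idea can be illustrated with a toy clitic splitter; the prefix list, romanization, and stem-length threshold are invented for illustration and are not BBN's actual morphological analysis:

```python
# illustrative Arabic-like clitic prefixes (romanized), e.g. wa- "and", al- "the"
PREFIXES = ("wa", "al", "bi", "li")

def decompose(word):
    """Greedily strip one known prefix at a time, keeping stems of length >= 3.

    Splitting clitics off reduces vocabulary size and sparsity on the
    morphologically rich side of the language pair.
    """
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            return [p + "+"] + decompose(word[len(p):])
    return [word]

print(decompose("albayt"))  # ['al+', 'bayt']
```

The inverse direction described in the abstract (gluing frequent English key phrases into single tokens) would use a criterion like minimum description length to decide which merges pay for themselves.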
Improving Speech Translation with Automatic Boundary Prediction
Evgeny Matusov, RWTH Aachen
Dustin Hillard, University of Washington
Mathew Magimai-Doss, ICSI
Dilek Hakkani-Tur, ICSI
Mari Ostendorf, University of Washington
Hermann Ney, RWTH Aachen

This paper investigates the influence of automatic sentence boundary and sub-sentence punctuation prediction on machine translation (MT) of automatically recognized speech. We use prosodic and lexical cues to determine sentence boundaries, and successfully combine two complementary approaches to sentence boundary prediction. We also introduce a new feature for segmentation prediction that directly considers the assumptions of the phrase translation model. In addition, we show how automatically predicted commas can be used to constrain reordering in MT search. We evaluate the presented methods using a state-of-the-art phrase-based statistical MT system on two large vocabulary tasks. We find that carefully optimizing the segmentation parameters directly for translation quality improves the translation results, compared to optimizing the predicted source-language sentence boundaries independently for segmentation quality.
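Combining a prosodic cue (pause length) with a lexical cue (a language model's boundary posterior) into a segmentation decision can be sketched minimally as follows; the weights, the 500 ms saturation point, and the threshold are invented for illustration, not the paper's trained values:

```python
def boundary_score(pause_sec, lm_boundary_prob, w_prosody=0.6, w_lexical=0.4):
    """Linear combination of two boundary cues, each mapped into [0, 1]."""
    prosody = min(pause_sec / 0.5, 1.0)  # saturate at a 500 ms pause
    return w_prosody * prosody + w_lexical * lm_boundary_prob

def segment(tokens, cues, threshold=0.5):
    """Split tokens into sentences wherever the combined score crosses threshold.

    cues: one (pause_sec, lm_boundary_prob) pair per token, measured after it.
    """
    sentences, current = [], []
    for tok, (pause, lm_prob) in zip(tokens, cues):
        current.append(tok)
        if boundary_score(pause, lm_prob) >= threshold:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    return sentences

print(segment(["ok", "thanks", "bye"],
              [(0.05, 0.1), (0.6, 0.9), (0.05, 0.1)]))
# [['ok', 'thanks'], ['bye']]
```

The paper's point about optimizing for translation quality would correspond to tuning the weights and threshold against BLEU on held-out data rather than against segmentation F-score.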
Punctuating Confusion Networks for Speech Translation
Roldano Cattoni, Fondazione Bruno Kessler - IRST, Trento (I)
Nicola Bertoldi, Fondazione Bruno Kessler - IRST, Trento (I)
Marcello Federico, Fondazione Bruno Kessler - IRST, Trento (I)

Translating from confusion networks (CNs) has proven more effective than translating from single best hypotheses. Moreover, it is widely accepted that good punctuation marks in the input can improve translation quality. At present, however, ASR systems do not generate punctuation marks in their word graphs, so CNs lack punctuation. In this paper we investigate the problem of adding punctuation marks to confusion networks. We explore different punctuation strategies and show that the use of multiple punctuation hypotheses improves translation quality on a large-vocabulary speech translation task.
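Representing a CN as a list of word slots, adding punctuation amounts to inserting a slot whose alternatives include an empty arc; the posteriors below are made up, and the data structure is a simplification of real lattice formats:

```python
EPS = "*EPS*"  # empty arc: "no punctuation here"

# toy confusion network: one slot per position, each a list of
# (word, posterior) alternatives
cn = [
    [("hello", 0.8), ("hallo", 0.2)],
    [("world", 1.0)],
]

# hypothetical punctuation-model posteriors after the last word;
# keeping several alternatives (including the empty arc) lets the
# MT decoder pick the punctuation that translates best
cn.append([(".", 0.6), (EPS, 0.3), (",", 0.1)])

def best_path(cn):
    """1-best word sequence through the CN, dropping empty arcs."""
    words = [max(slot, key=lambda arc: arc[1])[0] for slot in cn]
    return [w for w in words if w != EPS]

print(best_path(cn))  # ['hello', 'world', '.']
```

The abstract's finding is that passing the whole punctuated slot to the decoder, rather than just this 1-best choice, is what yields the translation gains.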
Integration of ASR and Machine Translation Models in a Document Translation Task
Aarthi Reddy, McGill University, Canada
Richard Rose, McGill University, Canada
Alain Desiltes, National Research Council, Canada

This paper is concerned with the problem of machine-aided human language translation. It addresses a scenario in which a human translator dictates the spoken-language translation of a source-language text into an automatic speech dictation system, while the same source text is presented to a statistical machine translation (SMT) system. The techniques presented here model the optimum target-language word string produced by the dictation system using the combined SMT and ASR statistical models. They were evaluated on a speech corpus of human translators dictating English translations of French text taken from transcriptions of the proceedings of the Canadian House of Commons. The combined ASR/SMT modeling techniques reduced ASR WER by 26.6% relative to an ASR system that did not incorporate SMT knowledge.
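One simple way to combine the two knowledge sources is log-linear rescoring of the dictation system's N-best list with SMT scores; the interpolation weight and all scores below are illustrative, not the paper's actual formulation:

```python
def combined_score(asr_logprob, smt_logprob, lam=0.5):
    """Log-linear interpolation of ASR and SMT model scores."""
    return lam * asr_logprob + (1.0 - lam) * smt_logprob

def rescore(nbest, lam=0.5):
    """Pick the hypothesis maximizing the combined score.

    nbest: list of (hypothesis, asr_logprob, smt_logprob) triples.
    """
    return max(nbest, key=lambda h: combined_score(h[1], h[2], lam))[0]

# hypothetical 2-best list: the SMT model, which has seen the source
# text, can overrule an acoustically attractive but wrong hypothesis
nbest = [
    ("the house rose at six", -2.0, -9.0),   # ASR favourite
    ("the house arose at six", -3.0, -4.0),  # SMT favourite
]
print(rescore(nbest))  # the house arose at six
```

The weight `lam` would normally be tuned on held-out dictation data; setting it to 1.0 recovers the SMT-free baseline.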
Bilingual LSA-based Translation Lexicon Adaptation for Spoken Language Translation
Yik-Cheung Tam, Carnegie Mellon University
Tanja Schultz, Carnegie Mellon University

We present a bilingual LSA (bLSA) framework for translation lexicon adaptation. The idea is to apply marginal adaptation to a translation lexicon so that the lexicon marginals match in-domain marginals. In speech translation, the bLSA method transfers topic distributions from the source to the target side, so that the translation lexicon can be adapted before translation based on the source document. We evaluated the proposed approach on our Mandarin RT04 spoken language translation system. Results showed that the conditional likelihood of the test sentence pairs improved significantly with an adapted translation lexicon compared to an unadapted baseline, and the proposed approach also improved BLEU in SMT. When the target-side LM and the translation lexicon were adapted and applied simultaneously for SMT decoding, the gain in BLEU was more than additive compared to applying each adapted model individually.
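Marginal adaptation of a lexicon can be sketched as scaling each translation probability by the ratio of in-domain to background target marginals and renormalizing; the exponent `beta` and all probabilities below are illustrative, not the paper's bLSA estimates:

```python
def adapt_lexicon(p_t_given_s, p_indomain, p_background, beta=1.0):
    """Scale p(t|s) by (p_in(t)/p_bg(t))**beta, then renormalize over t."""
    scaled = {t: p * (p_indomain[t] / p_background[t]) ** beta
              for t, p in p_t_given_s.items()}
    z = sum(scaled.values())
    return {t: v / z for t, v in scaled.items()}

# translations of one source word, adapted toward a hypothetical
# finance-leaning in-domain marginal inferred from the source document
lexicon = {"bank": 0.5, "shore": 0.5}
adapted = adapt_lexicon(lexicon,
                        p_indomain={"bank": 0.4, "shore": 0.1},
                        p_background={"bank": 0.2, "shore": 0.2})
print(adapted)  # {'bank': 0.8, 'shore': 0.2}
```

In the bLSA setting, `p_indomain` would come from the topic distribution inferred on the source-side document and transferred to the target side before decoding.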
