Interspeech 2007 Session FrB.P2a: Story segmentation
Friday, August 31, 2007
10:00 – 12:00
Chair: Dilek Hakkani-Tur (ICSI)
Modeling the Statistical Behavior of Lexical Chains to Capture Word Cohesiveness for Automatic Story Segmentation
Shing-kai Chan, Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Lei Xie, Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Helen Mei-ling Meng, Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
We present a mathematically rigorous framework for modeling the statistical behavior of lexical chains for automatic story segmentation of broadcast news audio. Lexical chains were originally proposed to connect related terms within a story, as an embodiment of lexical cohesion. The vocabulary within a story tends to be cohesive, while a change in the vocabulary distribution tends to signify a topic shift that occurs across a story boundary. Previous work focused on the concept and nature of lexical chains but performed story segmentation based on arbitrary thresholding. This work proposes the use of the log-normal distribution to capture the statistical behavior of lexical chains, together with data-driven parameter selection for lexical chain formation. Experiments on the TDT-2 Mandarin Corpus show that the proposed statistical model leads to better story segmentation, with the F1-measure increasing from 0.468 to 0.641.
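The abstract does not say which chain-related quantity is modeled, so the following is only a minimal sketch of what fitting a log-normal distribution to positive-valued measurements can look like. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def fit_lognormal(samples):
    """Maximum-likelihood fit of a log-normal: mean and std of log(x)."""
    logs = [math.log(x) for x in samples]  # every sample must be > 0
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)  # MLE divides by n
    return mu, math.sqrt(var)

def lognormal_pdf(x, mu, sigma):
    """Density of the log-normal distribution at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

# Hypothetical positive-valued measurements (illustrative, not paper data).
samples = [1.0, math.e, math.e ** 2]
mu, sigma = fit_lognormal(samples)
density_at_e = lognormal_pdf(math.e, mu, sigma)
```

Once fitted, such a density could replace a hand-picked threshold with a likelihood-based decision, which is the general contrast the abstract draws with earlier work.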
Prosodic Features and Feature Selection for Multi-Lingual Sentence Segmentation
James Fung, ICSI
Dilek Hakkani-Tur, ICSI
Mathew Magimai, ICSI
Liz Shriberg, ICSI
Sebastien Cuendet, ICSI
Nikki Mirghafori, ICSI
In this paper, we perform a cross-linguistic study of prosodic features in sentence segmentation using two different feature selection approaches: a forward search wrapper and feature filtering. Experiments in Arabic, English, and Mandarin show that prosodic features make significant contributions in all three languages. Feature selection results indicate that feature relevancy can vary greatly depending on the target language, and therefore the optimal feature subset varies considerably between languages. We observe patterns in the feature selection and in the affinity of the different languages toward certain feature types, which give us insight into future feature selection and feature design.
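A forward search wrapper of the kind named above can be sketched as a greedy loop that adds one feature at a time while an evaluation score keeps improving. This is a generic illustration, not the authors' implementation: the feature names, the gain values and the scoring function are all hypothetical, and in a real wrapper the score would come from training and evaluating a classifier on held-out data.

```python
def forward_select(features, score):
    """Greedy forward search: repeatedly add the feature that most improves score."""
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates, key=lambda t: t[0])
        if top_score <= best:
            break  # no extension helps, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Hypothetical stand-in for classifier accuracy: per-feature gain minus a size penalty.
GAINS = {"pitch": 0.3, "pause": 0.5, "energy": 0.0}

def toy_score(subset):
    return sum(GAINS[f] for f in subset) - 0.05 * len(subset)

selected, best = forward_select(["pitch", "pause", "energy"], toy_score)
```

With this toy scorer the search picks "pause" first, then "pitch", and stops before adding "energy" because that extension lowers the score.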
Varying Input Segmentation for Story Boundary Detection in English, Arabic and Mandarin Broadcast News
Andrew Rosenberg, Columbia University
Mehrbod Sharifi, NA
Julia Hirschberg, Columbia University
Story segmentation of news broadcasts has been shown to improve the accuracy of subsequent processes such as question answering and information retrieval. In previous work, a decision tree was trained on automatically extracted lexical and acoustic features to predict story boundaries, using hypothesized sentence boundaries to define potential story boundaries. In this paper, we empirically evaluate several alternatives to this choice of input segmentation on three languages: English, Mandarin and Arabic. Our results suggest that the best performance can be achieved by using 250 ms pause-based segmentation or sentence boundaries determined using a very low confidence score threshold.
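Pause-based input segmentation with a 250 ms threshold can be illustrated as follows. This is a sketch under assumptions: the word-timing tuple format, the function name and the choice of "greater than or equal to" for the threshold comparison are hypothetical and are not specified in the abstract.

```python
def pause_segments(words, min_pause=0.25):
    """Group (token, start_sec, end_sec) tuples into segments split at long pauses."""
    segments = []
    current = []
    prev_end = None
    for token, start, end in words:
        # Start a new segment when the silence since the previous word is long enough.
        if current and start - prev_end >= min_pause:
            segments.append(current)
            current = []
        current.append(token)
        prev_end = end
    if current:
        segments.append(current)
    return segments

# Hypothetical word timings in seconds (illustrative only).
words = [
    ("good", 0.00, 0.20),
    ("evening", 0.25, 0.70),
    ("tonight", 1.10, 1.50),
    ("we", 1.55, 1.65),
]
segments = pause_segments(words)
```

Here the only gap of at least 250 ms falls between "evening" and "tonight", so the words are split into two segments; the end of each segment would then serve as a candidate story boundary for the classifier.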
Speaker Role Based Structural Classification of Broadcast News Stories
BalaKrishna Kolluru, University of Sheffield
Yoshihiko Gotoh, University of Sheffield
This paper is concerned with automatic classification of broadcast news stories based on speaker roles such as anchor, reporter and others. Story classification is a first step for many related tasks such as browsing, indexing, and summarising news broadcasts. We use broadcast news audio and its automatic speech recogniser transcripts to implement the classification system, which builds on speaker segmentation and identification, story segmentation and named entity identification. The system achieved 92% accuracy when individual stories were provided manually. Performance declined to 67% and 51% on precision- and recall-related measures, respectively, when the system was combined with automatic story boundary segmentation.