Interspeech 2007 Session FrC.O2: Phonetics

Type oral
Date Friday, August 31, 2007
Time 13:30 – 15:30
Room Darwin
Chair Jonathan Harrington (Institute of Phonetics and Speech Processing, Munich)

/r/-Realisation in Swiss German and Austrian German
Christiane Ulbrich, University of Ulster

Phoneticians generally consider rhotics to be phonetically heterogeneous. Such sounds are usually classified as rhotics because of their similar phonological behaviour and their diachronic and synchronic alternation. Ladefoged and Maddieson [112, p. 245] state that the only true unity of rhotics “seems to rest mostly on the historical connections between these subgroups and on the choice of the letter ‘r’ to represent them all”. Although some generalizations hold for all r-sounds, including phonotactic properties and synchronic and diachronic alternations, considerable segmental variation has been described, especially in the realisation of /r/ in German. This has led some to conclude that a positive description of the German phoneme /r/ does not make sense.
Singleton and Geminate Stops in Finnish – Acoustic Correlates
Christopher S. Doty, University of Oregon
Kaori Idemaru, Carnegie Mellon University
Susan G. Guion, University of Oregon

The present study examined a variety of acoustic correlates of the stop length contrast in Finnish beyond the duration of the consonant itself. Of interest were the durations of surrounding vowels, the duration of voice onset time (VOT), and the amplitudes of the release burst and the following vowel. Results indicated that for geminate stops, VOT is shorter and the amplitudes of both the following vowel and the release burst are higher than for singleton stops. Further, long vowels preceding geminate stops are shorter than those preceding singleton stops, although no difference was found for short vowels. Post-consonantal vowel duration does not vary as a function of consonant length, but is affected by the length of the first-syllable vowel. These results agree with data from other languages in some respects, but not in others. It is proposed that this discrepancy arises from the fact that Finnish, despite being stress- or syllable-timed, also has mora-like length features.
Segment Deletion in Spontaneous Speech: A Corpus Study using Mixed Effects Models with Crossed Random Effects
Christophe Van Bael, Centre for Language and Speech Technology, Radboud University Nijmegen, the Netherlands
Harald Baayen, Centre for Language and Speech Technology, Radboud University Nijmegen, the Netherlands
Helmer Strik, Centre for Language and Speech Technology, Radboud University Nijmegen, the Netherlands

We studied the frequencies of phone and syllable deletions in spontaneous Dutch, and the extent to which such deletions are influenced by the various linguistic and sociolinguistic factors represented in the transcriptions, word segmentations and metadata of the Spoken Dutch Corpus. In addition to providing insight into the frequencies of phone and syllable deletions and the factors influencing them, our study illustrates the new opportunities for analysing rich and therefore complex corpus data offered by a recently developed statistical modelling technique: modelling the effects of random factors as crossed instead of nested with generalised linear mixed effects models. We observed average phone and syllable deletion rates of 7.57% and 5.46% respectively. 20.32% of the words had at least one phone missing, and 6.89% of the words had at least one syllable deleted. The mixed effects models for phone and syllable deletion had several effects in common, which implies that both types of deletion are to a large extent influenced by the same factors. The strongest factors across both models were lexical stress, word duration and the segmental context of the syllable onset of the following word.
Categorical Perception of Cantonese Tones in Context: a Cross-Linguistic Study
Hongying Zheng, Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
Peter WM. Tsang, Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China
William S-Y. Wang, Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China

When human beings perceive speech sounds, they categorize the sounds into one or another phonemic category. The mechanism responsible for this phenomenon remains unknown. Is it influenced by listeners’ long-term language experience, or does it reflect some general psychoacoustic aspects of processing? A previous study shows that Cantonese level tones are perceived continuously in citation forms [1]. However, they are perceived categorically in the presence of context [6]. The work in [6] does not provide enough evidence to support the hypothesis of long-term language influence. In this study, we compare Mandarin and Cantonese speakers’ perception of Cantonese level tones in context, as well as Cantonese speakers’ perception of speech and nonspeech (analogous complex harmonics) in the same context. Results show evidence of categorical perception of speech stimuli for Cantonese speakers only. These findings support the hypothesis of long-term language influence.
A Corpus Study of the 3rd Tone Sandhi in Standard Chinese
Yiya Chen, Department of Linguistics, Radboud University Nijmegen
Jiahong Yuan, Department of Linguistics, University of Pennsylvania

In Standard Chinese, a Low tone (Tone 3) is often realized with a rising F0 contour before another Low tone, a phenomenon known as the 3rd tone Sandhi. This study investigates the acoustic characteristics of the 3rd tone Sandhi in Standard Chinese using a large telephone conversation speech corpus. Sandhi Rising was found to differ from the underlying Rising tone (Tone 2) in bi-syllabic words in two measures: the magnitude of the F0 rise and the time span of the F0 rise. We also found different effects of word frequency on Sandhi Rising and the underlying Rising tones. Finally, for tri-syllabic constituents with Low tones only, constituent boundary showed interesting but puzzling effects on the 3rd tone Sandhi.
Age-related changes in fundamental frequency and formants: a longitudinal study of four speakers
Jonathan Harrington, Institute of Phonetics and Speech Processing (IPS), University of Munich, Munich, Germany
Sallyanne Palethorpe, Macquarie Centre for Cognitive Science (MACCS), Macquarie University, Sydney, Australia
Catherine Watson, Department of Electrical and Computing Engineering, The University of Auckland, New Zealand

The study is concerned with a longitudinal acoustic analysis of two sets of recordings from the same four speakers over an interval of between 29 and 50 years. The aim was to determine whether there is any evidence for age-related acoustic changes. Our analysis showed that the same speakers have a lower f0, a lower F1, a marginally lower F2, and an unchanging or sometimes higher F3 in their later recordings. There is some suggestion from these data that the change in F1-f0 in Bark from earlier to later recordings is proportional to the change in F3-F2 in Bark. This suggests that there is a shift in the speaker space roughly along a diagonal in the phonetic height x backness plane with increasing age.
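
As a rough illustration of the Bark-scaled differences mentioned in the abstract above, the following Python sketch converts frequencies from Hz to Bark and computes the change in F1-f0 and F3-F2 between an earlier and a later recording. The abstract does not say which Hz-to-Bark conversion the authors used; the Traunmüller (1990) approximation is assumed here. The function names and all frequency values are hypothetical and chosen only for illustration; they are not data from the study.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert Hz to Bark using the Traunmüller (1990) approximation.

    Assumption: the abstract does not name the conversion it used.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_difference(upper_hz: float, lower_hz: float) -> float:
    """Distance in Bark between two frequencies given in Hz."""
    return hz_to_bark(upper_hz) - hz_to_bark(lower_hz)


def change(early: dict, late: dict, upper: str, lower: str) -> float:
    """Change (late minus early) in the Bark difference upper - lower."""
    return (bark_difference(late[upper], late[lower])
            - bark_difference(early[upper], early[lower]))


# Hypothetical mean values in Hz for one speaker (not data from the study).
early = {"f0": 120.0, "F1": 500.0, "F2": 1500.0, "F3": 2500.0}
late = {"f0": 100.0, "F1": 450.0, "F2": 1450.0, "F3": 2500.0}

delta_f1_f0 = change(early, late, "F1", "f0")  # height-related change
delta_f3_f2 = change(early, late, "F3", "F2")  # backness-related change

print(f"change in F1-f0 (Bark): {delta_f1_f0:.3f}")
print(f"change in F3-F2 (Bark): {delta_f3_f2:.3f}")
```

Computing the two changes per speaker and comparing them across speakers is one way to examine the kind of proportionality the abstract describes; the statistical procedure the authors actually applied is not given in the abstract.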

