Nayak, Shekhar and Bhati, Saurabhchand and Kodukula, Sri Rama Murty
(2019)
Zero Resource Speaking Rate Estimation from Change Point Detection of Syllable-like Units.
In: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 12-17 May 2019, Brighton, United Kingdom.
Full text not available from this repository.
Abstract
Speaking rate is an important attribute of the speech signal and plays a crucial role in the performance of automatic speech processing systems. In this paper, we propose to estimate the speaking rate by segmenting the speech into syllable-like units using end point detection algorithms that require neither training nor fine-tuning. There are also no predefined constraints on the expected number of syllabic segments. The acoustic subword units are obtained from the speech signal alone, so the speaking rate is estimated without transcriptions or phonetic knowledge of the speech data. A recent theta-rate oscillator based syllabification algorithm is also employed for speaking rate estimation. Performance is evaluated on the TIMIT corpus and on spontaneous speech from the Switchboard corpus. The correlation results are comparable to those of recent algorithms that are trained on a specific training set and/or make use of the available transcriptions.
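As a rough illustration of the quantities the abstract refers to, the sketch below computes a speaking rate as the number of syllable-like units per second and scores estimates against reference rates with a Pearson correlation. It is not the authors' segmentation method: the boundary times are assumed to come from some external segmenter, and the function names `speaking_rate` and `pearson` are hypothetical.

```python
import math


def speaking_rate(boundaries, duration):
    """Syllable-like units per second.

    Assumption: `boundaries` holds the segment boundary times in seconds,
    so N + 1 boundaries delimit N syllable-like units, and `duration` is
    the utterance length in seconds.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    n_units = max(len(boundaries) - 1, 0)
    return n_units / duration


def pearson(x, y):
    """Pearson correlation between estimated and reference rates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

In this sketch, an utterance with five boundaries over one second yields a rate of four units per second; the correlation would then be taken over the per-utterance estimates and the reference rates of a test set.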