Reddy, B Naresh
(2015)
Spoken Term Detection on Low Resource Languages.
Masters thesis, Indian Institute of Technology Hyderabad.
Abstract
Developing efficient speech processing systems for low-resource languages is an immensely challenging problem. One potentially effective way to address the lack of resources for a particular language is to employ data from multiple languages when building speech processing sub-systems. This thesis investigates possible methodologies for Spoken Term Detection (STD) in low-resource Indian languages. The task of STD is to search for a query keyword, given in text form, in a considerably large speech database. This is usually done by matching templates of feature vectors representing the sequence of phonemes from the query word against the continuous speech from the database. The features typically used to represent speech signals in most speech processing systems are the mel-frequency cepstral coefficients (MFCC). Because speech is a very complex signal, carrying information about the textual message, speaker identity, the emotional and health state of the speaker, and so on, the MFCC features derived from it also contain information about all these factors. For efficient template matching, we need to neutralize the speaker variability in the features and stabilize them so that they represent the speech variability alone.
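The abstract does not say which matching algorithm the thesis uses. As an illustration only, the sketch below assumes subsequence dynamic time warping (DTW), a common choice for aligning a short query template against a longer utterance. The function names `euclidean` and `subsequence_dtw` are hypothetical, and the one-dimensional toy vectors stand in for real per-frame features such as MFCCs.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(query, utterance):
    """Align `query` against the best-matching region of `utterance`.

    Assumption: both arguments are lists of per-frame feature vectors.
    Returns (cost normalized by query length, index of the last matched
    utterance frame). A lower cost means a closer match.
    """
    n, m = len(query), len(utterance)
    if n == 0 or m == 0:
        raise ValueError("query and utterance must be non-empty")

    inf = float("inf")
    # Row 0 is all zeros so the match may start at any utterance frame.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = euclidean(query[i - 1], utterance[j - 1])
            # Standard DTW step: vertical, horizontal, or diagonal move.
            curr[j] = d + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr

    # The match may also end at any utterance frame.
    end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[end] / n, end - 1


# Toy data: the query pattern occurs at utterance frames 2 to 4.
query = [[0.0], [1.0], [2.0]]
utterance = [[5.0], [5.0], [0.0], [1.0], [2.0], [5.0]]
cost, end_frame = subsequence_dtw(query, utterance)
```

In a detection setting, a term would be reported wherever the normalized cost falls below a threshold tuned on held-out data. With raw MFCCs, that cost also reflects speaker differences, which is the variability the abstract says must be neutralized.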