Attention-based phonetic convolutional recurrent neural networks for language identification

Gundluru, R. and Venkatesh, V. and Murty, K S. (2021) Attention-based phonetic convolutional recurrent neural networks for language identification. In: 2021 National Conference on Communications (NCC), 27 July 2021 through 30 July 2021, Virtual, Kanpur.

Full text not available from this repository.

Abstract

Language identification is the task of identifying the language of a spoken utterance. Deep neural models such as LSTM-RNNs with an attention mechanism have shown great potential in language identification. Language cues such as phonemes and their co-occurrences are important for distinguishing between languages, but systems based on acoustic features do not utilize phonetic information. Consequently, LSTM-RNN models trained on phonetic features have shown improvement over those trained on raw acoustic features. These methods, however, require a large amount of transcribed speech data to train the phoneme discriminator, and obtaining transcribed speech data for low-resource Indian languages is difficult. To alleviate this issue, we investigate the use of phonetic discriminators pre-trained on a rich-resource language to extract phonetic features for low-resource target languages. We then train an attention-based CRNN end-to-end utterance-level language identification (LID) system on these discriminative phonetic features. We used open-source LibriSpeech English data to train the phoneme discriminator with the sequence-discriminative lattice-free maximum mutual information (LF-MMI) objective. We achieved an overall 20% absolute improvement over the baseline CRNN model trained on acoustic features. We also investigate the significance of utterance duration in LID.
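
The attention step that turns frame-level phonetic features into a single utterance-level decision can be illustrated with a minimal sketch. This is not the authors' architecture: the convolutional and recurrent layers are omitted, the features and weights are random stand-ins, and all names and dimensions (`attention_pool`, `w`, `v`, `u`, 200 frames, 40 dimensions, 5 languages) are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(frames, w, v):
    """Pool frame-level features (T, D) into one utterance vector (D,).

    w: (D, H) projection and v: (H,) scoring vector are hypothetical
    learned parameters; here they are supplied by the caller.
    """
    scores = np.tanh(frames @ w) @ v   # one scalar score per frame, shape (T,)
    weights = softmax(scores)          # attention weights over frames, sum to 1
    return weights @ frames, weights   # weighted average of frames, and weights

# Stand-in data: random values in place of real phonetic features.
rng = np.random.default_rng(0)
T, D, H, n_langs = 200, 40, 16, 5      # assumed sizes, not from the paper
frames = rng.standard_normal((T, D))
w = rng.standard_normal((D, H)) * 0.1
v = rng.standard_normal(H) * 0.1
u = rng.standard_normal((D, n_langs)) * 0.1  # hypothetical output layer

utt, weights = attention_pool(frames, w, v)
posteriors = softmax(utt @ u)          # utterance-level language posteriors
```

Because the weights are normalized over however many frames are present, the same pooling applies to utterances of any duration, which is one reason attention is a natural fit for utterance-level LID.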

IITH Creators: Murty, K S. (ORCiD: https://orcid.org/0000-0002-6355-5287)
Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Acoustic features, Attention mechanisms, Bottleneck features, Co-occurrence, Feature-based, Language identification, Low resource languages, Neural modelling, Phonetic features, Speech data
Subjects: Electrical Engineering
Divisions: Department of Electrical Engineering
Depositing User: Mrs Haseena VKKM
Date Deposited: 15 Nov 2021 06:44
Last Modified: 15 Nov 2021 06:44
URI: http://raiithold.iith.ac.in/id/eprint/8973
Publisher URL: https://ieeexplore.ieee.org/document/9530030/