Neural Machine Transliteration Of Indian Languages

Singh, Aryan and Bansal, Jhalak (2021) Neural Machine Transliteration Of Indian Languages. In: 4th International Conference on Computing and Communications Technologies, ICCCT 2021, 16 December 2021 through 17 December 2021, Chennai.

ICCCT_2021.pdf - Published Version
Restricted to Registered users only

Abstract

Transliteration is the task of converting text in a language written in a foreign script to its written form in the native script. For transliteration, it is important to understand not only the written form of the language but also the sounds associated with its written words. Hindi and Punjabi are two of the most widely spoken languages in the world, with a combined base of around 500 million speakers. While English is now widely understood, regional languages remain the mainstay of spoken and written conversation. Most modern devices still come with English keyboards, which makes it very difficult to write in regional languages. This research aims to develop a scalable and universal architecture that gives state-of-the-art results for the transliteration of Hindi and Punjabi. It explores different heuristics in sequence-to-sequence modelling, attention, and transformer networks to determine the best-suited architecture for transliteration of Indian languages. Of these variants, a character/grapheme-level model with a bi-directional encoder and an auto-regressive decoder proved to be the best-performing architecture and gave state-of-the-art results on both the transliteration and back-transliteration tasks, with BLEU scores of 0.88 on Punjabi and 0.97 on Hindi. © 2021 IEEE.
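
The abstract reports BLEU scores but does not say how they were computed. As an illustration only, the following is a minimal sketch of a sentence-level BLEU score over characters, assuming a single reference, n-grams up to length 4, uniform weights, and no smoothing. The function names `ngram_counts` and `char_bleu` are hypothetical, and this is not the authors' evaluation code.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def char_bleu(reference, hypothesis, max_n=4):
    """Sentence-level BLEU over characters (single reference, no smoothing).

    Assumption: each Unicode code point is one token. The abstract mentions
    character/grapheme-level modelling; true grapheme clusters (for example a
    Devanagari consonant plus its vowel sign) would need a proper segmenter.
    """
    ref, hyp = list(reference), list(hypothesis)
    if not hyp:
        return 0.0

    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Modified precision: clip each hypothesis n-gram by its reference count.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precision += math.log(overlap / total) / max_n

    # Brevity penalty: penalise hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the paper.
    print(char_bleu("transliteration", "transliteration"))
    print(char_bleu("transliteration", "transliteratian"))
```

Because no smoothing is applied, any hypothesis shorter than four characters, or one sharing no 4-gram with the reference, scores 0. Published evaluations often use corpus-level BLEU or a smoothed variant instead.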

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Auto-regressive; Back-transliteration; Bi-directional; Indian languages; Machine transliteration; Sequence models; Spoken languages; State of the art; Written word
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: . LibTrainee 2021
Date Deposited: 10 Sep 2022 05:08
Last Modified: 10 Sep 2022 05:08
URI: http://raiithold.iith.ac.in/id/eprint/10520
Publisher URL: http://doi.org/10.1109/ICCCT53315.2021.9711806