Subtitle Synthesis using Inter and Intra utterance Prosodic Alignment for Automatic Dubbing

Pamisetty, Giridhar and Kodukula, Sri Rama Murty (2022) Subtitle Synthesis using Inter and Intra utterance Prosodic Alignment for Automatic Dubbing. In: 27th National Conference on Communications, NCC 2022, 24 May 2022 through 27 May 2022, Virtual, Online.

Abstract

Automatic dubbing, or machine dubbing, is the process of replacing the speech in a source video with speech in the desired language, synthesized using a text-to-speech (TTS) system. The synthesized speech should align with the events in the source video to give a realistic experience. Most existing prosodic alignment processes operate on the synthesized speech by controlling the speaking rate. In this paper, we propose subtitle synthesis, a unified approach to prosodic alignment that operates at the feature level. Modifying the prosodic parameters at the feature level does not degrade the naturalness of the synthesized speech. We use both inter- and intra-utterance alignment in the prosodic alignment process. Performing alignment at the feature level, and so synchronizing the synthesized speech with the source speech, requires control over phoneme durations. We therefore synthesize the speech with the Prosody-TTS system, which allows the phoneme durations and fundamental frequency (f0) to be controlled during synthesis. Subjective evaluation of the translated audiovisual content (lecture videos) yielded a mean opinion score (MOS) of 4.104, which indicates the effectiveness of the proposed prosodic alignment process. © 2022 IEEE.
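
As a rough illustration of what duration control at the feature level could look like, the sketch below rescales predicted phoneme durations so that a synthesized utterance fills the time slot of the corresponding source utterance. This is not the authors' method or the Prosody-TTS interface: the function name `fit_durations`, the uniform scaling rule, and the example values are all assumptions made for illustration only.

```python
from typing import List


def fit_durations(phoneme_durations: List[float], target_duration: float) -> List[float]:
    """Uniformly scale per-phoneme durations (seconds) to sum to target_duration.

    Hypothetical helper: a real system might scale vowels, consonants, and
    pauses differently, or limit how far the speaking rate may change.
    """
    if target_duration <= 0:
        raise ValueError("target_duration must be positive")
    total = sum(phoneme_durations)
    if total <= 0:
        raise ValueError("phoneme durations must sum to a positive value")
    scale = target_duration / total
    return [d * scale for d in phoneme_durations]


# Assumed example: three phonemes predicted at 0.4 s in total, to be fitted
# into a 0.8 s source utterance (start and end times taken from the source).
source_start, source_end = 12.0, 12.8
predicted = [0.1, 0.2, 0.1]
aligned = fit_durations(predicted, source_end - source_start)
```

The adjusted durations would then be passed to a synthesizer that accepts explicit phoneme durations, so the timing is set before the waveform is generated rather than by time-stretching the audio afterwards.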

IITH Creators: Kodukula, Sri Rama Murty (ORCiD: https://orcid.org/0000-0002-6355-5287)
Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: Machine dubbing; prosodic alignment; Prosody-TTS; text-to-speech synthesis
Subjects: Electrical Engineering
Divisions: Department of Electrical Engineering
Depositing User: LibTrainee 2021
Date Deposited: 02 Aug 2022 10:19
Last Modified: 02 Aug 2022 10:19
URI: http://raiithold.iith.ac.in/id/eprint/10063
Publisher URL: http://doi.org/10.1109/NCC55593.2022.9806799