Publication Type : Conference Paper
Publisher : Springer
Source : In Congress on Intelligent Systems (2024), Lecture Notes in Networks and Systems, 865, pp. 141-152, DOI: 10.1007/978-981-99-9043-6_12
Url : https://link.springer.com/chapter/10.1007/978-981-99-9043-6_12
Campus : Coimbatore
School : School of Artificial Intelligence
Year : 2024
Abstract : The growing need for effective cross-lingual communication has highlighted the role of speech-to-speech (S2S) translation systems. These systems hold particular promise in multilingual countries such as India, which has 22 official languages, with Hindi the most widely spoken. This study develops a Hindi-to-English S2S translation model built from three components: speech-to-text (STT), text translation, and text-to-speech (TTS) modules. The STT module uses CLSRIL-23 with Wav2Vec2, a self-supervised learning technique developed by Facebook AI Research. Pre-training on large amounts of unlabeled speech data lets the model learn useful speech representations, after which it is fine-tuned for our specific requirements. The text produced by the STT module is then translated into English by a transformer-based model. In the final stage, the Tacotron 2 model synthesizes speech from the translated text. The outcome is an S2S translation model capable of directly producing target-language speech (English) from source-language speech (Hindi). The model achieves an average Mean Opinion Score (MOS) of 3.8, marking progress toward bridging language barriers and facilitating effective multilingual communication.
Cite this Research Publication : Phogat, Divith, Karnati Sai Prashanth, Mangamuru Sai Rishith, Rachure Charith Sai, Sajja Bala Karthikeya, G. Jyothish Lal, and B. Premjith. "Bridging Language Barriers: Exploring Hindi-to-English Speech-to-Speech Translation for Multilingual Communication." In Congress on Intelligent Systems (2024), Lecture Notes in Networks and Systems, 865, pp. 141-152, DOI: 10.1007/978-981-99-9043-6_12