Publication Type : Conference Paper
Publisher : 2024 - 4th Workshop on Language Technology for Equality, Diversity, Inclusion, Proceedings of the Workshop
Source : Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 190–195, St. Julian's, Malta. Association for Computational Linguistics.
Url : https://aclanthology.org/2024.ltedi-1.21/
Campus : Coimbatore
School : School of Artificial Intelligence
Center : Center for Computational Engineering and Networking
Year : 2024
Abstract : Speech recognition is a specialized application of speech processing, and automatic speech recognition (ASR) systems are designed to perform the speech-to-text task. Although ASR systems have been researched extensively, they still face challenges when speech varies across speakers. The main sources of this variation include the speaker’s age, gender, and vulnerability. In this work, we propose a fine-tuned speech recognition model for recognizing the spoken words of vulnerable individuals in Tamil. This research uses a dataset from the LT-EDI@EACL2024 shared task. We fine-tuned and tested pre-trained ASR models, including XLS-R and Whisper. The findings show that the fine-tuned Whisper model outperforms XLS-R, achieving a word error rate (WER) of 24.452, which indicates better performance in recognizing speech from diverse individuals.
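Note : For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. It is not taken from the paper. The function name `word_error_rate` is hypothetical, and scaling the result to a percentage is an assumption, since the abstract does not state the unit of the reported 24.452.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (assumed unit; hypothetical helper)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref_words)
```

Under this definition a lower value is better, and the value can exceed 100 when the hypothesis contains many inserted words.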
Cite this Research Publication : Jairam R, Jyothish G, Premjith B, and Viswa M., 2024. CEN_Amrita@LT-EDI 2024: A Transformer based Speech Recognition System for Vulnerable Individuals in Tamil. In Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 190–195, St. Julian's, Malta. Association for Computational Linguistics.