
A Few-Shot Multi-Accented Speech Classification for Indian Languages using Transformers and LLM’s Fine-Tuning Approaches

Publication Type : Conference Paper

Source : Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 1–9, St. Julian's, Malta. Association for Computational Linguistics

Url : https://aclanthology.org/2024.dravidianlangtech-1.1.pdf

Campus : Coimbatore

School : School of Artificial Intelligence

Year : 2024

Abstract : Accented speech classification plays a vital role in the advancement of high-quality automatic speech recognition (ASR) technology. For certain applications, such as multi-accented speech classification, it is not always viable to obtain data with accent variation, especially for resource-poor languages. This is one of the major reasons for the underperformance of speech classification systems. Therefore, to handle the variability in speaker accents across Indian languages, we propose a few-shot learning paradigm in this study. It learns generic feature embeddings using the encoder of a pre-trained Whisper model, followed by a classification head. The model is refined using LLM fine-tuning techniques, such as LoRA and QLoRA, for the six Indian English accents in the Indic Accent Dataset. The experimental findings show that combining the few-shot learning paradigm with LLM fine-tuning techniques greatly increases the accuracy of the model. In optimal settings, the model’s accuracy can reach 94% when the share of trainable parameters is set to 5%.
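The abstract relates accuracy to the share of trainable parameters under LoRA. As a rough illustration of how that share arises, the sketch below counts parameters for a single frozen weight matrix that receives a low-rank LoRA update, plus a small classification head. The layer width, the ranks, and the function name `lora_trainable_fraction` are hypothetical choices made for this example; they are not taken from the paper, and the real model has many such layers.

```python
def lora_trainable_fraction(d_out, d_in, rank, head_params=0):
    """Share of parameters that are trained when a frozen d_out x d_in weight
    receives a LoRA update B @ A, with B of shape (d_out, rank) and A of
    shape (rank, d_in). head_params counts any extra trainable parameters,
    such as a classification head."""
    frozen = d_out * d_in              # original weight, kept frozen
    lora = rank * (d_out + d_in)       # parameters in B and A
    trainable = lora + head_params
    return trainable / (frozen + trainable)


if __name__ == "__main__":
    d = 512          # hypothetical layer width, not from the paper
    n_classes = 6    # six accents, as stated in the abstract
    head = d * n_classes + n_classes   # linear head: weights plus biases
    for rank in (4, 8, 16):            # hypothetical LoRA ranks
        frac = lora_trainable_fraction(d, d, rank, head)
        print(f"rank={rank:2d}  trainable share={frac:.2%}")
```

Raising the rank increases the trainable share roughly linearly, which is one way a target such as 5% could be reached; how the authors chose their setting is not stated in the abstract.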

Cite this Research Publication : Jairam R, Jyothish G, and Premjith B. 2024. A Few-Shot Multi-Accented Speech Classification for Indian Languages using Transformers and LLM’s Fine-Tuning Approaches. In Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 1–9, St. Julian's, Malta. Association for Computational Linguistics
