Publication Type : Conference Paper
Publisher : Proceedings of 4th Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Source : (2024) DravidianLangTech 2024 - Proceedings of 4th Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pp. 16-23.
Url : https://aclanthology.org/2024.dravidianlangtech-1.3.pdf
Campus : Coimbatore
School : School of Artificial Intelligence - Coimbatore
Center : Computational Engineering and Networking
Year : 2024
Abstract : Identifying fake news hidden as real news is crucial to fight misinformation and ensure reliable information, especially in resource-scarce languages like Malayalam. To recognize the unique challenges of fake news in languages like Malayalam, we present a dataset curated specifically for classifying fake news in Malayalam. This fake news is categorized based on the degree of misinformation, marking the first of its kind in this language. Further, we propose baseline models employing multilingual BERT and diverse machine learning classifiers. Our findings indicate that logistic regression trained on LaBSE features demonstrates promising initial performance with an F1 score of 0.3393. However, addressing the significant data imbalance remains essential for further improvement in model accuracy.
Cite this Research Publication : Devika, K., Hari Prasath, S.B., Haripriya, B., Vigneshwar, E., Premjith, B., Chakravarthi, B.R., "From Dataset to Detection: A Comprehensive Approach to Combating Malayalam Fake News," (2024) DravidianLangTech 2024 - Proceedings of 4th Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pp. 16-23.