Publication Type : Conference Paper
Publisher : Elsevier B.V.
Source : Procedia Computer Science, Volume 233, 2024, Pages 391-400
Campus : Amritapuri
School : School of Computing
Center : AmritaCREATE
Year : 2024
Abstract : The Internet has emerged as a pivotal medium for human interaction, leading to profound transformations in language dynamics, especially on computer-mediated communication (CMC) platforms. These transformations are reflected in the changing orthography, graphology, vocabulary, grammar, syntax, pragmatics, and style of natural languages. As Internet-based communication has become more accessible, users of these platforms write in the language that comes most naturally to them and adopt new terms and phrases to make their messages more engaging. Internet users all over the world follow this trend, which is known as Internese. All the features of Internese are vividly marked in Malayalam social media comments, where typographical errors are also common. None of the spell checkers currently available for Malayalam recognizes the words and phrases found in social media comments written in Malayalam. This study proposes two models for automatic typo detection and correction in Malayalam social media comments: a sequence-to-sequence deep learning model and a hybrid model. Both models analyze input words at the phonogram level. The proposed hybrid system achieved acceptable performance on a recently built corpus. The sequence-to-sequence model also performs rather well, although, because of Malayalam's intricate structure, its heavy reliance on data affects its accuracy. © 2024 The Authors. Published by Elsevier B.V.
Cite this Research Publication : Jyothi Ratnam, D., Karthika, A.N., Praveena, K., Rajesh, T., Thara, S., Nedungadi, P., "Phonogram-based Automatic Typo Correction in Malayalam Social Media Comments," Procedia Computer Science, Volume 233, 2024, Pages 391-400, DOI: 10.1016/j.procs.2024.03.229