
Phonogram-based Automatic Typo Correction in Malayalam Social Media Comments

Publication Type : Conference Paper

Publisher : Procedia Computer Science

Source : Procedia Computer Science, Volume 233, 2024, Pages 391-400

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85192498608&doi=10.1016%2fj.procs.2024.03.229&partnerID=40&md5=8ab9e990cbc9988b7342062b833bd68d

Campus : Amritapuri

School : School of Computing

Center : AmritaCREATE

Year : 2024

Abstract : The Internet has emerged as a pivotal medium for human interaction, leading to profound transformations in language dynamics, especially on computer-mediated communication (CMC) platforms. These transformations are reflected in the changing orthography, graphology, vocabulary, grammar, syntax, pragmatics, and style of natural languages. As Internet-based communication has become more accessible, users write in the language that comes most naturally to them and adopt new terms and phrases to make their messages more engaging. Internet users all over the world follow this trend, which is known as Internese. All the features of Internese are vividly marked in Malayalam social media comments, where typographical errors are also common. None of the spell checkers currently available for Malayalam recognize the words and phrases found in social media comments written in Malayalam script. This study proposes two models for automatic typo detection and correction in Malayalam social media comments: a sequence-to-sequence deep learning model and a hybrid model. Both models analyze input words at the phonogram level. The proposed hybrid system achieved acceptable performance on a recently built corpus. The sequence-to-sequence model performs rather well; because of Malayalam's intricate structure, its accuracy depends heavily on the data it relies on. © 2024 The Authors. Published by Elsevier B.V.

Cite this Research Publication : Jyothi Ratnam, D., Karthika, A.N., Praveena, K., Rajesh, T., Thara, S., Nedungadi, P., "Phonogram-based Automatic Typo Correction in Malayalam Social Media Comments," Procedia Computer Science, Volume 233, 2024, Pages 391-400, DOI: 10.1016/j.procs.2024.03.229
