Publication Type : Conference Paper
Publisher : Springer
Source : Lecture Notes in Networks and Systems, Springer, Volume 74, p.101-109 (2019)
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Electronics and Communication
Year : 2019
Abstract : The boom in social media has been a topic of discussion among all generations of this era. It certainly has its positives, such as real-time communication and a platform for everyone to voice their opinions. It also has its darker sides, such as the anonymity of those communicating. Such anonymity, especially on messaging platforms such as WhatsApp, can turn out to be dangerous. This is where author profiling plays a crucial role. This paper describes the analysis of code-mixed Malayalam–English data collected from WhatsApp and its classification by a basic demographic attribute of the author: gender. The text is represented both as a Term Frequency–Inverse Document Frequency (TFIDF) matrix and as a Term Document Matrix (TDM). The classifiers used are SVM, Naive Bayes, Logistic Regression, Decision Tree, and Random Forest. © Springer Nature Singapore Pte Ltd. 2019.
Cite this Research Publication : V. R. Chacko, M. Kumar, A., and Dr. Soman K. P., “Gender Identification of Code-Mixed Malayalam–English Data from WhatsApp”, in Lecture Notes in Networks and Systems, 2019, vol. 74, pp. 101-109.
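Illustrative sketch : The abstract names two text representations, a Term Document Matrix (raw term counts per message) and a TFIDF matrix (counts reweighted so that terms occurring in many messages count for less). The Python sketch below shows how the two differ on a toy input. It is not the authors' implementation: the whitespace tokenizer, the textbook weighting tf × log(N/df), the function names, and the three sample messages are all assumptions made for illustration, and none of the data comes from the paper's WhatsApp corpus.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, then split on whitespace.
    return text.lower().split()


def term_document_matrix(docs):
    """Return (vocab, matrix); matrix[i][j] is the raw count of vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]


def tfidf_matrix(docs):
    """Return (vocab, matrix) using the assumed weighting tf * log(N / df)."""
    vocab, tdm = term_document_matrix(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in tdm if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for row in tdm:
        total = sum(row)  # document length in tokens
        weighted.append(
            [(c / total) * w if total else 0.0 for c, w in zip(row, idf)]
        )
    return vocab, weighted


# Invented toy messages (hypothetical, not from the paper's dataset).
docs = ["exam nale aanu", "nale class undo", "exam exam tension"]
vocab, tdm = term_document_matrix(docs)
_, tfidf = tfidf_matrix(docs)
```

Each row of either matrix is a feature vector for one message. Paired with the author's gender label, such rows are the kind of input that the classifiers listed in the abstract would be trained on.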