Feature Engineering and Selection for the Identification of Fake News in Social Media

Publication Type : Conference Paper

Publisher : International Conference on Signal & Data Processing

Source : International Conference on Signal & Data Processing, (2023) Lecture Notes in Electrical Engineering, 1026 LNEE, pp. 291-301, DOI: 10.1007/978-981-99-1410-4_24

Url : https://link.springer.com/chapter/10.1007/978-981-99-1410-4_24

Campus : Coimbatore

School : School of Artificial Intelligence - Coimbatore

Year : 2023

Abstract : The spread of fake content on social media increases hatred and social categorization. Online social media platforms make it easy to share fake information with millions of people within a short time, and the volume of false information online makes it difficult for users to verify the content they consume daily. This has increased the demand for automatic detection of fake information on social media. Automatic detection of fake online content is challenging because of the complexities of the language shared online, so detecting fake information on social media relies heavily on text features. This paper introduces a new feature set that covers linguistic characteristics, social context, and statistical aspects of text shared online, and proposes a novel fake information detection model that uses this feature set. To evaluate the proposed model and feature set, experiments were carried out on three datasets: (1) a dataset created as part of this research, and two publicly available datasets, namely (2) the Covid-19 Fake News Infodemic Research (Covid19-FNIR) dataset and (3) the COVID-19 Fake News Dataset. The results on all three datasets show the efficacy of the proposed model and feature set. On our dataset, the Random Forest Classifier-based fake news detection model achieved a classification accuracy of 90.30% with an F1-score of 0.9033. On the Covid19-FNIR dataset, the Logistic Regression Classifier-based model attained the highest classification accuracy, 99.20%, with an F1-score of 0.9920. On the COVID-19 Fake News Dataset, the Random Forest Classifier-based model obtained a classification accuracy of 83.72% and an F1-score of 0.8357.
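
Illustrative sketch : The abstract does not list the individual features or say how the F1-score is averaged, so the following is only a minimal Python sketch of the general idea, not the authors' implementation. The function names `extract_features` and `accuracy_and_f1`, the specific surface features, and the choice of binary F1 for the positive class are all assumptions made for illustration.

```python
import re


def extract_features(text):
    """Return a few illustrative surface features of a social media post.

    These features are hypothetical examples of statistical and linguistic
    cues; they are NOT the feature set proposed in the paper.
    """
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    return {
        "n_chars": len(text),
        "n_words": n_words,
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        # Share of words written entirely in capitals (length > 1).
        "upper_word_ratio": (
            sum(1 for w in words if len(w) > 1 and w.isupper()) / n_words
            if n_words else 0.0
        ),
        "n_exclaim": text.count("!"),
        "n_hashtags": len(re.findall(r"#\w+", text)),
        "n_mentions": len(re.findall(r"@\w+", text)),
        "n_urls": len(re.findall(r"https?://\S+", text)),
    }


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and binary F1 for the `positive` class.

    Assumption: binary F1; the paper may use a different averaging scheme.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    return accuracy, f1


if __name__ == "__main__":
    feats = extract_features("BREAKING!!! Cure found #covid @who")
    print(feats)
    print(accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```

Feature dictionaries of this kind could be turned into vectors and passed to a standard classifier such as Random Forest or Logistic Regression, the two classifier families named in the abstract.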

Cite this Research Publication : Vinay, R., Premjith, B., Shukla, D., Soman, K.P., "Feature Engineering and Selection for the Identification of Fake News in Social Media," International Conference on Signal & Data Processing, (2023) Lecture Notes in Electrical Engineering, 1026 LNEE, pp. 291-301, DOI: 10.1007/978-981-99-1410-4_24
