Publication Type : Conference Paper
Publisher : CEUR Workshop Proceedings
Source : CEUR Workshop Proceedings, CEUR-WS, Volume 2124, p.50-56 (2018)
Keywords : Artificial intelligence, Computer crime, Electronic mail, information dissemination, Inverse problems, Learning systems, Machine learning techniques, Phishing, Phishing emails, Reliable frameworks, Supervised classifiers, Term frequency-inverse document frequency (TF-IDF), Text processing
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Electronics and Communication
Year : 2018
Abstract : The number of unsolicited, or phishing, emails is increasing tremendously day by day. This calls for a reliable framework to filter out phishing emails. In the proposed work, we develop a supervised classifier for distinguishing phishing emails from legitimate ones. A term frequency-inverse document frequency (tf-idf) matrix and Doc2Vec representations are built for legitimate and phishing emails. These are passed to various traditional machine learning classifiers for classification. The classifiers using the Doc2Vec representation performed better than those using the tf-idf representation. We therefore conclude that the Doc2Vec representation is more appropriate for detecting and classifying phishing and legitimate emails. Copyright © by the paper's authors.
Cite this Research Publication : N. A. Unnithan, Harikrishnan, N. B., Vinayakumar, R., Dr. Soman K. P., and Sundarakrishna, S., “Detecting phishing E-mail using machine learning techniques CEN-SecureNLP”, in CEUR Workshop Proceedings, 2018, vol. 2124, pp. 50-56.
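
Illustrative sketch : The tf-idf half of the pipeline described in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. It assumes a common smoothed idf formula, swaps in a simple nearest-centroid classifier with cosine similarity in place of the unspecified "traditional machine learning classifiers", and does not cover Doc2Vec. The four training emails, the labels, and all function names are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on any non-alphanumeric character."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """L2-normalised sparse tf-idf vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(docs, labels):
    """Fit idf weights and compute one mean tf-idf vector (centroid) per label."""
    idf = fit_idf(docs)
    grouped = {}
    for doc, label in zip(docs, labels):
        grouped.setdefault(label, []).append(tfidf_vector(doc, idf))
    centroids = {}
    for label, vectors in grouped.items():
        total = Counter()
        for vec in vectors:
            for term, weight in vec.items():
                total[term] += weight
        centroids[label] = {t: w / len(vectors) for t, w in total.items()}
    return idf, centroids


def predict(doc, idf, centroids):
    """Return the label whose centroid is most similar to the document."""
    vec = tfidf_vector(doc, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy data, invented for this sketch (not from the paper's corpus).
TRAIN_DOCS = [
    "Verify your account password now or it will be suspended",
    "Urgent: confirm your bank password by clicking this link",
    "Meeting agenda for the project review on Monday",
    "Please find the project report attached for the meeting",
]
TRAIN_LABELS = ["phishing", "phishing", "legitimate", "legitimate"]

IDF, CENTROIDS = train(TRAIN_DOCS, TRAIN_LABELS)
```

Calling `predict("Urgent: verify your account password", IDF, CENTROIDS)` classifies a new message against the two centroids. A Doc2Vec variant would replace `tfidf_vector` with a learned dense document embedding while keeping the same train/predict structure.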