Publication Type : Conference Paper
Publisher : Proceedings of CLEF 2014
Source : Proceedings of CLEF 2014 (2014)
Url : http://ceur-ws.org/Vol-1180/CLEF2014wn-Pan-SurendranEt2014.pdf
Campus : Amritapuri
School : Department of Computer Science and Engineering, School of Engineering
Department : Computer Science
Verified : No
Year : 2014
Abstract : With the evolution of the internet, author profiling has become a topic of great interest in fields such as forensics, security, marketing, and plagiarism detection. However, identifying the characteristics of an author based solely on a text document has its own limitations and challenges. This paper reports on the design, techniques, and learning models we adopted for the PAN-2014 Author Profiling challenge. To identify the age and gender of an author from a document, we employed an ensemble learning approach, training a Random Forest classifier on the training data provided by the PAN organizers, for the English language only. Our work indicates that readability metrics, function words, and structural features play a vital role in identifying the age and gender of an author.
Cite this Research Publication : G. Gressel, K., S., A, A., Thara, S., P., H., and Prabaharan Poornachandran, “Ensemble learning approach for author profiling”, in Proceedings of CLEF 2014, 2014.