Publication Type : Journal Article
Thematic Areas : Center for Computational Engineering and Networking (CEN)
Publisher : Research India Publications
Source : ARPN Journal of Engineering and Applied Sciences, Volume 10, Issue 8, Number 8, p.3702-3707 (2015)
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Electronics and Communication
Year : 2015
Abstract : The era of digitization creates a need for domain classification in both online and offline applications. Automatic text classification is needed across diverse fields, and various methodologies, such as machine learning algorithms, have been proposed for it. This work proposes automatic document classification of Tamil documents, motivated by the exponential growth of Tamil text available as unstructured data in news, encyclopedias, e-books, e-governance, social media and more. Max-Ent, CRF and SVM algorithms are used to achieve more than 90 percent average accuracy in both sentence-level and document-level classification of Tamil text documents. The Dinakaran newspaper dataset from the EMILLE/CIIL Corpus is used to evaluate the ability of machine learning algorithms in Tamil domain classification. © 2006-2015 Asian Research Publishing Network (ARPN).
Cite this Research Publication : U. Reshma, Ganesh, H. B. Barathi, M. Kumar, A., and Dr. Soman K. P., “Supervised methods for domain classification of Tamil documents”, ARPN Journal of Engineering and Applied Sciences, vol. 10, no. 8, pp. 3702-3707, 2015.