Publication Type : Journal Article
Publisher : INFORMATION PAPER International Journal of Recent Trends in Engineering
Source : INFORMATION PAPER International Journal of Recent Trends in Engineering, Volume 1 (2009)
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Electronics and Communication
Year : 2009
Abstract : Part-of-speech (POS) tagging, the task of assigning a POS tag to each word of a natural language sentence, is one of the most well-studied problems in Natural Language Processing (NLP), and several approaches to it exist. POS tagging is a sequence labeling problem. Tagging each word of an un-annotated corpus by hand is very time-consuming, which motivates automating the task. In this paper, SVMTool is applied to POS tagging for the Telugu language. POS tagging can be viewed as a multiclass classification problem, and the paper mainly explains how a binary classifier can be used to solve a multiclass classification problem. Telugu is written the way it is spoken. The tagset used in this paper consists of 10 tags, and the training corpus consists of 25,000 words. The obtained accuracy is around 95% for Telugu. Better results can be achieved by increasing the corpus size.
Cite this Research Publication : G. Binulal, Goud, P., and Dr. Soman K. P., “A SVM based approach to Telugu Parts Of Speech Tagging using SVMTool”, INFORMATION PAPER International Journal of Recent Trends in Engineering, vol. 1, 2009.
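
Illustrative sketch (not taken from the paper): the abstract states that a binary classifier can be used for the multiclass POS tagging problem. One common way to do this is the one-vs-rest scheme, in which one binary classifier is trained per tag and the tag whose classifier scores highest is chosen. The abstract does not say which scheme, features, or kernel were used, so everything below is an assumption: the feature function, the three-tag toy tagset, and the English toy sentences are hypothetical, and a simple perceptron stands in for the SVM that SVMTool would train.

```python
# One-vs-rest multiclass tagging built from binary classifiers.
# Assumptions: toy English data, a hypothetical feature set, and a
# perceptron as a stand-in for the binary SVM learner.

def features(words, i):
    """Hypothetical sparse features for the word at position i."""
    word = words[i]
    prev = words[i - 1] if i > 0 else "<s>"
    return {"word=" + word, "suffix2=" + word[-2:], "prev=" + prev}


class BinaryPerceptron:
    """Binary linear classifier: positive score means 'this tag'."""

    def __init__(self):
        self.weights = {}

    def score(self, feats):
        return sum(self.weights.get(f, 0.0) for f in feats)

    def update(self, feats, y):
        # y is +1 (example has this tag) or -1 (any other tag).
        # Adjust the weights only when the example is misclassified.
        if y * self.score(feats) <= 0:
            for f in feats:
                self.weights[f] = self.weights.get(f, 0.0) + y


def train_one_vs_rest(tagged_sentences, epochs=10):
    """Train one binary classifier per tag found in the data."""
    examples = []
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        for i, (_, tag) in enumerate(sentence):
            examples.append((features(words, i), tag))
    tags = sorted({tag for _, tag in examples})
    models = {tag: BinaryPerceptron() for tag in tags}
    for _ in range(epochs):
        for feats, gold in examples:
            for tag, model in models.items():
                model.update(feats, 1 if tag == gold else -1)
    return models


def tag_sentence(models, words):
    """Assign each word the tag whose binary classifier scores highest."""
    result = []
    for i in range(len(words)):
        feats = features(words, i)
        result.append(max(models, key=lambda t: models[t].score(feats)))
    return result


# Hypothetical toy training data (not the paper's Telugu corpus).
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
models = train_one_vs_rest(toy_corpus)
print(tag_sentence(models, ["the", "cat", "runs"]))
```

With a 10-tag tagset such as the one mentioned in the abstract, this scheme would train 10 binary classifiers, one per tag. Other reductions, such as one-vs-one, are also possible.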