
Character based bidirectional LSTM for disambiguating Tamil part of speech categories

Publication Type : Journal Article

Publisher : International Journal of Control Theory and Applications, Serials Publications

Source : International Journal of Control Theory and Applications, Serials Publications, Volume 10, Number 3, p.229-235 (2017)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85014052977&partnerID=40&md5=801a503b85f0e6953af146e9705df6c3

Campus : Coimbatore

School : School of Engineering

Center : Computational Engineering and Networking

Department : Computer Science, Electronics and Communication

Verified : No

Year : 2017

Abstract : Part-of-speech (POS) tagging is the process of assigning a part-of-speech tag to every word in a corpus. In this paper, POS tagging for the Tamil language is performed using a bidirectional long short-term memory (BLSTM) network. Instead of a traditional word lookup table, a character-to-word (C2W) model that uses a BLSTM to obtain word embeddings is presented. The C2W model builds a vector representation of a word from its characters. The word embeddings from the C2W model are then used by a BLSTM to tag the words in the corpus. When tested on 3723 words, this method achieved a highest accuracy of 86.45%. © International Science Press.
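
The data flow described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the weights are random and untrained, so the predicted tags are arbitrary, and the tag set, layer sizes, example sentence, and the choice to concatenate the final forward and backward states into a word vector are all assumptions made for illustration. It only shows how characters could be turned into word vectors by one BLSTM and how a second BLSTM could then assign a tag to each word.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class LSTM:
    """Single-layer LSTM with random, untrained weights (illustrative only)."""

    def __init__(self, input_size, hidden_size):
        self.hidden_size = hidden_size
        # One weight matrix per gate (input, forget, output, candidate), applied to [x; h].
        self.weights = [rand_matrix(hidden_size, input_size + hidden_size) for _ in range(4)]

    def run(self, inputs):
        """Return the hidden state produced after each input vector."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        states = []
        for x in inputs:
            pre = [matvec(w, x + h) for w in self.weights]
            i = [sigmoid(v) for v in pre[0]]
            f = [sigmoid(v) for v in pre[1]]
            o = [sigmoid(v) for v in pre[2]]
            g = [math.tanh(v) for v in pre[3]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            states.append(h)
        return states

class BiLSTM:
    """Two LSTMs reading the same sequence in opposite directions."""

    def __init__(self, input_size, hidden_size):
        self.fwd = LSTM(input_size, hidden_size)
        self.bwd = LSTM(input_size, hidden_size)

    def run(self, inputs):
        """Per-position concatenation of forward and backward hidden states."""
        forward = self.fwd.run(inputs)
        backward = self.bwd.run(inputs[::-1])[::-1]
        return [fs + bs for fs, bs in zip(forward, backward)]

    def final(self, inputs):
        """Concatenate the last forward state with the last backward state."""
        return self.fwd.run(inputs)[-1] + self.bwd.run(inputs[::-1])[-1]

TAGS = ["NOUN", "VERB", "OTHER"]  # hypothetical tag set, not from the paper
CHAR_DIM, CHAR_HIDDEN, WORD_HIDDEN = 8, 6, 5  # assumed sizes

char_table = {}  # character -> embedding vector, filled on first use

def char_vector(ch):
    if ch not in char_table:
        char_table[ch] = [random.uniform(-0.1, 0.1) for _ in range(CHAR_DIM)]
    return char_table[ch]

c2w = BiLSTM(CHAR_DIM, CHAR_HIDDEN)             # characters -> word vector
tagger = BiLSTM(2 * CHAR_HIDDEN, WORD_HIDDEN)   # word vectors -> contextual states
output_layer = rand_matrix(len(TAGS), 2 * WORD_HIDDEN)

def word_embedding(word):
    """Build a word vector from the word's characters (no word lookup table)."""
    return c2w.final([char_vector(ch) for ch in word])

def tag_sentence(words):
    """Score every tag for every word and pick the highest-scoring one."""
    contextual = tagger.run([word_embedding(w) for w in words])
    tags = []
    for state in contextual:
        scores = matvec(output_layer, state)
        tags.append(TAGS[scores.index(max(scores))])
    return tags

sentence = ["அவன்", "புத்தகம்", "படித்தான்"]  # illustrative Tamil words
predicted = tag_sentence(sentence)
print(list(zip(sentence, predicted)))
```

Because the word vector is computed from characters, a word that never appeared in training still receives an embedding, which is the practical motivation for replacing a word lookup table in a morphologically rich language such as Tamil.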

Cite this Research Publication : K. S. Gokul Krishnan, Pooja, A., Dr. M. Anand Kumar, and Dr. Soman K. P., “Character based bidirectional LSTM for disambiguating Tamil part of speech categories”, International Journal of Control Theory and Applications, vol. 10, no. 3, pp. 229-235, 2017.
