Publication Type : Conference Proceedings
Publisher : Elsevier Procedia Computer Science
Source : Elsevier Procedia Computer Science
Url : https://www.sciencedirect.com/science/article/pii/S1877050920311224
Campus : Amritapuri, Bengaluru
School : School of Computing
Center : Computational Linguistics and Indic Studies
Year : 2020
Abstract : Named Entity Recognition (NER) is one of the fundamental processes in Natural Language Processing applications. In this paper, we propose Agriculture Named Entity Recognition using Topic Modelling techniques (the AERTM algorithm). In the agriculture domain, we have identified Names of Crops, Soil Types, Names of Pathogens, Crop Diseases and Fertilizers as the key entities. Our work presents a hybrid approach using the agriculture vocabulary AGROVOC and the AERTM algorithm. We used AGROVOC to identify crop names, but it failed to identify Soil Types, Crop Diseases and Fertilizers. Hence, for those entities we propose a Latent Dirichlet Allocation (LDA) based topic modelling algorithm. These named entities can be used to create a knowledge base, which can in turn be used mainly in Relation Extraction systems, forums supported by various Government distinguished repositories, etc. Because benchmark agriculture data is not available, we tested our model on 3000 sentences extracted from reputed agriculture sites. Human evaluation of the method confirms that our approach gives an accuracy of 80%.
Cite this Research Publication : Veena Gangadharan, Deepa Gupta, Recognizing Named Entities in Agriculture Documents using LDA based Topic Modelling Techniques, Procedia Computer Science, Volume 171, 2020, Pages 1337-1345, ISSN 1877-0509, https://doi.org/10.1016/j.procs.2020.04.143.
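Illustrative sketch : The abstract describes a hybrid pipeline whose first stage looks up crop names in the AGROVOC vocabulary. The snippet below is a minimal sketch of that kind of vocabulary lookup only, not the authors' implementation. The names CROP_TERMS, tokenize and tag_crops are hypothetical, the five-term set is a made-up stand-in for crop labels that would in practice be exported from AGROVOC, and the LDA stage for Soil Types, Crop Diseases and Fertilizers is not shown because this record does not describe it in detail.

```python
import re

# Hypothetical stand-in for crop labels taken from AGROVOC.
# The real thesaurus is far larger and is not bundled here.
CROP_TERMS = {"rice", "wheat", "maize", "sugar cane", "tomato"}


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def tag_crops(sentence, vocabulary=CROP_TERMS, max_len=3):
    """Return vocabulary terms found in the sentence, in order of appearance.

    At each position the longest candidate phrase (up to max_len tokens)
    is tried first, so multi-word terms such as "sugar cane" are matched
    as one entity.
    """
    tokens = tokenize(sentence)
    found = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in vocabulary:
                found.append(candidate)
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no term starts at this token
    return found


if __name__ == "__main__":
    print(tag_crops("Sugar cane and rice need loamy soil."))
```

A sentence that yields no match here, such as one that mentions only a soil type or a disease, is the kind of case the abstract says is handed to the LDA-based topic model instead.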