Publication Type : Conference Proceedings
Publisher : IEEE
Source : 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184)
Url : https://ieeexplore.ieee.org/abstract/document/9142877
Campus : Amritapuri
School : School of Computing
Center : Computational Linguistics and Indic Studies
Year : 2020
Abstract : This paper focuses on detecting paraphrases in sentences using different word vectorization techniques and on finding which vectorization method is more efficient. Word vectorization is a technique used to retrieve information from large collections of textual data, such as corpora or documents, by representing each word as a vector. Because textual data are massive, the difficulty is that they need to be expressed in numerical form before mathematical methods can be applied. The methods for solving this problem range from elementary to composite. In this paper we compare the following word vectorization techniques: Count Vectorizer, Hashing Vectorizer, TF-IDF Vectorizer, fastText, ELMo, GloVe, and BERT.
Cite this Research Publication : V. Gangadharan, D. Gupta, A. L. and A. T.A., "Paraphrase Detection Using Deep Neural Network Based Word Embedding Techniques," 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184), 2020, pp. 517-521, doi: 10.1109/ICOEI48184.2020.9142877.
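
Illustration : The abstract describes turning text into numbers so that sentences can be compared mathematically. The following is a minimal sketch of that idea using a count-based (bag-of-words) representation and cosine similarity, written with the Python standard library only. It is not the authors' implementation: the function names, the tokenization rule, and the 0.7 decision threshold are all assumptions made for illustration, and the paper's own pipeline and results are not reproduced here.

```python
import math
import re
from collections import Counter


def count_vector(sentence):
    """Map a sentence to a sparse count vector (token -> frequency)."""
    # Assumed tokenization: lowercase runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


def is_paraphrase(s1, s2, threshold=0.7):
    """Hypothetical decision rule: similarity at or above a threshold."""
    return cosine_similarity(count_vector(s1), count_vector(s2)) >= threshold
```

A count-based vector only captures word overlap, so two sentences that share no words score 0 even if they mean the same thing. That limitation is one reason to compare it against learned embeddings such as fastText, ELMo, GloVe, and BERT, which the abstract also lists.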