Publication Type : Conference Paper
Publisher : Advances in Intelligent Systems and Computing
Source : Advances in Intelligent Systems and Computing, Springer Verlag, Volume 750, p.499-508 (2019)
ISBN : 9789811318818
Keywords : Artificial intelligence, Big data, Classification algorithm, Cloud computing, Corpus, Count-based methods, Data mining, Factorization, Inverse problems, Iterative methods, Learning systems, Paraphrase identifications, Representation method, Singular value decomposition, Syntactics, Text processing
Campus : Bengaluru, Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Department : Electronics and Communication
Year : 2019
Abstract : Paraphrase identification is the task of determining whether two sentences convey similar meaning. We use count-based text representation methods, namely the term-document matrix and the term frequency-inverse document frequency (TF-IDF) matrix, together with the distributional representation methods of singular value decomposition and non-negative matrix factorization, which are applied iteratively with different word share and minimum document frequency values. The system learns features from these representations, and the learned features are then used to measure phrase-wise similarity between two sentences. The features are given to various machine learning classification algorithms, and cross-validation accuracy is obtained. The corpus for this task was created manually from different news domains. Because a parser was unavailable, only a subset of the collected data in the corpus was used for this task. This is a first attempt at paraphrase identification in the Telugu language using this approach. © 2019, Springer Nature Singapore Pte Ltd.
Cite this Research Publication : A. D. Reddy, M. Kumar, A., Dr. Soman K. P., A.H., A., and B., J., “Paraphrase Identification in Telugu Using Machine Learning”, in Advances in Intelligent Systems and Computing, 2019, vol. 750, pp. 499-508.
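
The feature-extraction steps named in the abstract (term-document counts, TF-IDF weighting, a singular value decomposition, and a similarity score per sentence pair) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, the Jaccard-style definition of word share, and the English toy sentences are all assumptions made for illustration, and the classification and cross-validation stage is omitted.

```python
import numpy as np

def build_vocab(sentences, min_df=1):
    """Map each token found in at least `min_df` sentences to a column index."""
    doc_freq = {}
    for s in sentences:
        for tok in set(s.lower().split()):
            doc_freq[tok] = doc_freq.get(tok, 0) + 1
    kept = sorted(t for t, c in doc_freq.items() if c >= min_df)
    return {t: i for i, t in enumerate(kept)}

def count_matrix(sentences, vocab):
    """Count matrix with one row per sentence and one column per term."""
    X = np.zeros((len(sentences), len(vocab)))
    for r, s in enumerate(sentences):
        for tok in s.lower().split():
            if tok in vocab:
                X[r, vocab[tok]] += 1
    return X

def tfidf(X):
    """Assumed smoothed weighting: tf * (ln((1 + n) / (1 + df)) + 1)."""
    n = X.shape[0]
    df = (X > 0).sum(axis=0)
    return X * (np.log((1 + n) / (1 + df)) + 1)

def svd_reduce(X, k):
    """Keep the k largest singular directions (truncated SVD) of the rows."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(S))
    return U[:, :k] * S[:k]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def word_share(s1, s2):
    """Assumed definition: shared distinct tokens over all distinct tokens."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

# Illustrative English toy sentences, not data from the Telugu corpus.
sentences = [
    "the minister opened the new bridge",
    "the new bridge was opened by the minister",
    "heavy rain flooded the city streets",
    "city streets were flooded by heavy rain",
]
pairs = [(0, 1), (2, 3), (0, 2)]

vocab = build_vocab(sentences, min_df=1)
X = tfidf(count_matrix(sentences, vocab))
Z = svd_reduce(X, k=2)

# One feature row per sentence pair: TF-IDF cosine, SVD cosine, word share.
features = np.array([
    [cosine(X[i], X[j]), cosine(Z[i], Z[j]),
     word_share(sentences[i], sentences[j])]
    for i, j in pairs
])
print(features.round(3))
```

Each row of `features` could then be paired with a paraphrase or non-paraphrase label and passed to a classifier evaluated by cross-validation, with `min_df` and `k` varied between runs.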