Semantic Representation of Documents Based on Matrix Decomposition

Publication Type : Conference Paper

Publisher : 2018 International Conference on Data Science and Engineering, ICDSE 2018, Institute of Electrical and Electronics Engineers Inc.

Source : 2018 International Conference on Data Science and Engineering, ICDSE 2018, Institute of Electrical and Electronics Engineers Inc. (2018)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85058325045&doi=10.1109%2fICDSE.2018.8527824&partnerID=40&md5=c9ce294f7b05db9d757d4a3103afecc0

ISBN : 9781538648551

Keywords : Approximation theory, Conventional methods, Data integration, Data integration system, Document analysis, Important features, Information Retrieval, Low rank approximations, Matrix algebra, Matrix decomposition, Query processing, Research laboratories, Search engines, Semantic representation of documents, Semantics, Sparse matrices

Campus : Amritapuri

School : Department of Computer Science and Engineering, School of Engineering

Center : AI (Artificial Intelligence) and Distributed Systems

Department : Computer Science

Verified : No

Year : 2018

Abstract : This paper addresses the problem of semantic representation of documents for information retrieval in a data integration system. Search queries on documents typically seek relevant information. Conventional feature extraction methods do not capture relevance; they focus instead on term matching for query processing. The challenge in semantic representation of documents lies in identifying important features. Most techniques for identifying important features transform the original data into a different space. This gives a sparse matrix which is computationally expensive. We therefore propose an alternative approach based on CUR matrix decomposition. This technique finds important documents and important terms in order to improve query processing. Experimental results demonstrate the efficacy of this approach on five data sets. © 2018 IEEE.
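
The abstract does not spell out how the CUR decomposition is computed, so the following is only an illustrative sketch of the general idea and not the authors' implementation. It assumes a small document-term matrix `A` (rows are documents, columns are terms), scores rows and columns by their leverage in the top-`k` singular vectors, and keeps the highest-scoring ones deterministically. The function names, the toy matrix, and the deterministic top-score selection are all assumptions made for illustration; randomized sampling by leverage score is a common alternative.

```python
import numpy as np


def leverage_scores(A, k, axis):
    """Normalized leverage scores from the top-k singular vectors of A.

    axis=1 scores the columns (terms), axis=0 scores the rows (documents).
    """
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    basis = Vt[:k].T if axis == 1 else U[:, :k]
    # Each basis row holds one item's coordinates in the top-k subspace.
    return (basis ** 2).sum(axis=1) / k


def cur_decompose(A, k, n_cols, n_rows):
    """Return C, U, R and the selected indices so that A is roughly C @ U @ R.

    Hypothetical helper: keeps the n_cols columns and n_rows rows with the
    highest leverage scores instead of sampling them at random.
    """
    col_scores = leverage_scores(A, k, axis=1)
    row_scores = leverage_scores(A, k, axis=0)
    cols = np.sort(np.argsort(col_scores)[::-1][:n_cols])
    rows = np.sort(np.argsort(row_scores)[::-1][:n_rows])
    C = A[:, cols]   # actual term columns of A
    R = A[rows, :]   # actual document rows of A
    # Linking matrix that minimizes the Frobenius error for the chosen C and R.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R, cols, rows


# Toy document-term matrix (made-up counts): 4 documents, 5 terms, two topics.
A = np.array([
    [2.0, 1.0, 0.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0, 1.0],
    [0.0, 0.0, 2.0, 6.0, 2.0],
])

C, U, R, cols, rows = cur_decompose(A, k=2, n_cols=2, n_rows=2)
print("selected term indices:", cols.tolist())
print("selected document indices:", rows.tolist())
print("relative error:", np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))
```

Because `C` and `R` are made of actual columns and rows of `A`, the selected indices can be read directly as "important terms" and "important documents", which is what distinguishes CUR from decompositions that map the data into a different space.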

Cite this Research Publication : C. Baladevi and Sandhya Harikumar, “Semantic Representation of Documents Based on Matrix Decomposition”, in 2018 International Conference on Data Science and Engineering, ICDSE 2018, 2018
