
Semantic integration of heterogeneous relational schemas using multiple L1 linear regression and SVD

Publication Type : Conference Paper

Publisher : International Conference on Data Science and Engineering, ICDSE 2014

Source : International Conference on Data Science and Engineering, ICDSE 2014, Institute of Electrical and Electronics Engineers Inc., p.105-111 (2014)

Url : http://www.scopus.com/inward/record.url?eid=2-s2.0-84936774427&partnerID=40&md5=a37fee353bd8128cad865acd7530a474

ISBN : 9781479968701

Keywords : Data integration, Heterogeneous database, High-dimensional, integration, Iterative methods, Linear regression, Linear subspace, Metadata, Multiple linear regressions, Regression analysis, Relational schemas, Schema information, Semantic integration, Semantics, Singular value decomposition, Technology advances

Campus : Amritapuri

School : Department of Computer Science and Engineering, School of Engineering

Center : AI (Artificial Intelligence) and Distributed Systems

Department : Computer Science

Verified : Yes

Year : 2014

Abstract : Semantic integration of heterogeneous databases is a critical area of interest because of the scale of data and the need to share existing data as technology advances. Schema-level heterogeneity of the relations is the major obstacle to such integration. Although various approaches to schema analysis, transformation and integration have been explored, they are sometimes too general to solve the problem, especially when the data is very high-dimensional and the schema information is unavailable or inadequate. In this paper, a method is proposed to integrate heterogeneous relational schemas at the instance level rather than the schema level. A global schema is designed that integrates the most relevant attributes of the different relational schemas of a particular domain. To find the significant attributes, multiple linear regressions based on the L1 norm and Singular Value Decomposition (SVD) are applied to the data iteratively. This is a variant of L1-PCA, which is an efficient, effective and meaningful method of linear subspace estimation. The most prominent instance-level similarity is found by identifying the most significant attributes of each relational data source and then measuring the similarity among those attributes using the L1 norm. Thus an integrated schema is created that maps the relevant attributes of each local schema to a global schema. © 2014 IEEE.
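
The two stages the abstract describes (pick the significant attributes of each source, then compare them with an L1 distance) can be illustrated with a rough sketch. This is not the paper's algorithm: it swaps in a plain (L2) SVD for the iterative L1 regression, and comparing standardized quantile profiles is an assumption made here so that sources with different row counts can be compared. The function names `significant_attributes`, `quantile_profile` and `match_attributes` are hypothetical.

```python
import numpy as np


def significant_attributes(X, k):
    """Return indices of the k columns with the largest absolute loading
    on the leading right singular vector of the column-centered data.
    Assumption: ordinary SVD is used as a stand-in for L1-PCA."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = np.abs(vt[0])
    return np.argsort(loadings)[::-1][:k]


def quantile_profile(col, n=11):
    """Summarize a column by n quantiles of its standardized values,
    so columns with different scales and row counts are comparable."""
    std = col.std()
    z = (col - col.mean()) / (std if std > 0 else 1.0)
    return np.quantile(z, np.linspace(0.0, 1.0, n))


def match_attributes(A, B, k):
    """Map each significant attribute of source A to the significant
    attribute of source B whose quantile profile is closest in L1 norm."""
    ia = significant_attributes(A, k)
    ib = significant_attributes(B, k)
    mapping = {}
    for i in ia:
        pa = quantile_profile(A[:, i])
        dists = [np.abs(pa - quantile_profile(B[:, j])).sum() for j in ib]
        mapping[int(i)] = int(ib[int(np.argmin(dists))])
    return mapping
```

Here `A` and `B` are numeric instance matrices (rows are tuples, columns are attributes) from two local schemas; the returned dictionary plays the role of a local-to-local attribute mapping from which a global schema could be assembled.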

Cite this Research Publication : Sandhya Harikumar, Reethima, R., and Dr. Kaimal, M. R., “Semantic integration of heterogeneous relational schemas using multiple L1 linear regression and SVD”, in International Conference on Data Science and Engineering, ICDSE 2014, 2014, pp. 105-111
