As data from diverse sources continues to grow, heterogeneous data storage is becoming more common. In such an environment, characterized by large, independent, diverse, and dynamic information sources, access to relevant information is increasingly complicated. A unified data model, built by semantically integrating knowledge extracted from the varied data sources, can address this problem. As distributed systems and the data explosion permeate almost all domains, information retrieval seeks to uncover hidden, intricate knowledge in these sources. However, retrieval from multiple sources is challenging because of the semantic, syntactic, and structural heterogeneity of the data. Intelligently aggregating the individual sources and presenting a unified view to stakeholders can therefore ease query processing for semantic retrieval of data. State-of-the-art solutions tackle data and/or schematic heterogeneity, but they are not suitable for systems with model heterogeneity, such as different types of DBMS, that require the derivation of summaries.
This project aims to develop an integrated view of medical data derived from heterogeneous data sources. Three types of data sources are identified: department-wise patient data (structured), clinical notes and discharge summaries (unstructured), and patient scanning reports (image plus text). Knowledge is extracted from each source using different machine learning techniques, and the extracted knowledge is semantically integrated to give a unified view of the underlying data sources. The unified view follows an object-relational model consisting of relevant features and unique tuples from the structured sources, a summary of the clinical notes, and scan images with their corresponding keywords.
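To make the shape of such a unified view concrete, the following is a minimal Python sketch, not the project's actual implementation. It assumes the knowledge extraction step has already run, so each source arrives as per-patient features, a note summary, or a scan with keywords. All names here (`ScanReport`, `UnifiedPatientView`, `integrate`, and the `patient_id` key used to join the sources) are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScanReport:
    """One scan image with the keywords extracted from its report text."""
    image_path: str            # hypothetical location of the image file
    keywords: tuple[str, ...]


@dataclass
class UnifiedPatientView:
    """Object-style record combining the three source types for one patient."""
    patient_id: str
    structured: dict[str, str] = field(default_factory=dict)  # relevant features
    note_summary: str = ""                                     # summary of clinical notes
    scans: list[ScanReport] = field(default_factory=list)      # images with keywords


def integrate(structured_rows, note_summaries, scan_reports):
    """Merge already-extracted knowledge into one view per patient.

    structured_rows: iterable of (patient_id, feature_dict) pairs
    note_summaries:  dict mapping patient_id to summary text
    scan_reports:    iterable of (patient_id, ScanReport) pairs
    """
    views: dict[str, UnifiedPatientView] = {}

    def view_for(patient_id: str) -> UnifiedPatientView:
        # Create the patient's view on first sight, reuse it afterwards.
        return views.setdefault(patient_id, UnifiedPatientView(patient_id))

    for patient_id, features in structured_rows:
        # Repeated rows for the same patient collapse into one set of features.
        view_for(patient_id).structured.update(features)

    for patient_id, summary in note_summaries.items():
        view_for(patient_id).note_summary = summary

    for patient_id, report in scan_reports:
        view = view_for(patient_id)
        if report not in view.scans:  # skip exact duplicate scan entries
            view.scans.append(report)

    return views
```

Joining on a shared patient identifier is an assumption of this sketch; real sources may need record linkage before they can be merged this way.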