K-Medoid Clustering for Heterogeneous DataSets

Publication Type : Journal Article

Publisher : Elsevier

Source : Procedia Computer Science, 70, 226-237, Elsevier

Url : https://www.sciencedirect.com/science/article/pii/S187705091503241X

Campus : Amritapuri

School : School of Computing

Center : AI (Artificial Intelligence) and Distributed Systems

Year : 2015

Abstract : In recent years, various clustering strategies have been explored for partitioning datasets whose attributes span heterogeneous domains or types, such as categorical, numerical and binary. Clustering algorithms seek to identify homogeneous groups of objects based on the values of their attributes. These algorithms either assume that all attributes are of a homogeneous type or convert them into one. However, datasets with heterogeneous data types are common in real-life applications, and converting them can lead to loss of information. This paper proposes a new similarity measure, in the form of a triplet, to find the distance between two data objects with heterogeneous attribute types. A new k-medoid type of clustering algorithm is proposed that leverages this similarity measure in the form of a vector. The proposed algorithm is compared with traditional clustering algorithms using cluster validation based on the Purity Index and the Davies-Bouldin index. Results show that the new clustering algorithm with the new similarity measure outperforms k-means clustering on mixed datasets.
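
The abstract does not spell out the triplet measure or the clustering steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the triplet holds one distance component per attribute type (range-normalized absolute difference for numerical attributes, simple mismatch for categorical and binary ones) and that the components are summed into a scalar. The names `triplet_distance`, `k_medoids` and `purity`, the toy records and the alternating assign/update loop are all illustrative assumptions.

```python
import random
from collections import Counter

def triplet_distance(a, b, num_idx, cat_idx, bin_idx, num_range):
    """Assumed measure: (numerical, categorical, binary) components, each in [0, 1]."""
    d_num = sum(abs(a[i] - b[i]) / (num_range[i] or 1.0) for i in num_idx) / max(len(num_idx), 1)
    d_cat = sum(a[i] != b[i] for i in cat_idx) / max(len(cat_idx), 1)
    d_bin = sum(a[i] != b[i] for i in bin_idx) / max(len(bin_idx), 1)
    return (d_num, d_cat, d_bin)

def assign(points, medoids, dist):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Alternate between assigning points and re-picking each cluster's medoid."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster ends up empty
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the others.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)

def purity(labels, truth):
    """Fraction of points that belong to the majority true class of their cluster."""
    total = 0
    for c in set(labels):
        counts = Counter(t for lab, t in zip(labels, truth) if lab == c)
        total += counts.most_common(1)[0][1]
    return total / len(labels)

# Hypothetical mixed-type records: (numerical, categorical, binary).
data = [(1.0, "red", 0), (1.2, "red", 0), (0.9, "red", 0),
        (9.0, "blue", 1), (9.5, "blue", 1), (8.8, "blue", 1)]
truth = ["a", "a", "a", "b", "b", "b"]
num_idx, cat_idx, bin_idx = [0], [1], [2]
num_range = {i: max(r[i] for r in data) - min(r[i] for r in data) for i in num_idx}

def dist(a, b):
    # Assumption: collapse the triplet to a scalar by summing its components.
    return sum(triplet_distance(a, b, num_idx, cat_idx, bin_idx, num_range))

medoids, labels = k_medoids(data, k=2, dist=dist)
print(medoids, labels, purity(labels, truth))
```

Because medoids are actual data records, the update step needs only pairwise distances and never averages categorical values, which is why a k-medoid scheme fits mixed attribute types more naturally than k-means.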

Cite this Research Publication : S Harikumar, PV Surya, "K-Medoid Clustering for Heterogeneous DataSets," Procedia Computer Science, 70, 226-237, Elsevier, 2015
