
A depth-based nearest neighbor algorithm for high-dimensional data classification

Publication Type : Journal Article

Publisher : Tubitak Academic Journals

Source : Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 27: No. 6, Article 5. https://doi.org/10.3906/elk-1807-163

Url : https://journals.tubitak.gov.tr/elektrik/vol27/iss6/5/

Keywords : Subspace-clustering, data-depth, information gain, nearest neighbor, classification

Campus : Amritapuri

School : Department of Computer Science and Engineering

Center : AI (Artificial Intelligence) and Distributed Systems

Department : Computer Science

Year : 2019

Abstract : Nearest neighbor algorithms such as k-nearest neighbors (kNN) are fundamental supervised learning techniques that classify a query instance based on the class labels of its neighbors. However, large datasets are often not fully labeled, and the unknown probability distribution of the instances may be uneven. Moreover, kNN suffers from challenges such as the curse of dimensionality, choosing the optimal number of neighbors, and poor scalability on high-dimensional data. To overcome these challenges, we propose an improved approach to classification via a depth representation of subspace clusters formed from high-dimensional data. We offer a consistent and principled approach to dynamically choosing the nearest neighbors for classifying a query point by i) identifying the structures and distributions of the data; ii) extracting relevant features; and iii) deriving an optimum value of k that depends on the structure of the data, which is represented using a data depth function. The resulting classification algorithm uses a depth-based representation of clusters to improve performance in terms of execution time and accuracy. Experiments on real-world datasets show that the proposed approach is at least two orders of magnitude faster on high-dimensional datasets and at least as accurate as traditional kNN.
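Illustration : The abstract does not specify which depth function or subspace-clustering method the paper uses, so the following is only a minimal sketch of the general idea of classifying by data depth, not the authors' algorithm. It assumes Mahalanobis depth, 1 / (1 + squared Mahalanobis distance), treats each labeled class as a single cluster, and assigns the query to the cluster in which it lies deepest. The function names `mahalanobis_depth` and `max_depth_classify` and the ridge value are hypothetical choices made for this example.

```python
import numpy as np


def mahalanobis_depth(x, cluster, ridge=1e-6):
    """Mahalanobis depth of point x with respect to a cluster of points.

    Depth is 1 / (1 + d^2), where d^2 is the squared Mahalanobis distance
    from x to the cluster mean. Values lie in (0, 1]; deeper points are
    closer to the center of the cluster.
    """
    mu = cluster.mean(axis=0)
    cov = np.cov(cluster, rowvar=False)
    # Assumption: a small ridge keeps the covariance matrix invertible.
    cov = cov + ridge * np.eye(cluster.shape[1])
    diff = x - mu
    d2 = diff @ np.linalg.solve(cov, diff)
    return 1.0 / (1.0 + d2)


def max_depth_classify(x, X_train, y_train):
    """Assign x to the class whose cluster gives it the largest depth."""
    labels = np.unique(y_train)
    depths = [mahalanobis_depth(x, X_train[y_train == c]) for c in labels]
    return labels[int(np.argmax(depths))]


if __name__ == "__main__":
    # Synthetic data (not from the paper): two well-separated 2-D clusters.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
                   rng.normal(10.0, 1.0, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    print(max_depth_classify(np.array([0.2, -0.1]), X, y))
```

Unlike kNN, this sketch never searches for individual neighbors: each cluster is summarized by its mean and covariance, so the cost per query depends on the number of clusters rather than the number of training points. That is one plausible reason a depth-based representation can reduce execution time, though the paper's own mechanism may differ.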

Cite this Research Publication : Harikumar, Sandhya; A.S, Akhil; and Kaimal, Ramachandra (2019) "A depth-based nearest neighbor algorithm for high-dimensional data classification," Turkish Journal of Electrical Engineering and Computer Sciences: Vol. 27: No. 6, Article 5. https://doi.org/10.3906/elk-1807-163 Publisher: Tubitak Academic Journals
