KMN - Removing Noise from K-Means Clustering Results

Author(s)
Benjamin Schelling, Claudia Plant
Abstract

K-Means is one of the most important data mining techniques for scientists who want to analyze their data. However, K-Means has the disadvantage that it cannot handle noise data points. This paper proposes a technique that can be applied to a K-Means clustering result to exclude noise data points. We refer to it as KMN (short for K-Means with Noise). The technique is compatible with the different strategies for initializing K-Means and for determining the number of clusters. Moreover, it is completely parameter-free. The technique has been tested on artificial and real data sets to demonstrate its performance in comparison with other noise-excluding techniques for K-Means.

Organisation(s)
Research Group Data Mining and Machine Learning, Research Network Data Science
Pages
137-151
No. of pages
15
DOI
https://doi.org/10.1007/978-3-319-98539-8_11
Publication date
09-2018
Peer reviewed
Yes
Austrian Fields of Science 2012
102033 Data mining
ASJC Scopus subject areas
Theoretical Computer Science, Computer Science (all)
Portal url
https://ucris.univie.ac.at/portal/en/publications/kmn--removing-noise-from-kmeans-clustering-results(b991dc27-f8a3-4e6a-8068-7f7c2eed2f58).html