DipTransformation: Enhancing the Structure of a Dataset and thereby improving Clustering

Author(s)
Benjamin Schelling, Claudia Plant
Abstract

A data set can have a well-defined structure and still yield poor clustering results. If the structure is hidden by an unfavourable scaling of the features, clustering will usually fail. The aim of this work is to present a technique that enhances the data set by re-scaling and transforming its features, thereby accentuating its structure. Once the structure is sufficiently clear, clustering algorithms perform far better. To show that our algorithm works well, we conducted extensive experiments on several real-world data sets, improving clustering not only for k-means, which is our main focus, but also for other standard clustering algorithms.
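
The effect of an unfavourable scaling on k-means can be illustrated with a small synthetic example. The sketch below is not the DipTransformation algorithm, which this record does not describe in detail. It is a minimal NumPy illustration under stated assumptions: two Gaussian clusters separated along the first feature, a second feature that carries only noise, and a stretch factor of 100 that is known and simply undone. The helper names kmeans and two_cluster_accuracy, the stretch factor, and all data parameters are hypothetical choices made for this example.

```python
import numpy as np


def kmeans(X, k, n_init=10, n_iter=100, seed=0):
    """Plain Lloyd's k-means with random restarts (illustrative helper).

    Returns the labels of the restart with the lowest within-cluster
    sum of squared distances.
    """
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _restart in range(n_init):
        # Start from k distinct data points chosen at random.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _step in range(n_iter):
            # Squared distance from every point to every center.
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Recompute centers; keep the old center if a cluster is empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        inertia = ((X - centers[labels]) ** 2).sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels


def two_cluster_accuracy(true, pred):
    """Agreement between two binary labelings, ignoring label names."""
    agree = np.mean(true == pred)
    return max(agree, 1.0 - agree)


# Assumed synthetic data: the structure lives in feature 0 only.
rng = np.random.default_rng(42)
n = 200
true = np.repeat([0, 1], n)
x = np.concatenate([rng.normal(-2.0, 0.5, n), rng.normal(2.0, 0.5, n)])
y = rng.normal(0.0, 0.5, 2 * n)
X_clear = np.column_stack([x, y])

# Hide the structure by stretching the noise feature (assumed factor).
stretch = np.array([1.0, 100.0])
X_hidden = X_clear * stretch

acc_hidden = two_cluster_accuracy(true, kmeans(X_hidden, 2))
acc_rescaled = two_cluster_accuracy(true, kmeans(X_hidden / stretch, 2))

print(f"k-means accuracy, unfavourable scaling: {acc_hidden:.2f}")
print(f"k-means accuracy, after re-scaling:     {acc_rescaled:.2f}")
```

On the stretched data the noise feature dominates the Euclidean distances, so k-means splits along it and agrees with the true clusters only at about chance level. After the stretch is undone, the same procedure separates the two groups. In practice the right scaling is not known in advance, which is the gap the technique described in the abstract aims to close.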

Organisation(s)
Research Group Data Mining and Machine Learning, Research Network Data Science
Pages
407-416
DOI
https://doi.org/10.1109/ICDM.2018.00056
Publication date
11-2018
Peer reviewed
Yes
Austrian Fields of Science 2012
102033 Data mining
Portal url
https://ucris.univie.ac.at/portal/en/publications/diptransformation-enhancing-the-structure-of-a-dataset-and-thereby-improving-clustering(5d37a110-7297-4148-af65-fd7892c5441b).html