Dataset-Transformation: Improving Clustering by enhancing the structure with DipScaling and DipTransformation
- Author(s)
- Benjamin Schelling, Claudia Plant
- Abstract
A data set might have a well-defined structure, but this does not necessarily lead to good clustering results. If the structure is hidden in an unfavourable scaling, clustering will usually fail. The aim of this work is to present techniques, DipScaling and DipTransformation, which enhance the data set by re-scaling and transforming its features, thereby emphasizing and accentuating its structure. If the structure is sufficiently clear, clustering algorithms will perform far better. We refer to such techniques as "Dataset-Transformations" and try to provide a mathematical framework for them. To show that our algorithms work well, we have conducted extensive experiments on several real-world data sets, where we improve clustering not only for k-means, which is our main focus, but also for other standard clustering approaches.
- Organisation(s)
- Research Group Data Mining and Machine Learning, Research Network Data Science
- Journal
- Knowledge and Information Systems
- Volume
- 62
- Pages
- 457-484
- No. of pages
- 28
- ISSN
- 0219-1377
- DOI
- https://doi.org/10.1007/s10115-019-01388-5
- Publication date
- 02-2020
- Peer reviewed
- Yes
- Austrian Fields of Science 2012
- 102033 Data mining
- Keywords
- ASJC Scopus subject areas
- Software, Artificial Intelligence, Information Systems, Human-Computer Interaction, Hardware and Architecture
- Portal url
- https://ucris.univie.ac.at/portal/en/publications/datasettransformation-improving-clustering-by-enhancing-the-structure-with-dipscaling-and-diptransformation(a1ac9065-a4a6-4e94-9ea2-0384b0224fdc).html