Details (Don't) Matter: Isolating Cluster Information in Deep Embedded Spaces

Author(s)
Lukas Miklautz, Lena Bauer, Dominik Mautz, Sebastian Tschiatschek, Christian Böhm, Claudia Plant
Abstract

Deep clustering techniques combine representation learning with clustering objectives to improve clustering performance. Among existing deep clustering techniques, autoencoder-based methods are the most prevalent. While they achieve promising clustering results, they suffer from an inherent conflict between preserving details, as expressed by the reconstruction loss, and finding similar groups by ignoring details, as expressed by the clustering loss. This conflict leads to brittle training procedures, dependence on trade-off hyperparameters, and less interpretable results. We propose ACe/DeC, a framework that is compatible with Autoencoder Centroid based Deep Clustering methods and automatically learns a latent representation consisting of two separate spaces. The clustering space captures all cluster-specific information, and the shared space explains general variation in the data. This separation resolves the above-mentioned conflict and allows our method to learn both detailed reconstructions and cluster-specific abstractions. We evaluate our framework with extensive experiments that show several benefits: (1) clustering performance: on various data sets we outperform relevant baselines; (2) no hyperparameter tuning: this improved performance is achieved without introducing new clustering-specific hyperparameters; (3) interpretability: isolating the cluster-specific information in a separate space is advantageous for data exploration and for interpreting the clustering results; and (4) dimensionality of the embedded space: we automatically learn a low-dimensional space for clustering. Our ACe/DeC framework isolates cluster information and increases stability and interpretability, while improving clustering performance.
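
The idea of a latent representation with two separate spaces can be illustrated with a small NumPy sketch. This is not the ACe/DeC implementation: the abstract does not specify how the split is learned, so the sketch simply assumes a fixed, hand-picked set of clustering dimensions. The helper names `split_latent` and `centroid_loss`, the toy data, and the fixed centroids are all hypothetical. It only shows that a centroid-based clustering loss can be computed on the clustering space alone, while the remaining dimensions stay available for reconstruction.

```python
import numpy as np

def split_latent(z, cluster_dims):
    """Split latent codes by column into (clustering space, shared space).

    Assumption: the split is a fixed index set chosen by hand here;
    the paper's framework learns the separation automatically.
    """
    mask = np.zeros(z.shape[1], dtype=bool)
    mask[cluster_dims] = True
    return z[:, mask], z[:, ~mask]

def centroid_loss(z_cluster, centroids):
    """Mean squared distance to the nearest centroid, plus hard assignments."""
    # Pairwise squared distances, shape (n_points, n_centroids).
    d2 = ((z_cluster[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean(), d2.argmin(axis=1)

rng = np.random.default_rng(0)

# Toy latent codes: dimension 0 carries two clusters,
# dimensions 1 and 2 carry variation shared by all points.
labels_true = np.repeat([0, 1], 50)
z = rng.normal(size=(100, 3))
z[:, 0] = np.where(labels_true == 0, -5.0, 5.0) + 0.1 * rng.normal(size=100)

z_cluster, z_shared = split_latent(z, cluster_dims=[0])

# Hypothetical fixed centroids living only in the clustering space.
centroids = np.array([[-5.0], [5.0]])
loss, assignments = centroid_loss(z_cluster, centroids)

print(z_cluster.shape, z_shared.shape)
```

In this toy setting the clustering loss never sees the shared dimensions, so detail that is irrelevant to cluster membership does not pull points away from their centroids.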

Organisation(s)
Research Group Data Mining and Machine Learning, Research Network Data Science
External organisation(s)
Ludwig-Maximilians-Universität München, Munich Center for Machine Learning (MCML)
Pages
2826-2832
DOI
https://doi.org/10.24963/ijcai.2021/389
Publication date
2021
Peer reviewed
Yes
Austrian Fields of Science 2012
102033 Data mining
Portal url
https://ucris.univie.ac.at/portal/en/publications/details-dont-matter-isolating-cluster-information-in-deep-embedded-spaces(94d1ff00-f7f4-456e-b920-050f34145dc4).html