ITGC: Information-theoretic Grid-based Clustering

Author(s)
Sahar Behzadi Soheil, Hermann Hinterhauser, Claudia Plant
Abstract

Grid-based clustering algorithms are known for their efficiency, in particular their fast processing time. Density-based methods, on the other hand, are usually the best option for data sets containing arbitrarily shaped clusters. A combination of grid-based and density-based methods that retains the advantages of both is therefore attractive. However, most algorithms in these categories require several parameters to be specified, and setting them appropriately is usually not trivial. We therefore propose an Information-Theoretic Grid-based Clustering (ITGC) algorithm that regards clustering as a data compression problem: neighbouring grid cells (clusters) are merged whenever the merge pays off in terms of compression cost. Our extensive experiments on synthetic and real-world data show the advantages of ITGC over well-known clustering algorithms.
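
The merge rule described in the abstract can be illustrated with a minimal sketch. This is not the ITGC coding scheme, which the abstract does not spell out. It assumes a simple histogram-style cost in which each point pays for its cluster ID and for the cell it falls into within that cluster, and each cluster pays a fixed number of model bits. The function names, the `model_bits` parameter and the cost formula are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def grid_cells(points, cell_size):
    """Count 2D points per grid cell; cells are keyed by integer indices."""
    counts = defaultdict(int)
    for x, y in points:
        counts[(math.floor(x / cell_size), math.floor(y / cell_size))] += 1
    return dict(counts)


def cluster_cost(n_points, n_cells, total_points, model_bits):
    """Assumed coding cost in bits (illustrative, not the paper's formula).

    Each point pays log2(total/n_points) bits for its cluster ID and
    log2(n_cells) bits for the cell inside the cluster; each cluster
    additionally pays a fixed model cost.
    """
    per_point = math.log2(total_points / n_points) + math.log2(n_cells)
    return n_points * per_point + model_bits


def merge_clusters(points, cell_size, model_bits=8.0):
    """Greedily merge neighbouring clusters while the total cost decreases."""
    counts = grid_cells(points, cell_size)
    total = len(points)
    # Every occupied cell starts as its own cluster.
    label = {cell: i for i, cell in enumerate(sorted(counts))}
    clusters = {i: {"cells": {cell}, "n": counts[cell]}
                for cell, i in label.items()}

    while True:
        best = None  # (gain in bits, cluster a, cluster b)
        for cell, a in label.items():
            # Looking right and up visits every adjacent pair of cells once.
            for dx, dy in ((1, 0), (0, 1)):
                b = label.get((cell[0] + dx, cell[1] + dy))
                if b is None or b == a:
                    continue
                ca, cb = clusters[a], clusters[b]
                before = (
                    cluster_cost(ca["n"], len(ca["cells"]), total, model_bits)
                    + cluster_cost(cb["n"], len(cb["cells"]), total, model_bits)
                )
                after = cluster_cost(
                    ca["n"] + cb["n"],
                    len(ca["cells"]) + len(cb["cells"]),
                    total,
                    model_bits,
                )
                gain = before - after
                if gain > 1e-9 and (best is None or gain > best[0]):
                    best = (gain, a, b)
        if best is None:
            return clusters, label
        # Apply the single most profitable merge, then re-evaluate.
        _, a, b = best
        for cell in clusters[b]["cells"]:
            label[cell] = a
        clusters[a]["cells"] |= clusters[b]["cells"]
        clusters[a]["n"] += clusters[b]["n"]
        del clusters[b]


# Two blobs of equal density, separated by empty cells.
points = ([(0.5, 0.5)] * 10 + [(1.5, 0.5)] * 10
          + [(5.5, 5.5)] * 10 + [(6.5, 5.5)] * 10)
clusters, label = merge_clusters(points, cell_size=1.0)
print(len(clusters))  # → 2
```

Under this assumed cost, merging two neighbouring clusters of equal density leaves the data cost unchanged and saves one cluster's model bits, so the merge pays off. Merging clusters of very different density inflates the per-point cost and is rejected unless the model saving outweighs it. The only inputs in this sketch are the cell size and the model cost; how ITGC itself handles such quantities is not stated in the abstract.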

Organisation(s)
Research Group Data Mining and Machine Learning, Research Network Data Science
External organisation(s)
Universität Wien
Pages
618-621
No. of pages
4
DOI
https://doi.org/10.5441/002/edbt.2019.70
Publication date
2019
Peer reviewed
Yes
Austrian Fields of Science 2012
102033 Data mining
ASJC Scopus subject areas
Software, Information Systems, Computer Science Applications
Portal url
https://ucris.univie.ac.at/portal/en/publications/itgc-informationtheoretic-gridbased-clustering(75a97b0e-223c-4d13-ae36-437bf22a9a9e).html