SHADE: Deep Density-based Clustering

Abstract

Detecting arbitrarily shaped clusters in high-dimensional noisy data is challenging for current clustering methods. We introduce SHADE (Structure-preserving High-dimensional Analysis with Density-based Exploration), the first deep clustering algorithm that incorporates density-connectivity into its loss function. Similar to existing deep clustering algorithms, SHADE supports high-dimensional and large data sets with the expressive power of a deep autoencoder. In contrast to most existing deep clustering methods that rely on a centroid-based clustering objective, SHADE incorporates a novel loss function that captures density-connectivity. SHADE thereby learns a representation that enhances the separation of density-connected clusters. SHADE detects a stable clustering and noise points fully automatically without any user input. It outperforms existing methods in clustering quality, especially on data that contain non-Gaussian clusters, such as video data. Moreover, the embedded space of SHADE is suitable for visualization and interpretation of the clustering results as the individual shapes of the clusters are preserved.
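
As a rough illustration of the density-connectivity notion the abstract refers to, the following minimal sketch labels points in the classical DBSCAN style. It is not SHADE's loss function or implementation, which this record does not specify. The function name `density_connected_clusters` and the parameters `eps` and `min_pts` are assumptions made for this example only; SHADE itself is described as requiring no user input.

```python
from collections import deque
from math import dist


def density_connected_clusters(points, eps, min_pts):
    """Label points by density-connectivity (DBSCAN-style); -1 marks noise.

    Illustrative only: eps and min_pts are hypothetical parameters of this
    sketch, not parameters of SHADE.
    """
    n = len(points)
    # Neighborhood of each point within radius eps (a point counts itself).
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points in its neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [-1] * n
    cluster_id = 0
    for start in range(n):
        if not is_core[start] or labels[start] != -1:
            continue
        # Grow the cluster breadth-first; only core points keep expanding it.
        labels[start] = cluster_id
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                if labels[j] == -1:
                    labels[j] = cluster_id
                    if is_core[j]:
                        queue.append(j)
        cluster_id += 1
    return labels


# Two elongated chains and one isolated point.
points = [(0, 0), (1, 0), (2, 0), (3, 0),
          (10, 10), (11, 10), (12, 10),
          (50, 50)]
print(density_connected_clusters(points, eps=1.5, min_pts=2))
# prints [0, 0, 0, 0, 1, 1, 1, -1]
```

The first chain ends up in a single cluster even though its end points are farther apart than `eps`, because they are linked through a sequence of dense neighbors. This is the property that lets density-based methods recover non-Gaussian, arbitrarily shaped clusters where centroid-based objectives tend to fail.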

Authors
  • Beer, Anna
  • Weber, Pascal
  • Miklautz, Lukas
  • Leiber, Collin
  • Durani, Walid
  • Böhm, Christian
  • Plant, Claudia
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
IEEE International Conference on Data Mining (ICDM) 2024
Divisions
Data Mining and Machine Learning
Subjects
Artificial Intelligence
Event Location
Abu Dhabi
Event Type
Conference
Event Dates
9 Dec - 12 Dec 2024
Date
December 2024