Graph-based clustering with DRepStream

© 2017 ACM. Finding and setting input parameters for clustering algorithms is challenging due to the unsupervised nature of clustering. The accuracy of clustering algorithms can be greatly affected by how well the parameters are set for the dataset; however, without ground truth labels and e...

Bibliographic Details
Main Authors: Callister, R., Lazarescu, Mihai, Pham, DucSon
Format: Conference Paper
Published: 2017
Online Access: http://hdl.handle.net/20.500.11937/58258
_version_ 1848760214726115328
author Callister, R.
Lazarescu, Mihai
Pham, DucSon
author_facet Callister, R.
Lazarescu, Mihai
Pham, DucSon
author_sort Callister, R.
building Curtin Institutional Repository
collection Online Access
description © 2017 ACM. Finding and setting input parameters for clustering algorithms is challenging due to the unsupervised nature of clustering. The accuracy of clustering algorithms can be greatly affected by how well the parameters are set for the dataset; however, without ground truth labels and external validation, it can be impossible to know when the parameters are set well. In this paper we propose the DRepStream algorithm, which extends the RepStream algorithm. DRepStream uses a graph-based approach and, unlike its predecessor, does not require the primary K parameter used in K-nearest neighbour graphs. Our algorithm automatically computes the number of outgoing edges for each vertex in the graph using a computed metric known as the anomalous edge score. We evaluate the performance of our algorithm against previous stream clustering algorithms on real-world benchmark datasets.
first_indexed 2025-11-14T10:12:13Z
format Conference Paper
id curtin-20.500.11937-58258
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T10:12:13Z
publishDate 2017
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-58258 2017-11-24T05:46:56Z Graph-based clustering with DRepStream Callister, R. Lazarescu, Mihai Pham, DucSon © 2017 ACM. Finding and setting input parameters for clustering algorithms is challenging due to the unsupervised nature of clustering. The accuracy of clustering algorithms can be greatly affected by how well the parameters are set for the dataset; however, without ground truth labels and external validation, it can be impossible to know when the parameters are set well. In this paper we propose the DRepStream algorithm, which extends the RepStream algorithm. DRepStream uses a graph-based approach and, unlike its predecessor, does not require the primary K parameter used in K-nearest neighbour graphs. Our algorithm automatically computes the number of outgoing edges for each vertex in the graph using a computed metric known as the anomalous edge score. We evaluate the performance of our algorithm against previous stream clustering algorithms on real-world benchmark datasets. 2017 Conference Paper http://hdl.handle.net/20.500.11937/58258 10.1145/3019612.3019672 restricted
spellingShingle Callister, R.
Lazarescu, Mihai
Pham, DucSon
Graph-based clustering with DRepStream
title Graph-based clustering with DRepStream
title_full Graph-based clustering with DRepStream
title_fullStr Graph-based clustering with DRepStream
title_full_unstemmed Graph-based clustering with DRepStream
title_short Graph-based clustering with DRepStream
title_sort graph-based clustering with drepstream
url http://hdl.handle.net/20.500.11937/58258