A random finite set model for data clustering

The goal of data clustering is to partition data points into groups to optimize a given objective function. While most existing clustering algorithms treat each data point as a vector, in many applications each datum is not a vector but a point pattern or a set of points. Moreover, many existing clust...


Bibliographic Details
Main Authors: Phung, D., Vo, Ba-Ngu
Format: Conference Paper
Published: Institute of Electrical and Electronics Engineers Inc. 2014
Online Access: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6916264&action=search&sortType=&rowsPerPage=&searchField=Search_All&matchBoolean=true&queryText=((A%20random%20finite%20set%20model%20for%20data%20clustering)%20AND%20phung)
http://hdl.handle.net/20.500.11937/25019
_version_ 1848751590437027840
author Phung, D.
Vo, Ba-Ngu
building Curtin Institutional Repository
collection Online Access
description The goal of data clustering is to partition data points into groups to optimize a given objective function. While most existing clustering algorithms treat each data point as a vector, in many applications each datum is not a vector but a point pattern or a set of points. Moreover, many existing clustering methods require the user to specify the number of clusters, which is typically not known in advance. This paper proposes a new class of models for data clustering that addresses set-valued data as well as an unknown number of clusters, using a Dirichlet Process mixture of Poisson random finite sets. We also develop an efficient Markov Chain Monte Carlo posterior inference technique that can learn the number of clusters and mixture parameters automatically from the data. Numerical studies are presented to demonstrate the salient features of this new model, in particular its capacity to discover extremely unbalanced clusters in data.
first_indexed 2025-11-14T07:55:09Z
format Conference Paper
id curtin-20.500.11937-25019
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T07:55:09Z
publishDate 2014
publisher Institute of Electrical and Electronics Engineers Inc.
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-25019 2017-01-30T12:46:19Z http://purl.org/au-research/grants/arc/FT0991854 restricted
title A random finite set model for data clustering
url http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6916264&action=search&sortType=&rowsPerPage=&searchField=Search_All&matchBoolean=true&queryText=((A%20random%20finite%20set%20model%20for%20data%20clustering)%20AND%20phung)
http://hdl.handle.net/20.500.11937/25019