Clustering breast cancer data by consensus of different validity indices

Clustering algorithms will, in general, either partition a given data set into a pre-specified number of clusters or will produce a hierarchy of clusters. In this paper we analyse several different clustering techniques and apply them to a particular data set of breast cancer data. When we do not kn...

Bibliographic Details
Main Authors: Soria, Daniele, Garibaldi, Jonathan M., Ambrogi, Federico, Lisboa, Paulo J.G., Boracchi, Patrizia, Biganzoli, Elia M.
Format: Conference or Workshop Item
Published: IET Digital Library 2008
Subjects: Clustering algorithms; Breast cancer; Validity indices
Online Access: https://eprints.nottingham.ac.uk/28148/
building Nottingham Research Data Repository
collection Online Access
description Clustering algorithms will, in general, either partition a given data set into a pre-specified number of clusters or will produce a hierarchy of clusters. In this paper we analyse several different clustering techniques and apply them to a particular data set of breast cancer data. When we do not know a priori which is the best number of groups, we use a range of different validity indices to test the quality of clustering results and to determine the best number of clusters. While for the K-means method there is not absolute agreement among the indices as to which is the best number of clusters, for the PAM algorithm all the indices indicate 4 as the best cluster number.
first_indexed 2025-11-14T19:01:33Z
format Conference or Workshop Item
id nottingham-28148
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T19:01:33Z
publishDate 2008
publisher IET Digital Library
recordtype eprints
repository_type Digital Repository
spelling nottingham-28148 2020-05-04T20:27:56Z
https://eprints.nottingham.ac.uk/28148/
Soria, Daniele, Garibaldi, Jonathan M., Ambrogi, Federico, Lisboa, Paulo J.G., Boracchi, Patrizia and Biganzoli, Elia M. (2008) Clustering breast cancer data by consensus of different validity indices. In: International Conference on Advances in Medical, Signal and Information Processing (4th), 14-16 July 2008, Santa Margherita Ligure, Italy.
PeerReviewed
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4609085&filter%3DAND%28p_IS_Number%3A4609057%29%26rowsPerPage%3D75
title Clustering breast cancer data by consensus of different validity indices
topic Clustering algorithms
Breast cancer
Validity indices
url https://eprints.nottingham.ac.uk/28148/