A methodology for automatic classification of breast cancer immunohistochemical data using semi-supervised fuzzy c-means

Previously, a semi-manual method was used to identify six novel and clinically useful classes in the Nottingham Tenovus Breast Cancer dataset. 663 out of 1,076 patients were classified. The objectives of our work are threefold. Firstly, our primary objective is to use one single automatic method (p...

Bibliographic Details
Main Authors: Lai, Daphne Teck Ching, Garibaldi, Jonathan M., Soria, Daniele, Roadknight, Christopher M.
Format: Article
Published: Springer Verlag 2014
Subjects: Breast cancer, Fuzzy clustering, Molecular classification
Online Access: https://eprints.nottingham.ac.uk/28154/
building Nottingham Research Data Repository
collection Online Access
description Previously, a semi-manual method was used to identify six novel and clinically useful classes in the Nottingham Tenovus Breast Cancer dataset. 663 out of 1,076 patients were classified. The objectives of our work are threefold. Firstly, our primary objective is to use one single automatic method (post-initialisation) to reproduce the six classes for the 663 patients and to classify the remaining 413 patients. Secondly, we explore using semi-supervised fuzzy c-means (ssFCM) with various distance metrics and initialisation techniques to achieve this. Thirdly, the clinical characteristics of the 413 patients are examined by comparison with those of the 663 patients. Our experiments use various amounts of labelled data and 10-fold cross-validation to reproduce and evaluate the classification. ssFCM with Euclidean distance and the initialisation technique of Katsavounidis et al. produced the best results. This configuration is then used to classify the 413 patients. Visual evaluation of the 413 patients’ classifications revealed characteristics in common with those previously reported. Examination of clinical characteristics indicates significant associations between classification and clinical parameters. More importantly, an association between classification and survival is shown by the survival curves.
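The best-performing configuration named in the description (ssFCM with Euclidean distance, initialised with the technique of Katsavounidis et al.) can be illustrated with a minimal sketch. It assumes one common ssFCM formulation (the Pedrycz and Waletzky objective with fuzzifier m = 2); whether this matches the authors' exact variant is an assumption. The function names `kkz_init`, `align_to_labels` and `ssfcm`, the weight `alpha`, and the step that matches centres to labelled classes are all hypothetical and introduced only for illustration.

```python
import numpy as np

def kkz_init(X, c):
    """Choose c initial centres with the Katsavounidis et al. (KKZ) heuristic."""
    # First centre: the point with the largest Euclidean norm.
    first = X[np.argmax(np.linalg.norm(X, axis=1))]
    centres = [first]
    # Distance from every point to its nearest centre chosen so far.
    min_dist = np.linalg.norm(X - first, axis=1)
    for _ in range(1, c):
        nxt = X[np.argmax(min_dist)]  # farthest point from the chosen centres
        centres.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - nxt, axis=1))
    return np.array(centres)

def align_to_labels(V, X, labels):
    """Reorder centres so centre i is the unused centre nearest the labelled
    mean of class i. Illustrative assumption, not taken from the paper;
    requires at least one labelled point per class."""
    free = list(range(V.shape[0]))
    order = []
    for i in range(V.shape[0]):
        mean_i = X[labels == i].mean(axis=0)
        j = min(free, key=lambda k: np.linalg.norm(V[k] - mean_i))
        order.append(j)
        free.remove(j)
    return V[order]

def ssfcm(X, c, labels, alpha=1.0, n_iter=100, tol=1e-6):
    """Semi-supervised fuzzy c-means, m = 2, squared Euclidean distance.
    labels holds a class index in [0, c) for labelled points and -1 otherwise."""
    n = X.shape[0]
    F = np.zeros((n, c))                      # supervised memberships
    idx = np.flatnonzero(labels >= 0)
    F[idx, labels[idx]] = 1.0
    has_label = F.sum(axis=1)                 # 1 for labelled rows, else 0
    V = align_to_labels(kkz_init(X, c), X, labels)
    for _ in range(n_iter):
        # Squared distances, shape (n, c); clamp to avoid division by zero.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        inv = 1.0 / np.maximum(d2, 1e-12)
        fcm_u = inv / inv.sum(axis=1, keepdims=True)   # plain FCM memberships
        # Blend the unsupervised memberships with the known labels.
        U = (fcm_u * (1.0 + alpha * (1.0 - has_label))[:, None]
             + alpha * F) / (1.0 + alpha)
        # Centre update derived from the same objective.
        W = U ** 2 + alpha * (U - F) ** 2
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return U, V
```

Each row of the returned membership matrix sums to one, and taking the column with the largest membership gives a hard class for each patient. For unlabelled points the update reduces to ordinary fuzzy c-means, so `alpha` only controls how strongly the labelled points are pulled towards their known classes.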
first_indexed 2025-11-14T19:01:34Z
id nottingham-28154
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T19:01:34Z
recordtype eprints
repository_type Digital Repository
spelling nottingham-28154 2020-05-04T20:13:28Z https://eprints.nottingham.ac.uk/28154/ Springer Verlag 2014-09 Article PeerReviewed Lai, Daphne Teck Ching, Garibaldi, Jonathan M., Soria, Daniele and Roadknight, Christopher M. (2014) A methodology for automatic classification of breast cancer immunohistochemical data using semi-supervised fuzzy c-means. Central European Journal of Operations Research, 22 (3). pp. 475-499. ISSN 1435-246X. Keywords: Breast cancer; Fuzzy clustering; Molecular classification. http://link.springer.com/article/10.1007/s10100-013-0318-3 doi:10.1007/s10100-013-0318-3
topic Breast cancer
Fuzzy clustering
Molecular classification
url https://eprints.nottingham.ac.uk/28154/