Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry

Objectives: Using the prediction of cancer outcome as a model, we have tested the hypothesis that through analysing routinely collected digital data contained in an electronic administrative record (EAR), using machine-learning techniques, we could enhance conventional methods in predicting clinical...

Bibliographic Details
Main Authors: Gupta, Sunil, Tran, The Truyen, Luo, W., Phung, D., Kennedy, R., Broad, A., Campbell, D., Kipp, D., Singh, M., Khasraw, M., Matheson, L., Ashley, D., Venkatesh, S.
Format: Journal Article
Published: BMJ Group 2014
Online Access: http://hdl.handle.net/20.500.11937/24886
_version_ 1848751553139179520
author Gupta, Sunil
Tran, The Truyen
Luo, W.
Phung, D.
Kennedy, R.
Broad, A.
Campbell, D.
Kipp, D.
Singh, M.
Khasraw, M.
Matheson, L.
Ashley, D.
Venkatesh, S.
author_facet Gupta, Sunil
Tran, The Truyen
Luo, W.
Phung, D.
Kennedy, R.
Broad, A.
Campbell, D.
Kipp, D.
Singh, M.
Khasraw, M.
Matheson, L.
Ashley, D.
Venkatesh, S.
author_sort Gupta, Sunil
building Curtin Institutional Repository
collection Online Access
description Objectives: Using the prediction of cancer outcome as a model, we have tested the hypothesis that through analysing routinely collected digital data contained in an electronic administrative record (EAR), using machine-learning techniques, we could enhance conventional methods in predicting clinical outcomes. Setting: A regional cancer centre in Australia. Participants: Disease-specific data from a purpose-built cancer registry (Evaluation of Cancer Outcomes (ECO)) from 869 patients were used to predict survival at 6, 12 and 24 months. The model was validated with data from a further 94 patients, and results compared to the assessment of five specialist oncologists. Machine-learning prediction using ECO data was compared with that using EAR and a model combining ECO and EAR data. Primary and secondary outcome measures: Survival prediction accuracy in terms of the area under the receiver operating characteristic curve (AUC). Results: The ECO model yielded AUCs of 0.87 (95% CI 0.848 to 0.890) at 6 months, 0.796 (95% CI 0.774 to 0.823) at 12 months and 0.764 (95% CI 0.737 to 0.789) at 24 months. Each was slightly better than the performance of the clinician panel. The model performed consistently across a range of cancers, including rare cancers. Combining ECO and EAR data yielded better prediction than the ECO-based model (AUCs ranging from 0.757 to 0.997 for 6 months, AUCs from 0.689 to 0.988 for 12 months and AUCs from 0.713 to 0.973 for 24 months). The best prediction was for genitourinary, head and neck, lung, skin, and upper gastrointestinal tumours. Conclusions: Machine learning applied to information from a disease-specific (cancer) database and the EAR can be used to predict clinical outcomes. Importantly, the approach described made use of digital data that is already routinely collected but underexploited by clinical health systems.
first_indexed 2025-11-14T07:54:33Z
format Journal Article
id curtin-20.500.11937-24886
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T07:54:33Z
publishDate 2014
publisher BMJ Group
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-24886 2017-09-13T15:15:01Z Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry Gupta, Sunil Tran, The Truyen Luo, W. Phung, D. Kennedy, R. Broad, A. Campbell, D. Kipp, D. Singh, M. Khasraw, M. Matheson, L. Ashley, D. Venkatesh, S. Objectives: Using the prediction of cancer outcome as a model, we have tested the hypothesis that through analysing routinely collected digital data contained in an electronic administrative record (EAR), using machine-learning techniques, we could enhance conventional methods in predicting clinical outcomes. Setting: A regional cancer centre in Australia. Participants: Disease-specific data from a purpose-built cancer registry (Evaluation of Cancer Outcomes (ECO)) from 869 patients were used to predict survival at 6, 12 and 24 months. The model was validated with data from a further 94 patients, and results compared to the assessment of five specialist oncologists. Machine-learning prediction using ECO data was compared with that using EAR and a model combining ECO and EAR data. Primary and secondary outcome measures: Survival prediction accuracy in terms of the area under the receiver operating characteristic curve (AUC). Results: The ECO model yielded AUCs of 0.87 (95% CI 0.848 to 0.890) at 6 months, 0.796 (95% CI 0.774 to 0.823) at 12 months and 0.764 (95% CI 0.737 to 0.789) at 24 months. Each was slightly better than the performance of the clinician panel. The model performed consistently across a range of cancers, including rare cancers. Combining ECO and EAR data yielded better prediction than the ECO-based model (AUCs ranging from 0.757 to 0.997 for 6 months, AUCs from 0.689 to 0.988 for 12 months and AUCs from 0.713 to 0.973 for 24 months). The best prediction was for genitourinary, head and neck, lung, skin, and upper gastrointestinal tumours. Conclusions: Machine learning applied to information from a disease-specific (cancer) database and the EAR can be used to predict clinical outcomes. Importantly, the approach described made use of digital data that is already routinely collected but underexploited by clinical health systems. 2014 Journal Article http://hdl.handle.net/20.500.11937/24886 10.1136/bmjopen-2013-004007 BMJ Group fulltext
spellingShingle Gupta, Sunil
Tran, The Truyen
Luo, W.
Phung, D.
Kennedy, R.
Broad, A.
Campbell, D.
Kipp, D.
Singh, M.
Khasraw, M.
Matheson, L.
Ashley, D.
Venkatesh, S.
Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title_full Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title_fullStr Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title_full_unstemmed Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title_short Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
title_sort machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry
url http://hdl.handle.net/20.500.11937/24886
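
Note on the outcome measure: the description above reports prediction accuracy as the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals. The record does not say how those intervals were computed, so the following is only a minimal sketch of one common approach, assuming a percentile bootstrap. The function names (auc, bootstrap_ci) and the toy labels and scores are hypothetical illustrations and are not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the fraction of
    (positive, negative) pairs in which the positive case receives
    the higher score, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample has a single class, AUC undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = died within the horizon, 0 = survived;
# scores are made-up predicted risks.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores, n_boot=200)
print(f"AUC = {point:.4f}, 95% CI {lower:.3f} to {upper:.3f}")
```

On the toy data, 15 of the 16 positive-negative pairs are ranked correctly, so the point estimate is 0.9375. With only eight cases the bootstrap interval is wide, which is why intervals like those in the description depend on a cohort of several hundred patients.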