Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015

Preterm birth is a global public health problem with a significant burden on the individuals affected. The study aimed to extend current research on preterm birth prognostic model development by developing and internally validating models using machine learning classification algorithms and populati...

Bibliographic Details
Main Authors: Wong, Kingsley, Tessema, Gizachew, Chai, Kevin, Pereira, Gavin
Format: Journal Article
Language: English
Published: NATURE PORTFOLIO 2022
Subjects:
Online Access: http://purl.org/au-research/grants/nhmrc/1099655
http://hdl.handle.net/20.500.11937/93230
_version_ 1848765713822515200
author Wong, Kingsley
Tessema, Gizachew
Chai, Kevin
Pereira, Gavin
author_facet Wong, Kingsley
Tessema, Gizachew
Chai, Kevin
Pereira, Gavin
author_sort Wong, Kingsley
building Curtin Institutional Repository
collection Online Access
description Preterm birth is a global public health problem with a significant burden on the individuals affected. The study aimed to extend current research on preterm birth prognostic model development by developing and internally validating models using machine learning classification algorithms and population-based, routinely collected data in Western Australia. The longitudinal retrospective cohort study involved all births in Western Australia between 1980 and 2015, and the analytic sample contained 81,974 (8.6%) preterm births (< 37 weeks of gestation). Prediction models for preterm birth were developed using regularised logistic regression, decision trees, Random Forests, extreme gradient boosting, and multi-layer perceptron (MLP). Predictors included maternal socio-demographics and medical conditions, current and past pregnancy complications, and family history. Class weights were applied to handle imbalanced outcomes, and stratified tenfold cross-validation was used to reduce overfitting. Close to half of the preterm births (49.1% at a 5% false-positive rate (FPR), 95% CI 48.9%, 49.5%) were correctly classified by the best-performing classifier (MLP) for all women when current pregnancy information was available. The sensitivity rose to 52.7% (95% CI 52.1%, 53.3%) after including past obstetric history in a sub-population of births from multiparous women. Around half of preterm births can be identified antenatally at high specificity using population-based, routinely collected maternal and pregnancy data. The performance of the prediction models depends on the available predictor pool, which is individual- and time-specific.
first_indexed 2025-11-14T11:39:38Z
format Journal Article
id curtin-20.500.11937-93230
institution Curtin University Malaysia
institution_category Local University
language English
last_indexed 2025-11-14T11:39:38Z
publishDate 2022
publisher NATURE PORTFOLIO
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-93230 2023-10-09T08:13:03Z Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015 Wong, Kingsley Tessema, Gizachew Chai, Kevin Pereira, Gavin Science & Technology Multidisciplinary Sciences Science & Technology - Other Topics HIGH-RISK OUTCOMES Pregnancy Infant, Newborn Humans Female Premature Birth Western Australia Retrospective Studies Prognosis Risk Factors Cohort Studies Machine Learning Humans Premature Birth Prognosis Risk Factors Retrospective Studies Cohort Studies Pregnancy Infant, Newborn Western Australia Female Machine Learning Preterm birth is a global public health problem with a significant burden on the individuals affected. The study aimed to extend current research on preterm birth prognostic model development by developing and internally validating models using machine learning classification algorithms and population-based, routinely collected data in Western Australia. The longitudinal retrospective cohort study involved all births in Western Australia between 1980 and 2015, and the analytic sample contained 81,974 (8.6%) preterm births (< 37 weeks of gestation). Prediction models for preterm birth were developed using regularised logistic regression, decision trees, Random Forests, extreme gradient boosting, and multi-layer perceptron (MLP). Predictors included maternal socio-demographics and medical conditions, current and past pregnancy complications, and family history. Class weights were applied to handle imbalanced outcomes, and stratified tenfold cross-validation was used to reduce overfitting. Close to half of the preterm births (49.1% at a 5% false-positive rate (FPR), 95% CI 48.9%, 49.5%) were correctly classified by the best-performing classifier (MLP) for all women when current pregnancy information was available.
The sensitivity rose to 52.7% (95% CI 52.1%, 53.3%) after including past obstetric history in a sub-population of births from multiparous women. Around half of preterm births can be identified antenatally at high specificity using population-based, routinely collected maternal and pregnancy data. The performance of the prediction models depends on the available predictor pool, which is individual- and time-specific. 2022 Journal Article http://hdl.handle.net/20.500.11937/93230 10.1038/s41598-022-23782-w English http://purl.org/au-research/grants/nhmrc/1099655 http://purl.org/au-research/grants/nhmrc/1173991 http://purl.org/au-research/grants/nhmrc/1195716 http://creativecommons.org/licenses/by/4.0/ NATURE PORTFOLIO fulltext
spellingShingle Science & Technology
Multidisciplinary Sciences
Science & Technology - Other Topics
HIGH-RISK
OUTCOMES
Pregnancy
Infant, Newborn
Humans
Female
Premature Birth
Western Australia
Retrospective Studies
Prognosis
Risk Factors
Cohort Studies
Machine Learning
Wong, Kingsley
Tessema, Gizachew
Chai, Kevin
Pereira, Gavin
Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title_full Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title_fullStr Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title_full_unstemmed Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title_short Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015
title_sort development of prognostic model for preterm birth using machine learning in a population-based cohort of western australia births between 1980 and 2015
topic Science & Technology
Multidisciplinary Sciences
Science & Technology - Other Topics
HIGH-RISK
OUTCOMES
Pregnancy
Infant, Newborn
Humans
Female
Premature Birth
Western Australia
Retrospective Studies
Prognosis
Risk Factors
Cohort Studies
Machine Learning
url http://purl.org/au-research/grants/nhmrc/1099655
http://hdl.handle.net/20.500.11937/93230