Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection

Background: Artificial intelligence (AI) has been proposed to reduce false-positive screens, increase cancer detection rates (CDRs), and address resourcing challenges faced by breast screening programs. We compared the accuracy of AI versus radiologists in real-world population breast cancer screeni...

Bibliographic Details
Main Authors: Marinovich, Luke, Wylie, Elizabeth, Lotter, William, Lund, Helen, Waddell, Andrew, Madeley, Carolyn, Pereira, Gavin, Houssami, Nehmat
Format: Journal Article
Language: English
Published: Elsevier 2023
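Subjects: Artificial intelligence; Breast neoplasms; Diagnostic screening programs; Sensitivity and specificity; Humans; Female; Retrospective Studies; Prospective Studies; Cohort Studies; Mass Screening; Early Detection of Cancer; Mammography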
Online Access: http://purl.org/au-research/grants/nhmrc/1099655
http://hdl.handle.net/20.500.11937/93238
_version_ 1848765716081147904
author Marinovich, Luke
Wylie, Elizabeth
Lotter, William
Lund, Helen
Waddell, Andrew
Madeley, Carolyn
Pereira, Gavin
Houssami, Nehmat
author_facet Marinovich, Luke
Wylie, Elizabeth
Lotter, William
Lund, Helen
Waddell, Andrew
Madeley, Carolyn
Pereira, Gavin
Houssami, Nehmat
author_sort Marinovich, Luke
building Curtin Institutional Repository
collection Online Access
description Background: Artificial intelligence (AI) has been proposed to reduce false-positive screens, increase cancer detection rates (CDRs), and address resourcing challenges faced by breast screening programs. We compared the accuracy of AI versus radiologists in real-world population breast cancer screening, and estimated potential impacts on CDR, recall and workload for simulated AI-radiologist reading. Methods: External validation of a commercially-available AI algorithm in a retrospective cohort of 108,970 consecutive mammograms from a population-based screening program, with ascertained outcomes (including interval cancers by registry linkage). Area under the ROC curve (AUC), sensitivity and specificity for AI were compared with radiologists who interpreted the screens in practice. CDR and recall were estimated for simulated AI-radiologist reading (with arbitration) and compared with program metrics. Findings: The AUC for AI was 0.83 compared with 0.93 for radiologists. At a prospective threshold, sensitivity for AI (0.67; 95% CI: 0.64–0.70) was comparable to radiologists (0.68; 95% CI: 0.66–0.71) with lower specificity (0.81 [95% CI: 0.81–0.81] versus 0.97 [95% CI: 0.97–0.97]). Recall rate for AI-radiologist reading (3.14%) was significantly lower than for the BSWA program (3.38%) (−0.25%; 95% CI: −0.31 to −0.18; P < 0.001). CDR was also lower (6.37 versus 6.97 per 1000) (−0.61; 95% CI: −0.77 to −0.44; P < 0.001); however, AI detected interval cancers that were not found by radiologists (0.72 per 1000; 95% CI: 0.57–0.90). AI-radiologist reading increased arbitration but decreased overall screen-reading volume by 41.4% (95% CI: 41.2–41.6). Interpretation: Replacement of one radiologist by AI (with arbitration) resulted in lower recall and overall screen-reading volume. There was a small reduction in CDR for AI-radiologist reading. 
AI detected interval cases that were not identified by radiologists, suggesting potentially higher CDR if radiologists were unblinded to AI findings. These results indicate AI's potential role as a screen-reader of mammograms, but prospective trials are required to determine whether CDR could improve if AI detection was actioned in double-reading with arbitration. Funding: National Breast Cancer Foundation (NBCF), National Health and Medical Research Council (NHMRC).
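The differences quoted in the Findings (−0.25% for recall, −0.61 per 1000 for CDR) do not exactly match what the rounded rates give when subtracted. A minimal Python sketch of that arithmetic follows; it uses only the rounded figures printed in the abstract, and the variable and function names are illustrative, not taken from the study. The small discrepancies are presumably because the published differences were computed from unrounded rates, which is an assumption and is not stated in the record.

```python
# Rounded figures as printed in the abstract (not the underlying study data).
recall_ai_radiologist = 3.14  # recall rate, percent of screens
recall_program = 3.38         # recall rate, percent of screens
cdr_ai_radiologist = 6.37     # cancers detected per 1000 screens
cdr_program = 6.97            # cancers detected per 1000 screens


def difference(new: float, baseline: float) -> float:
    """Absolute difference (new minus baseline), rounded to two decimals."""
    return round(new - baseline, 2)


recall_diff = difference(recall_ai_radiologist, recall_program)
cdr_diff = difference(cdr_ai_radiologist, cdr_program)

# The abstract reports -0.25 and -0.61; the rounded inputs give slightly
# different values, consistent with rounding of the published rates.
print(f"Recall difference from rounded rates: {recall_diff} percentage points")
print(f"CDR difference from rounded rates: {cdr_diff} per 1000 screens")
```

Both recomputed differences fall inside the confidence intervals given in the abstract (−0.31 to −0.18 for recall; −0.77 to −0.44 for CDR).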
first_indexed 2025-11-14T11:39:40Z
format Journal Article
id curtin-20.500.11937-93238
institution Curtin University Malaysia
institution_category Local University
language eng
last_indexed 2025-11-14T11:39:40Z
publishDate 2023
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-93238 2023-10-10T06:36:48Z 2023 Journal Article http://hdl.handle.net/20.500.11937/93238 10.1016/j.ebiom.2023.104498 eng http://purl.org/au-research/grants/nhmrc/1099655 http://purl.org/au-research/grants/nhmrc/1173991 http://purl.org/au-research/grants/nhmrc/1194410 http://creativecommons.org/licenses/by-nc-nd/4.0/ Elsevier fulltext
spellingShingle Artificial intelligence
Breast neoplasms
Diagnostic screening programs
Sensitivity and specificity
Humans
Female
Breast Neoplasms
Artificial Intelligence
Retrospective Studies
Prospective Studies
Cohort Studies
Mass Screening
Early Detection of Cancer
Mammography
Marinovich, Luke
Wylie, Elizabeth
Lotter, William
Lund, Helen
Waddell, Andrew
Madeley, Carolyn
Pereira, Gavin
Houssami, Nehmat
Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title_full Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title_fullStr Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title_full_unstemmed Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title_short Artificial intelligence (AI) for breast cancer screening: BreastScreen population-based cohort study of cancer detection
title_sort artificial intelligence (ai) for breast cancer screening: breastscreen population-based cohort study of cancer detection
topic Artificial intelligence
Breast neoplasms
Diagnostic screening programs
Sensitivity and specificity
Humans
Female
Breast Neoplasms
Artificial Intelligence
Retrospective Studies
Prospective Studies
Cohort Studies
Mass Screening
Early Detection of Cancer
Mammography
url http://purl.org/au-research/grants/nhmrc/1099655
http://purl.org/au-research/grants/nhmrc/1173991
http://purl.org/au-research/grants/nhmrc/1194410
http://hdl.handle.net/20.500.11937/93238