Machine learning and natural language processing to identify falls in electronic patient care records from ambulance attendances

We derived machine learning models utilizing features generated by natural language processing (NLP) of free-text data from an ambulance services provider to identify fall cases. The data comprised samples of electronic patient care records (ePCRs) from St John Western Australia (WA), t...

Bibliographic Details
Main Authors: Tohira, Hideo, Finn, Judith, Ball, Stephen, Brink, D., Buzzacott, Peter
Format: Journal Article
Language: English
Published: 2021
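Subjects: Emergency medical services; random forest; support vector machine; text frequency-inverse document frequency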
Online Access: http://hdl.handle.net/20.500.11937/87826
_version_ 1848764939615862784
author Tohira, Hideo
Finn, Judith
Ball, Stephen
Brink, D.
Buzzacott, Peter
building Curtin Institutional Repository
collection Online Access
description We derived machine learning models utilizing features generated by natural language processing (NLP) of free-text data from an ambulance services provider to identify fall cases. The data comprised samples of electronic patient care records (ePCRs) from St John Western Australia (WA), the sole ambulance services provider in most of WA. We manually labeled fall cases by reviewing the free-text summary. The models used features including case characteristics (e.g., age) and the text frequency-inverse document frequency (tf-idf) of each word of the free text, generated by NLP. Support vector machine (SVM) and random forest were used as classifiers. We compared the performance of the models against the manual identification of falls by recall, precision, and F-measure. A total of 9,447 cases (1%) were randomly sampled, of which 1,648 (17%) were labeled as falls. The best model was an SVM model using case characteristics and the tf-idf values of the first 100 words of free text, with recall of 0.84, precision of 0.86, and F-measure of 0.85. This performance was better than that of an SVM model with only case characteristics. Machine learning models that incorporated features generated by NLP classified fall cases better than models without such features. Scope remains for further improvement.
first_indexed 2025-11-14T11:27:19Z
format Journal Article
id curtin-20.500.11937-87826
institution Curtin University Malaysia
institution_category Local University
language eng
last_indexed 2025-11-14T11:27:19Z
publishDate 2021
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-87826 2022-03-04T03:52:08Z 2021 Journal Article http://hdl.handle.net/20.500.11937/87826 10.1080/17538157.2021.2019038 eng restricted
title Machine learning and natural language processing to identify falls in electronic patient care records from ambulance attendances
topic Emergency medical services
random forest
support vector machine
text frequency-inverse document frequency
url http://hdl.handle.net/20.500.11937/87826
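
The description above mentions tf-idf features (the abbreviation is usually expanded as term frequency-inverse document frequency) and evaluation by recall, precision, and F-measure. The following is a minimal sketch of both ideas in plain Python, not the authors' code. The record does not state the tokenizer or the tf-idf variant used, so whitespace tokenization and idf = log(N / df) are assumptions, and the example notes and the function names `tf_idf`, `precision_recall_f`, and `f_measure` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(y_true, y_pred):
    """Compare predicted labels with manual labels (1 = fall, 0 = not a fall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical free-text notes, invented for illustration only.
notes = ["patient fell at home", "chest pain at rest", "fell from ladder"]
weights = tf_idf(notes)

# The figures reported in the description are mutually consistent:
reported_f = round(f_measure(0.86, 0.84), 2)  # 0.85
```

In this toy corpus a word that appears in only one note, such as "patient", receives a higher weight than "fell", which appears in two. In the study, weights of this kind were combined with case characteristics such as age as classifier inputs.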