A machine learning method to estimate PM 2.5 concentrations across China with remote sensing, meteorological and land use information

Background: Machine learning algorithms have very high predictive ability. However, no study has used machine learning to estimate historical concentrations of PM2.5 (particulate matter with aerodynamic diameter ≤ 2.5 μm) at daily time scale in China at a national level. Objectives: To estimate dai...

Bibliographic Details
Main Authors: Chen, Gongbo, Li, Shanshan, Knibbs, Luke D., Hamm, Nicholas A.S., Cao, Wei, Li, Tiantian, Guo, Jianping, Ren, Hongyan, Abramson, Michael J., Guo, Yuming
Format: Article
Published: Elsevier 2018
Subjects: PM2.5; Aerosol optical depth; Random forests; Machine learning; China
Online Access: https://eprints.nottingham.ac.uk/53028/
author Chen, Gongbo
Li, Shanshan
Knibbs, Luke D.
Hamm, Nicholas A.S.
Cao, Wei
Li, Tiantian
Guo, Jianping
Ren, Hongyan
Abramson, Michael J.
Guo, Yuming
building Nottingham Research Data Repository
collection Online Access
description Background: Machine learning algorithms have very high predictive ability. However, no study has used machine learning to estimate historical concentrations of PM2.5 (particulate matter with aerodynamic diameter ≤ 2.5 μm) at a daily time scale at the national level in China. Objectives: To estimate daily concentrations of PM2.5 across China during 2005–2016. Methods: Daily ground-level PM2.5 data were obtained from 1479 stations across China during 2014–2016. Data on aerosol optical depth (AOD), meteorological conditions and other predictors were downloaded. A random forests model (a non-parametric machine learning algorithm) and two traditional regression models were developed to estimate ground-level PM2.5 concentrations. The best-fit model was then used to estimate daily concentrations of PM2.5 across China at a resolution of 0.1° (≈10 km) during 2005–2016. Results: The daily random forests model showed much higher predictive accuracy than the two traditional regression models, explaining the majority of spatial variability in daily PM2.5 [10-fold cross-validation (CV) R² = 83%, root mean squared prediction error (RMSE) = 28.1 μg/m³]. At the monthly and annual time scales, the explained variability of average PM2.5 increased up to 86% (RMSE = 10.7 μg/m³ and 6.9 μg/m³, respectively). Conclusions: Taking advantage of a novel application of the modeling framework and the most recent ground-level PM2.5 observations, the machine learning method showed higher predictive ability than that reported in previous studies.
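The validation figures quoted in the abstract (10-fold CV R² and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the single predictor named `aod` and all function names are hypothetical, and a one-variable least-squares line stands in for the random forests model so the example needs only the Python standard library.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Stand-in model only; the study itself used random forests.
    """
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=10, seed=0):
    """Return (R2, RMSE) computed on pooled out-of-fold predictions."""
    n = len(xs)
    preds = [0.0] * n
    for fold in k_fold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            # Each observation is predicted by a model that never saw it.
            preds[i] = a + b * xs[i]
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Synthetic illustration data (assumed values, not from the study).
rng = random.Random(42)
aod = [rng.uniform(0.1, 1.5) for _ in range(200)]
pm25 = [20.0 + 60.0 * x + rng.gauss(0.0, 10.0) for x in aod]

r2, rmse = cross_validate(aod, pm25, k=10)
print(f"CV R2 = {r2:.2f}, RMSE = {rmse:.1f} ug/m3")
```

Pooling the out-of-fold predictions before computing R² and RMSE is one common convention; averaging per-fold scores is another, and the abstract does not say which the authors used.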
first_indexed 2025-11-14T20:26:31Z
format Article
id nottingham-53028
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T20:26:31Z
publishDate 2018
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling nottingham-53028 2020-05-04T19:49:08Z https://eprints.nottingham.ac.uk/53028/
Elsevier 2018-09-15 Article PeerReviewed
Chen, Gongbo, Li, Shanshan, Knibbs, Luke D., Hamm, Nicholas A.S., Cao, Wei, Li, Tiantian, Guo, Jianping, Ren, Hongyan, Abramson, Michael J. and Guo, Yuming (2018) A machine learning method to estimate PM 2.5 concentrations across China with remote sensing, meteorological and land use information. Science of The Total Environment, 636. pp. 52-60. ISSN 1879-1026
PM2.5; Aerosol optical depth; Random forests; Machine learning; China
https://www.sciencedirect.com/science/article/pii/S0048969718314281?via%3Dihub
doi:10.1016/j.scitotenv.2018.04.251
title A machine learning method to estimate PM 2.5 concentrations across China with remote sensing, meteorological and land use information
topic PM2.5
Aerosol optical depth
Random forests
Machine learning
China
url https://eprints.nottingham.ac.uk/53028/