A machine learning approach to geochemical mapping

Geochemical maps provide invaluable evidence to guide decisions on issues of mineral exploration, agriculture, and environmental health. However, the high cost of chemical analysis means that the ground sampling density will always be limited. Traditionally, geochemical maps have been produced throu...

Bibliographic Details
Main Authors: Kirkwood, Charlie; Cave, Mark; Beamish, David; Grebby, Stephen; Ferreira, Antonio
Format: Article
Published: Elsevier 2016
Subjects: Uncertainty; Modelling; Soil geochemistry; Quantile regression; Random forest; South west England
Online Access: https://eprints.nottingham.ac.uk/33879/
_version_ 1848794726514294784
author Kirkwood, Charlie
Cave, Mark
Beamish, David
Grebby, Stephen
Ferreira, Antonio
author_sort Kirkwood, Charlie
building Nottingham Research Data Repository
collection Online Access
description Geochemical maps provide invaluable evidence to guide decisions on issues of mineral exploration, agriculture, and environmental health. However, the high cost of chemical analysis means that the ground sampling density will always be limited. Traditionally, geochemical maps have been produced through the interpolation of measured element concentrations between sample sites using models based on the spatial autocorrelation of data (e.g. semivariogram models for ordinary kriging). In their simplest form such models fail to consider potentially useful auxiliary information about the region and the accuracy of the maps may suffer as a result. In contrast, this study uses quantile regression forests (an elaboration of random forest) to investigate the potential of high resolution auxiliary information alone to support the generation of accurate and interpretable geochemical maps. This paper presents a summary of the performance of quantile regression forests in predicting element concentrations, loss on ignition and pH in the soils of south west England using high resolution remote sensing and geophysical survey data. Through stratified 10-fold cross-validation we find the accuracy of quantile regression forests in predicting soil geochemistry in south west England to be a general improvement over that offered by ordinary kriging. Concentrations of immobile elements whose distributions are most tightly controlled by bedrock lithology are predicted with the greatest accuracy (e.g. Al with a cross-validated R² of 0.79), while concentrations of more mobile elements prove harder to predict. In addition to providing a high level of prediction accuracy, models built on high resolution auxiliary variables allow for informative, process-based interpretations to be made.
In conclusion, this study has highlighted the ability to map and understand the surface environment with greater accuracy and detail than previously possible by combining information from multiple datasets. As the quality and coverage of remote sensing and geophysical surveys continue to improve, machine learning methods will provide a means to interpret the otherwise-uninterpretable.
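
The description above reports accuracy as a cross-validated R² from stratified 10-fold cross-validation. The following is a minimal sketch of how such a figure might be computed, not the authors' implementation. It makes several assumptions: R² is taken to be the coefficient of determination (1 - SS_res / SS_tot) over pooled held-out predictions, stratification is done by ranking the continuous target and dealing samples round-robin into folds, the data are synthetic, and a one-variable least-squares line stands in for the quantile regression forest. All function and variable names are hypothetical.

```python
import random
import statistics


def stratified_folds(targets, k=10):
    """Assign each sample to one of k folds so every fold spans the target's range.

    Samples are ranked by target value and dealt round-robin into folds,
    which is one simple way to stratify a continuous variable (an assumption).
    """
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    folds = [0] * len(targets)
    for rank, index in enumerate(order):
        folds[index] = rank % k
    return folds


def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a placeholder model, NOT a quantile regression forest."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_r2(xs, ys, k=10):
    """Pool held-out predictions from all k folds, then return 1 - SS_res / SS_tot."""
    folds = stratified_folds(ys, k)
    predictions = [0.0] * len(ys)
    for fold in range(k):
        # Fit on every sample outside the current fold.
        train = [i for i in range(len(ys)) if folds[i] != fold]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Predict only the samples held out in this fold.
        for i in range(len(ys)):
            if folds[i] == fold:
                predictions[i] = intercept + slope * xs[i]
    mean_y = statistics.fmean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic data (hypothetical): one auxiliary covariate and a noisy response.
rng = random.Random(0)
covariate = [rng.uniform(0.0, 10.0) for _ in range(200)]
concentration = [2.0 * x + rng.gauss(0.0, 1.0) for x in covariate]
r2 = cross_validated_r2(covariate, concentration, k=10)
print(f"cross-validated R^2: {r2:.2f}")
```

Because every prediction comes from a model that never saw that sample, the pooled R² estimates out-of-sample accuracy rather than goodness of fit. Swapping `fit_line` for a forest-based model would leave the fold assignment and scoring unchanged.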
first_indexed 2025-11-14T19:20:46Z
format Article
id nottingham-33879
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T19:20:46Z
publishDate 2016
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling nottingham-33879 2020-05-04T20:01:43Z https://eprints.nottingham.ac.uk/33879/
Elsevier 2016-08 Article PeerReviewed
Kirkwood, Charlie, Cave, Mark, Beamish, David, Grebby, Stephen and Ferreira, Antonio (2016) A machine learning approach to geochemical mapping. Journal of Geochemical Exploration, 167. pp. 49-61. ISSN 1879-1689
http://www.sciencedirect.com/science/article/pii/S037567421630098X
doi:10.1016/j.gexplo.2016.05.003
title A machine learning approach to geochemical mapping
title_sort machine learning approach to geochemical mapping
topic Uncertainty
Modelling
Soil geochemistry
Quantile regression
Random forest
South west England
url https://eprints.nottingham.ac.uk/33879/