Multivariate Image Processing in Minerals Engineering with Vision Transformers
Vision transformers (ViTs) are a new class of deep learning algorithms that have recently emerged as a competitive alternative to convolutional neural networks. In this investigation, their application to two operations previously studied in the mineral processing industry is considered. These are i...
| Main Authors: | Liu, Xiu; Aldrich, Chris |
|---|---|
| Format: | Journal Article |
| Published: | Elsevier 2024 |
| Online Access: | http://purl.org/au-research/grants/arc/CE200100009 http://hdl.handle.net/20.500.11937/94374 |
| _version_ | 1848765863391395840 |
|---|---|
| author | Liu, Xiu; Aldrich, Chris |
| author_sort | Liu, Xiu |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Vision transformers (ViTs) are a new class of deep learning algorithms that have recently emerged as a competitive alternative to convolutional neural networks. In this investigation, their application to two operations previously studied in the mineral processing industry is considered. These are image recognition of fines in coal particles on conveyor belts and characterisation of the particle size in the underflow of a hydrocyclone. Promising results were achieved with vision transformers, as they performed as well as, or better than, convolutional neural networks in these image recognition problems. In addition, features extracted from the best ViT model could be used to visualise its performance, and these features could also serve as a basis for nonlinear process monitoring models. Furthermore, explainability techniques such as attention maps for ViTs were implemented to better understand the ViT models, similar to techniques such as occlusion sensitivity maps used with convolutional neural networks. |
| first_indexed | 2025-11-14T11:42:00Z |
| format | Journal Article |
| id | curtin-20.500.11937-94374 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T11:42:00Z |
| publishDate | 2024 |
| publisher | Elsevier |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-94374; 2024-04-04T06:01:21Z; Multivariate Image Processing in Minerals Engineering with Vision Transformers; Liu, Xiu; Aldrich, Chris; 2024; Journal Article; http://hdl.handle.net/20.500.11937/94374; 10.1016/j.mineng.2024.108599; http://purl.org/au-research/grants/arc/CE200100009; http://creativecommons.org/licenses/by/4.0/; Elsevier; fulltext |
| title | Multivariate Image Processing in Minerals Engineering with Vision Transformers |
| url | http://purl.org/au-research/grants/arc/CE200100009 http://hdl.handle.net/20.500.11937/94374 |