Effect of datasets size on the machine learning performance of the bagworm, Metisa plana (Walker) infestation using UAV remote sensing

Bibliographic Details
Main Authors: Mohd Johari, Siti Nurul Afiah; Khairunniza-Bejo, Siti; Mohamed Shariff, Abdul Rashid; Husin, Nur Azuan; Mohd Masri, Mohamed Mazmira; Kamarudin, Noorhazwani
Format: Article
Language: English
Published: Springer Science and Business Media LLC 2024
Online Access: http://psasir.upm.edu.my/id/eprint/117901/
http://psasir.upm.edu.my/id/eprint/117901/1/117901.pdf
building UPM Institutional Repository
collection Online Access
description A leaf-eating pest, Metisa plana (Lepidoptera: Psychidae), can cause 10–13% leaf defoliation and up to 40% crop losses, with a significant detrimental economic effect on Malaysian oil palm yield. A manual census was carried out to measure the current level of infestation, but it became time-consuming when covering a large area. Unmanned aerial vehicles (UAVs) were chosen as the solution because they allow rapid assessment of the severity of bagworm infestation. Nevertheless, UAV imagery carries a greater chance of unbalanced data, which may be a problem when determining the degree of infestation. Therefore, this study evaluated the impact of both balanced and imbalanced infestation-level data on machine learning classification performance using three combinations of vegetation indices: NDVI-NDRE, NDVI-GNDVI and NDRE-GNDVI. Resampling was carried out using random oversampling (ROS), the synthetic minority oversampling technique (SMOTE), random undersampling (RUS), 3-interval undersampling and 5-interval undersampling. The best performance, 86.84% successful classification with a 100% F1-score, was obtained using the imbalanced data from 3-interval undersampling. Fine KNN consistently performed well in classifying all infestation levels with the NDVI-NDRE combination across all datasets. The results show that a 66.67% reduction in sample size increases the chances of successful classification, even when the data are unbalanced.
id upm-117901
institution Universiti Putra Malaysia
institution_category Local University
recordtype eprints
repository_type Digital Repository
citation Mohd Johari, Siti Nurul Afiah and Khairunniza-Bejo, Siti and Mohamed Shariff, Abdul Rashid and Husin, Nur Azuan and Mohd Masri, Mohamed Mazmira and Kamarudin, Noorhazwani (2024) Effect of datasets size on the machine learning performance of the bagworm, Metisa plana (Walker) infestation using UAV remote sensing. Journal of Plant Diseases and Protection, 132 (1). art. no. 52. pp. 1-17. ISSN 1861-3829; eISSN: 1861-3837 (peer-reviewed article)
doi 10.1007/s41348-024-01020-x
https://link.springer.com/article/10.1007/s41348-024-01020-x
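
The description above names three vegetation indices and several resampling schemes without defining them. The following is a minimal Python sketch, not the authors' code, of how such quantities are commonly computed. The index formulas are the standard normalized-difference definitions. All function names are hypothetical, and `interval_undersample` is only an assumed reading of "3-interval undersampling" (keeping every third sample, which would match the reported 66.67% reduction); the record does not confirm it. SMOTE is omitted because it requires interpolation between neighbouring samples.

```python
import random


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def vegetation_indices(nir, red, green, red_edge):
    """Standard normalized-difference indices from band reflectances."""
    return {
        "NDVI": normalized_difference(nir, red),
        "NDRE": normalized_difference(nir, red_edge),
        "GNDVI": normalized_difference(nir, green),
    }


def _group_by_label(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_oversample(samples, labels, seed=0):
    """ROS: duplicate random minority samples until all classes match the largest."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(items) for items in groups.values())
    balanced = []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((sample, label) for sample in items + extra)
    return balanced


def random_undersample(samples, labels, seed=0):
    """RUS: randomly drop samples until all classes match the smallest."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(items) for items in groups.values())
    balanced = []
    for label, items in groups.items():
        balanced.extend((sample, label) for sample in rng.sample(items, target))
    return balanced


def interval_undersample(samples, labels, k):
    """ASSUMPTION: keep every k-th sample; class balance is not enforced."""
    return list(zip(samples, labels))[::k]
```

Under this reading, ROS and RUS produce balanced classes, while interval undersampling only shrinks the dataset and leaves the class proportions roughly as they were, which is consistent with the description calling the 3-interval data imbalanced.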