Effect of datasets size on the machine learning performance of the bagworm, Metisa plana (Walker) infestation using UAV remote sensing
A leaf-eating pest, Metisa plana (Lepidoptera: Psychidae), could cause 10–13% leaf defoliation and up to 40% crop losses, which would have a significant detrimental economic impact on Malaysian oil palm yield production. A manual census was carried out to measure the current level of infestati...
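| Main Authors: | Mohd Johari, Siti Nurul Afiah; Khairunniza-Bejo, Siti; Mohamed Shariff, Abdul Rashid; Husin, Nur Azuan; Mohd Masri, Mohamed Mazmira; Kamarudin, Noorhazwani |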
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer Science and Business Media LLC 2024 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/117901/ http://psasir.upm.edu.my/id/eprint/117901/1/117901.pdf |
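
The abstract names three vegetation indices (NDVI, NDRE, GNDVI) and pairs them as classifier inputs. As a minimal sketch, assuming the standard normalized-difference definitions of these indices and per-band reflectance values as inputs, the three pairs could be computed as follows. The function names and the sample reflectances are hypothetical and are not taken from the article.

```python
def normalized_difference(nir: float, band: float) -> float:
    """Return (nir - band) / (nir + band); 0.0 if both reflectances are zero."""
    total = nir + band
    return 0.0 if total == 0 else (nir - band) / total


def vegetation_indices(nir: float, red: float, red_edge: float, green: float) -> dict:
    """Standard definitions: each index contrasts NIR with one other band."""
    return {
        "NDVI": normalized_difference(nir, red),        # NIR vs. red
        "NDRE": normalized_difference(nir, red_edge),   # NIR vs. red edge
        "GNDVI": normalized_difference(nir, green),     # NIR vs. green
    }


def index_pairs(indices: dict) -> dict:
    """Build the three two-feature combinations named in the abstract."""
    return {
        "NDVI-NDRE": (indices["NDVI"], indices["NDRE"]),
        "NDVI-GNDVI": (indices["NDVI"], indices["GNDVI"]),
        "NDRE-GNDVI": (indices["NDRE"], indices["GNDVI"]),
    }


# Hypothetical reflectance values for a single pixel (not data from the study).
sample = vegetation_indices(nir=0.75, red=0.25, red_edge=0.5, green=0.375)
pairs = index_pairs(sample)
print(pairs["NDVI-NDRE"])
```

Each pair gives a two-dimensional feature vector per sample, which is the kind of input a nearest-neighbour classifier such as the Fine KNN mentioned in the abstract would consume.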
| _version_ | 1848867374638301184 |
|---|---|
| author | Mohd Johari, Siti Nurul Afiah; Khairunniza-Bejo, Siti; Mohamed Shariff, Abdul Rashid; Husin, Nur Azuan; Mohd Masri, Mohamed Mazmira; Kamarudin, Noorhazwani |
| author_sort | Mohd Johari, Siti Nurul Afiah |
| building | UPM Institutional Repository |
| collection | Online Access |
| description | A leaf-eating pest, Metisa plana (Lepidoptera: Psychidae), could cause 10–13% leaf defoliation and up to 40% crop losses, which would have a significant detrimental economic impact on Malaysian oil palm yield production. A manual census was carried out to measure the current level of infestation; however, it became time-consuming when covering a large area. Unmanned aerial vehicles (UAVs) were chosen as the solution because they allow rapid assessment of the severity of the bagworm infestation. Nevertheless, there is a greater chance of unbalanced data when employing UAV imagery, which may be a problem when determining the degree of infestation. Therefore, this study evaluated the impact of both balanced and imbalanced infestation level data on machine learning classification performance via three combinations of vegetation indices: NDVI-NDRE, NDVI-GNDVI and NDRE-GNDVI. Resampling was carried out using random oversampling (ROS), the synthetic minority oversampling technique (SMOTE), random undersampling (RUS), 3-interval undersampling and 5-interval undersampling. Results showed that the best performance, 86.84% successful classification with a 100% F1-score, was obtained using the imbalanced data from 3-interval undersampling. Fine KNN consistently performed well in classifying all infestation levels with the NDVI-NDRE combination across all datasets. The results unequivocally show that a 66.67% reduction in the sample size increases the chances of successful classification, even in situations where the data are unbalanced. |
| first_indexed | 2025-11-15T14:35:29Z |
| format | Article |
| id | upm-117901 |
| institution | Universiti Putra Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T14:35:29Z |
| publishDate | 2024 |
| publisher | Springer Science and Business Media LLC |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | upm-117901 2025-06-16T07:50:07Z http://psasir.upm.edu.my/id/eprint/117901/ Springer Science and Business Media LLC 2024 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/117901/1/117901.pdf Mohd Johari, Siti Nurul Afiah and Khairunniza-Bejo, Siti and Mohamed Shariff, Abdul Rashid and Husin, Nur Azuan and Mohd Masri, Mohamed Mazmira and Kamarudin, Noorhazwani (2024) Effect of datasets size on the machine learning performance of the bagworm, Metisa plana (Walker) infestation using UAV remote sensing. Journal of Plant Diseases and Protection, 132 (1). art. no. 52. pp. 1-17. ISSN 1861-3829; eISSN: 1861-3837 https://link.springer.com/article/10.1007/s41348-024-01020-x 10.1007/s41348-024-01020-x |
| title | Effect of datasets size on the machine learning performance of the bagworm, Metisa plana (Walker) infestation using UAV remote sensing |
| title_sort | effect of datasets size on the machine learning performance of the bagworm, metisa plana (walker) infestation using uav remote sensing |
| url | http://psasir.upm.edu.my/id/eprint/117901/ http://psasir.upm.edu.my/id/eprint/117901/1/117901.pdf |
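
The description lists several resampling schemes and reports a 66.67% reduction in sample size for 3-interval undersampling. The sketch below illustrates one plausible reading, assuming that k-interval undersampling means keeping every k-th sample (which removes two thirds of the data for k = 3 and leaves the class proportions imbalanced). That reading, the class labels, and the class counts are assumptions for illustration only; the record does not define the procedure. SMOTE is omitted because it needs feature-space interpolation.

```python
import random
from collections import Counter, defaultdict


def interval_undersample(data, interval):
    """Assumed reading of 'k-interval undersampling': keep every k-th sample."""
    return data[::interval]


def group_by_class(data):
    """Group (feature, label) pairs by label."""
    groups = defaultdict(list)
    for feature, label in data:
        groups[label].append((feature, label))
    return groups


def random_undersample(data, seed=0):
    """RUS: randomly shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    groups = group_by_class(data)
    target = min(len(items) for items in groups.values())
    return [pair for items in groups.values() for pair in rng.sample(items, target)]


def random_oversample(data, seed=0):
    """ROS: randomly duplicate samples until every class matches the largest."""
    rng = random.Random(seed)
    groups = group_by_class(data)
    target = max(len(items) for items in groups.values())
    balanced = []
    for items in groups.values():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend(items + extra)
    return balanced


def reduction_percent(before, after):
    """Percentage of samples removed by a resampling step."""
    return 100.0 * (1 - len(after) / len(before))


# Hypothetical imbalanced dataset: 45 samples across three infestation levels.
labels = ["low"] * 27 + ["moderate"] * 12 + ["high"] * 6
data = [(i, label) for i, label in enumerate(labels)]

three = interval_undersample(data, 3)
five = interval_undersample(data, 5)
print(round(reduction_percent(data, three), 2), Counter(y for _, y in three))
print(round(reduction_percent(data, five), 2), Counter(y for _, y in five))
print(Counter(y for _, y in random_undersample(data)))
print(Counter(y for _, y in random_oversample(data)))
```

Under this reading, interval undersampling shrinks the dataset while keeping it imbalanced, whereas RUS and ROS both produce balanced classes, which matches the abstract's distinction between balanced and imbalanced datasets.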