Labelling strategies for hierarchical multi-label classification techniques
Many hierarchical multi-label classification systems predict a real-valued score for every (instance, class) pair, with a higher score reflecting more confidence that the instance belongs to that class. These classifiers leave the conversion of these scores to an actual label set to the user, who appl...
| Main Authors: | Triguero, Isaac; Vens, Celine |
|---|---|
| Format: | Article |
| Published: | Elsevier 2016 |
| Subjects: | |
| Online Access: | https://eprints.nottingham.ac.uk/33847/ |
| _version_ | 1848794717761830912 |
|---|---|
| author | Triguero, Isaac; Vens, Celine |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | Many hierarchical multi-label classification systems predict a real-valued score for every (instance, class) pair, with a higher score reflecting more confidence that the instance belongs to that class. These classifiers leave the conversion of these scores to an actual label set to the user, who applies a cut-off value to the scores. The predictive performance of these classifiers is usually evaluated using threshold-independent measures such as precision-recall curves. However, several applications require actual label sets, and thus an automatic labelling strategy. In this article, we present and evaluate different alternatives for performing the actual labelling in hierarchical multi-label classification. We investigate the selection of both single and multiple thresholds. Although several threshold selection strategies exist for non-hierarchical multi-label classification, they cannot be applied directly in the hierarchical context. The proposed strategies are implemented within two main approaches: optimising a performance measure of interest (such as F-measure or hierarchical loss), and reproducing training set properties (such as class distribution or label cardinality) in the predictions. We assess the performance of the proposed labelling schemes on 10 datasets from different application domains. Our results show that selecting multiple thresholds may result in an efficient and effective solution for hierarchical multi-label problems. |
| first_indexed | 2025-11-14T19:20:38Z |
| format | Article |
| id | nottingham-33847 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| last_indexed | 2025-11-14T19:20:38Z |
| publishDate | 2016 |
| publisher | Elsevier |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-33847 2020-05-04T17:42:54Z https://eprints.nottingham.ac.uk/33847/ Elsevier 2016-03-04 Article PeerReviewed Triguero, Isaac and Vens, Celine (2016) Labelling strategies for hierarchical multi-label classification techniques. Pattern Recognition, 56. pp. 170-183. ISSN 0031-3203 http://www.sciencedirect.com/science/article/pii/S0031320316000881 doi:10.1016/j.patcog.2016.02.017 |
| title | Labelling strategies for hierarchical multi-label classification techniques |
| topic | Hierarchical multi-label classification; Threshold optimisation; Hierarchical loss; HMC-loss; F-measure |
| url | https://eprints.nottingham.ac.uk/33847/ |
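
The description names two families of labelling strategies: optimising a performance measure and reproducing a training set property. The following is a minimal sketch of a single-threshold version of each, not the authors' implementation. It assumes that scores are stored as one dictionary per instance, that the hierarchy is a tree given as a hypothetical child-to-parent map, that a class is only kept when all of its ancestors also pass the threshold, and that micro-averaged F-measure is the measure of interest. All function names and the toy data are illustrative.

```python
from typing import Dict, List, Sequence, Set

Scores = Dict[str, float]

def label_set(scores: Scores, threshold: float, parent: Dict[str, str]) -> Set[str]:
    """Classes scoring >= threshold whose ancestors all pass the threshold too."""
    passed = {c for c, s in scores.items() if s >= threshold}

    def ancestors_pass(c: str) -> bool:
        p = parent.get(c)  # roots have no entry in the parent map
        while p is not None:
            if p not in passed:
                return False
            p = parent.get(p)
        return True

    return {c for c in passed if ancestors_pass(c)}

def micro_f1(predicted: Sequence[Set[str]], true: Sequence[Set[str]]) -> float:
    """Micro-averaged F-measure over all (instance, class) decisions."""
    tp = sum(len(p & t) for p, t in zip(predicted, true))
    fp = sum(len(p - t) for p, t in zip(predicted, true))
    fn = sum(len(t - p) for p, t in zip(predicted, true))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_f_threshold(rows: List[Scores], true: List[Set[str]],
                     parent: Dict[str, str], candidates: Sequence[float]) -> float:
    """Strategy 1 (assumed form): pick the candidate that maximises micro F-measure."""
    return max(candidates,
               key=lambda t: micro_f1([label_set(r, t, parent) for r in rows], true))

def cardinality_threshold(rows: List[Scores], parent: Dict[str, str],
                          target: float, candidates: Sequence[float]) -> float:
    """Strategy 2 (assumed form): match the average number of labels per instance."""
    def avg_labels(t: float) -> float:
        return sum(len(label_set(r, t, parent)) for r in rows) / len(rows)
    return min(candidates, key=lambda t: abs(avg_labels(t) - target))

# Toy data (hypothetical): two root classes, each with one child.
parent = {"a/1": "a", "b/1": "b"}
rows = [
    {"a": 0.9, "a/1": 0.7, "b": 0.2, "b/1": 0.1},
    {"a": 0.4, "a/1": 0.3, "b": 0.8, "b/1": 0.6},
]
true = [{"a", "a/1"}, {"b"}]
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

f_threshold = best_f_threshold(rows, true, parent, candidates)
# Label cardinality of the toy "training" labels: (2 + 1) / 2 = 1.5
card_threshold = cardinality_threshold(rows, parent, 1.5, candidates)
```

A multiple-threshold variant could apply the same search separately per class or per hierarchy level. The sketch keeps a single global threshold for brevity, and a real application would choose the threshold on held-out data, not on the data used for evaluation.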