Empirical comparison of tree ensemble variable importance measures
Tree ensembles are becoming well-established as popular and powerful data modelling techniques. Tree ensemble models are essentially black box models, although their individual members may not be, and with their growing popularity, interest in the interpretation of tree ensemble models has also grow...
| Main Authors: | Auret, L., Aldrich, Chris |
|---|---|
| Format: | Journal Article |
| Published: | ELSEVIER 2011 |
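| Subjects: | Decision trees - Variable importance - Ensemble learning - Random forests - Fault identification - Boosted trees - Conditional inference forests |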
| Online Access: | http://hdl.handle.net/20.500.11937/47469 |
| _version_ | 1848757841966399488 |
|---|---|
| author | Auret, L.; Aldrich, Chris |
| author_sort | Auret, L. |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Tree ensembles are becoming well-established as popular and powerful data modelling techniques. Tree ensemble models are essentially black box models, although their individual members may not be, and with their growing popularity, interest in the interpretation of tree ensemble models has also grown. This study presents variable importance measures associated with random forests, conditional inference forests and boosted trees, and employs a number of simulated data sets to compare these methods. Overall, variable importance indicators based on bagged conditional inference forests appear to strike a good balance between identification of significant variables and avoiding unnecessary flagging of correlated variables. Data preprocessing and interpretation by experts knowledgeable with a specific data set remain vital. |
| first_indexed | 2025-11-14T09:34:31Z |
| format | Journal Article |
| id | curtin-20.500.11937-47469 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T09:34:31Z |
| publishDate | 2011 |
| publisher | ELSEVIER |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-47469 2017-09-13T16:07:45Z 10.1016/j.chemolab.2010.12.004 ELSEVIER restricted |
| title | Empirical comparison of tree ensemble variable importance measures |
| topic | Decision trees - Variable importance - Ensemble learning - Random forests - Fault identification - Boosted trees - Conditional inference forests |