Sensitivity of missing values in classification tree for large sample
| Main Authors: | , , , |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | American Institute of Physics 2011 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/57334/ http://psasir.upm.edu.my/id/eprint/57334/1/Sensitivity%20of%20missing%20values%20in%20classification%20tree%20for%20large%20sample.pdf |
| Summary: | Missing values, whether in predictor or in response variables, are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of a classification tree model to missing data in a large sample. Data were obtained from one of the higher educational institutions in Malaysia. Students' background data were randomly eliminated, and a classification tree was used to predict students' degree classification. The results showed that, for a large sample, the structure of the classification tree was sensitive to missing values, especially when the sample contained more than ten percent missing values. |