Improving the tool for analyzing Malaysia’s demographic change: data standardization analysis to form geo-demographics classification profiles using k-means algorithms
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | School of Social, Development and Environmental Studies, Faculty of Social Science and Humanities, Universiti Kebangsaan Malaysia, 2016 |
| Online Access: | http://journalarticle.ukm.my/10309/ http://journalarticle.ukm.my/10309/1/4x.geografia-siupsi-mei16-Kamarul-edam.pdf |
| Summary: | Clustering is an important method in exploratory data analysis and is widely applied in data mining. Clustering is necessary to produce a geo-demographic classification, and here the k-means algorithm is used as the clustering algorithm. K-means is commonly used for clustering because it is regarded as one of the more significant methods. Before cluster analysis is run, however, preliminary analyses are needed to ensure that the variables used are appropriate and do not carry redundant information. One such analysis is data standardization. This study examined which standardization method was more effective for analysing Malaysia’s population and housing census data for the state of Perak. The rationale was that standardized data would simplify the execution of the k-means algorithm. The standardization methods chosen to test data accuracy were the z-score and range standardization methods. The analysis found that range standardization was more suitable for the data examined. |
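
The two standardization methods named in the summary can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the choice of the population standard deviation for the z-score, and the sample values are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def z_score(values):
    """Centre values on their mean and scale by the standard deviation.

    Assumption: the population standard deviation is used; the article
    may have used the sample standard deviation instead.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def range_standardize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has zero range; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical census-style variable (illustrative numbers, not study data).
households = [120, 340, 560, 980, 4500]

z = z_score(households)            # mean 0, unit standard deviation
r = range_standardize(households)  # bounded between 0 and 1
```

Both transformations put variables on a comparable scale before distances are computed in k-means. Range standardization bounds every variable to the interval from 0 to 1, whereas z-scores are unbounded, so an extreme value such as the last entry above stays far from the rest.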