Evolutionary-based feature construction with substitution for data summarization using DARA
The representation of the input data set is important for a learning task. In data summarization, the representation of the multiple instances stored in non-target tables that have a many-to-one relationship with a record stored in the target table influences the descriptive accuracy of the summarized data. If the s...
| Main Authors: | Sia, Florence; Alfred, Rayner |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | IEEE 2012 |
| Subjects: | QA76 Computer software |
| Online Access: | http://umpir.ump.edu.my/id/eprint/26997/ http://umpir.ump.edu.my/id/eprint/26997/1/Evolutionary-based%20feature%20construction%20with%20substitution%20for%20data.pdf |
| _version_ | 1848822678157262848 |
|---|---|
| author | Sia, Florence Alfred, Rayner |
| author_sort | Sia, Florence |
| building | UMP Institutional Repository |
| collection | Online Access |
| description | The representation of the input data set is important for a learning task. In data summarization, the representation of the multiple instances stored in non-target tables that have a many-to-one relationship with a record stored in the target table influences the descriptive accuracy of the summarized data. If the summarized data is fed into a classifier as one of the input features, the predictive accuracy of the classifier is also affected. This paper proposes an evolutionary-based feature construction approach, Fixed-Length Feature Construction with Substitution (FLFCWS), that addresses this problem by optimizing feature construction for relational data summarization. The approach allows an initial feature to be used more than once when constructing new features, so that all possible interactions among attributes can be exploited; a genetic algorithm is applied to find a relevant set of features. The constructed features are used to generate relevant patterns that characterize the non-target records associated with a target record, and these patterns serve as the input representation for the data summarization process. Several feature scoring measures are used as fitness functions to find the best set of constructed features. The experimental results show an improvement in predictive accuracy when classifying data summarized with the FLFCWS approach, which indirectly improves the descriptive accuracy of the summarized data. This shows that the FLFCWS approach can generate a promising set of constructed features to describe the characteristics of non-target records for data summarization. |
| first_indexed | 2025-11-15T02:45:03Z |
| format | Conference or Workshop Item |
| id | ump-26997 |
| institution | Universiti Malaysia Pahang |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T02:45:03Z |
| publishDate | 2012 |
| publisher | IEEE |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | ump-26997 2020-03-22T23:30:15Z http://umpir.ump.edu.my/id/eprint/26997/ QA76 Computer software IEEE 2012 Conference or Workshop Item PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/26997/1/Evolutionary-based%20feature%20construction%20with%20substitution%20for%20data.pdf Sia, Florence and Alfred, Rayner (2012) Evolutionary-based feature construction with substitution for data summarization using DARA. In: IEEE 4th Conference on Data Mining and Optimization (DMO 2012), 2-4 September 2012, Langkawi, Kedah. pp. 53-58. ISBN 978-1-4673-2718-3 (Published) https://doi.org/10.1109/DMO.2012.6329798 |
| title | Evolutionary-based feature construction with substitution for data summarization using DARA |
| topic | QA76 Computer software |
| url | http://umpir.ump.edu.my/id/eprint/26997/ http://umpir.ump.edu.my/id/eprint/26997/1/Evolutionary-based%20feature%20construction%20with%20substitution%20for%20data.pdf |