Evolutionary-based feature construction with substitution for data summarization using DARA

The representation of the input data set is important for a learning task. In data summarization, the representation of the multiple instances stored in non-target tables that have a many-to-one relationship with a record stored in the target table influences the descriptive accuracy of the summarized data. If the s...


Bibliographic Details
Main Authors: Sia, Florence, Alfred, Rayner
Format: Conference or Workshop Item
Language: English
Published: IEEE 2012
Subjects: QA76 Computer software
Online Access: http://umpir.ump.edu.my/id/eprint/26997/
http://umpir.ump.edu.my/id/eprint/26997/1/Evolutionary-based%20feature%20construction%20with%20substitution%20for%20data.pdf
_version_ 1848822678157262848
author Sia, Florence
Alfred, Rayner
building UMP Institutional Repository
collection Online Access
description The representation of the input data set is important for a learning task. In data summarization, the representation of the multiple instances stored in non-target tables that have a many-to-one relationship with a record stored in the target table influences the descriptive accuracy of the summarized data. If the summarized data is fed into a classifier as one of the input features, the predictive accuracy of the classifier is also affected. This paper proposes an evolutionary-based feature construction approach, Fixed-Length Feature Construction with Substitution (FLFCWS), that addresses this problem by optimizing feature construction for relational data summarization. The approach allows an initial feature to be used more than once when constructing new features, so that all possible interactions among attributes can be exploited; a genetic algorithm searches for a relevant set of constructed features. The constructed features are used to generate relevant patterns that characterize the non-target records associated with a target record, and these patterns serve as the input representation for the data summarization process. Several feature scoring measures are used as fitness functions to find the best set of constructed features. The experimental results show an improvement in predictive accuracy when classifying data summarized with the FLFCWS approach, which indirectly indicates improved descriptive accuracy of the summarized data. The results suggest that FLFCWS can generate a promising set of constructed features to describe the characteristics of non-target records for data summarization.
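The idea in the description (a fixed-length set of constructed features, with initial attributes reusable across features, searched by a genetic algorithm) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record gives no chromosome encoding, genetic operators, or scoring measures, so everything below is an assumption made for illustration. In particular, a gene is assumed to be a combination of attribute indices, information gain stands in for the unspecified feature scoring measures, and the names `random_gene`, `fitness`, `evolve`, `ROWS`, and `LABELS` are hypothetical. The DARA summarization step itself is not shown.

```python
import random
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(values, labels):
    """Information gain of one discrete feature with respect to the labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def random_gene(n_attrs, size, rng):
    """One constructed feature: a sorted combination of attribute indices."""
    return tuple(sorted(rng.sample(range(n_attrs), size)))


def fitness(chromosome, rows, labels):
    """Assumed fitness: summed information gain of the distinct genes."""
    total = 0.0
    for gene in set(chromosome):
        values = [tuple(row[i] for i in gene) for row in rows]
        total += info_gain(values, labels)
    return total


def evolve(rows, labels, n_features=2, size=2, pop_size=10,
           generations=20, mutation_rate=0.2, seed=0):
    """Search for a fixed-length set of constructed features."""
    rng = random.Random(seed)
    n_attrs = len(rows[0])

    def score(chromosome):
        return fitness(chromosome, rows, labels)

    # "With substitution" (as assumed here): genes are drawn independently,
    # so one initial attribute may appear in several constructed features.
    pop = [[random_gene(n_attrs, size, rng) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        nxt = [max(pop, key=score)]  # elitism: carry the best one forward
        while len(nxt) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            # One-point crossover keeps the chromosome length fixed.
            cut = rng.randint(1, n_features - 1) if n_features > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Mutation swaps a gene for a freshly constructed feature.
            child = [random_gene(n_attrs, size, rng)
                     if rng.random() < mutation_rate else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)


# Toy data: the label depends only on the interaction of attributes 0 and 1.
ROWS = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
LABELS = [a ^ b for a, b, _ in ROWS]

if __name__ == "__main__":
    best = evolve(ROWS, LABELS)
    print(best, fitness(best, ROWS, LABELS))
```

In the toy data no single attribute carries information about the label, while the pair of attributes 0 and 1 determines it completely, which is the kind of attribute interaction that feature construction is meant to expose.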
first_indexed 2025-11-15T02:45:03Z
format Conference or Workshop Item
id ump-26997
institution Universiti Malaysia Pahang
institution_category Local University
language English
last_indexed 2025-11-15T02:45:03Z
publishDate 2012
publisher IEEE
recordtype eprints
repository_type Digital Repository
spelling ump-26997 2020-03-22T23:30:15Z IEEE 2012 Conference or Workshop Item PeerReviewed pdf en
Sia, Florence and Alfred, Rayner (2012) Evolutionary-based feature construction with substitution for data summarization using DARA. In: IEEE 4th Conference on Data Mining and Optimization (DMO 2012), 2-4 September 2012, Langkawi, Kedah. pp. 53-58. ISBN 978-1-4673-2718-3 (Published) https://doi.org/10.1109/DMO.2012.6329798
title Evolutionary-based feature construction with substitution for data summarization using DARA
topic QA76 Computer software
url http://umpir.ump.edu.my/id/eprint/26997/
http://umpir.ump.edu.my/id/eprint/26997/1/Evolutionary-based%20feature%20construction%20with%20substitution%20for%20data.pdf