Extreme learning machine classification of file clusters for evaluating content-based feature vectors

In digital forensic investigation, and in the retrieval of missing data files in general, recovering files that have missing system information is a challenge. The recovery process entails applying a number of methods to determine the type, the contents and the structure of each data file cluster...


Bibliographic Details
Main Authors: Ali, Rabei Raad, Mohamad, Kamaruddin Malik, Jamel, Sapiee, Ahmad Khalid, Shamsul Kamal
Format: Article
Published: Science Publishing Corporation 2018
Subjects:
Online Access: http://eprints.uthm.edu.my/5378/
_version_ 1848888537767739392
author Ali, Rabei Raad
Mohamad, Kamaruddin Malik
Jamel, Sapiee
Ahmad Khalid, Shamsul Kamal
building UTHM Institutional Repository
collection Online Access
description In digital forensic investigation, and in the retrieval of missing data files in general, recovering files that have missing system information is a challenge. The recovery process applies a number of methods to determine the type, contents and structure of each data file cluster, for formats such as JPEG, DOC, ZIP or TXT. This paper studies the effect of three content-based feature extraction methods on the classification of JPEG file clusters: Byte Frequency Distribution, Entropy, and Rate of Change. An Extreme Learning Machine (ELM) neural network algorithm is used to evaluate the three methods by classifying the feature vectors as JPEG or non-JPEG for files in different file formats. The files are allocated in a continuous series of clusters. The ELM algorithm is applied to the DFRWS (2006) dataset, and the results show that combining the three methods yields 93.46% classification accuracy.
first_indexed 2025-11-15T20:11:52Z
format Article
id uthm-5378
institution Universiti Tun Hussein Onn Malaysia
institution_category Local University
last_indexed 2025-11-15T20:11:52Z
publishDate 2018
publisher Science Publishing Corporation
recordtype eprints
repository_type Digital Repository
spelling uthm-5378 2022-01-09T05:27:21Z http://eprints.uthm.edu.my/5378/ Science Publishing Corporation 2018 Article PeerReviewed Ali, Rabei Raad and Mohamad, Kamaruddin Malik and Jamel, Sapiee and Ahmad Khalid, Shamsul Kamal (2018) Extreme learning machine classification of file clusters for evaluating content-based feature vectors. International Journal of Engineering & Technology, 7 (4.36). pp. 167-171. ISSN 2227-524X
title Extreme learning machine classification of file clusters for evaluating content-based feature vectors
topic T Technology (General)
T58.6-58.62 Management information systems
T11.95-12.5 Industrial directories
url http://eprints.uthm.edu.my/5378/
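
The description names three content-based features computed per file cluster: Byte Frequency Distribution, Entropy, and Rate of Change. The sketch below shows one common way such features could be computed and concatenated into a feature vector. It is an illustration only and not the authors' implementation: the record does not give the paper's formulas, so the 512-byte cluster size, the definition of rate of change as the mean absolute difference between consecutive bytes, the scaling of the two scalar features, and all function names are assumptions. The ELM classifier itself is not shown.

```python
import math
from collections import Counter

CLUSTER_SIZE = 512  # assumed cluster size in bytes; not stated in the record


def byte_frequency_distribution(cluster: bytes) -> list[float]:
    """Relative frequency of each of the 256 possible byte values."""
    counts = Counter(cluster)
    n = len(cluster)
    return [counts.get(value, 0) / n for value in range(256)]


def shannon_entropy(cluster: bytes) -> float:
    """Shannon entropy of the byte values, in bits per byte (0.0 to 8.0)."""
    n = len(cluster)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(cluster).values())


def rate_of_change(cluster: bytes) -> float:
    """Mean absolute difference between consecutive bytes (assumed definition)."""
    if len(cluster) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(cluster, cluster[1:])]
    return sum(diffs) / len(diffs)


def feature_vector(cluster: bytes) -> list[float]:
    """Concatenate the three features into one 258-element vector.

    Entropy and rate of change are scaled to [0, 1] so that they share a
    range with the byte frequencies; this scaling is an assumption.
    """
    if not cluster:
        raise ValueError("cluster must contain at least one byte")
    return byte_frequency_distribution(cluster) + [
        shannon_entropy(cluster) / 8.0,
        rate_of_change(cluster) / 255.0,
    ]


if __name__ == "__main__":
    # A synthetic cluster in which every byte value appears exactly twice.
    sample = bytes(range(256)) * (CLUSTER_SIZE // 256)
    vector = feature_vector(sample)
    print(len(vector), vector[-2], vector[-1])
```

A vector built this way could be given a JPEG or non-JPEG label and passed to any binary classifier; the paper uses an ELM for that step.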