Extreme learning machine classification of file clusters for evaluating content-based feature vectors

Bibliographic Details
Main Authors: Ali, Rabei Raad; Mohamad, Kamaruddin Malik; Jamel, Sapiee; Ahmad Khalid, Shamsul Kamal
Format: Article
Published: Science Publishing Corporation 2018
Online Access: http://eprints.uthm.edu.my/5378/
Description
Summary: In digital forensic investigation, and in the retrieval of missing data files in general, recovering files whose system information is missing is a challenge. The recovery process entails applying a number of methods to determine the type, contents and structure of each data file cluster, such as JPEG, DOC, ZIP or TXT. This paper studies the effect of three content-based feature extraction methods on the classification of JPEG file clusters. The methods are Byte Frequency Distribution, Entropy, and Rate of Change. An Extreme Learning Machine (ELM) neural network algorithm is then used to evaluate the three methods by classifying the feature vectors as JPEG or non-JPEG for files in different file formats. The files are allocated in a continuous series of clusters. The ELM algorithm is applied to the DFRWS (2006) dataset, and the results show that the combination of the three methods produces 93.46% classification accuracy.
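
The three content-based features named in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 512-byte cluster size, the definition of Rate of Change as the mean absolute difference between consecutive byte values, and the way the three features are concatenated into one vector are all assumptions made for illustration.

```python
import math
from collections import Counter

CLUSTER_SIZE = 512  # assumed cluster size in bytes; the record does not state one


def byte_frequency_distribution(cluster: bytes) -> list[float]:
    """Relative frequency of each of the 256 possible byte values."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    counts = Counter(cluster)
    return [counts.get(value, 0) / len(cluster) for value in range(256)]


def shannon_entropy(cluster: bytes) -> float:
    """Shannon entropy of the byte values, in bits per byte (0.0 to 8.0)."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    total = len(cluster)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(cluster).values())


def rate_of_change(cluster: bytes) -> float:
    """Mean absolute difference between consecutive bytes (assumed definition)."""
    if len(cluster) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(cluster, cluster[1:])]
    return sum(diffs) / len(diffs)


def feature_vector(cluster: bytes) -> list[float]:
    """Concatenate the three features: 256 frequencies, entropy, rate of change."""
    return (
        byte_frequency_distribution(cluster)
        + [shannon_entropy(cluster), rate_of_change(cluster)]
    )


def split_into_clusters(data: bytes, size: int = CLUSTER_SIZE) -> list[bytes]:
    """Split raw data into fixed-size clusters; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]
```

Under these assumptions, each cluster yields a 258-element vector that could be passed, together with a JPEG or non-JPEG label, to a classifier such as an ELM. Compressed data such as JPEG typically produces entropy close to 8 bits per byte, while plain text produces markedly lower values.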