A Modified Kennard-Stone Algorithm for Optimal Division of Data for Developing Artificial Neural Network Models

Bibliographic Details
Main Authors: Saptoro, Agus; Tade, Moses; Vuthaluru, Hari
Format: Journal Article
Published: The Berkeley Electronic Press 2012
Online Access: http://hdl.handle.net/20.500.11937/45101
Description
Summary: This paper proposes a method, MDKS (Kennard-Stone algorithm based on Mahalanobis distance), for dividing data into training and testing subsets when developing artificial neural network (ANN) models. The method is a modified version of the Kennard-Stone (KS) algorithm. It achieves better data splitting, in terms of both data representation and the performance of the resulting ANN models. It also outperforms the standard KS algorithm and another improved KS algorithm, the data division based on joint x-y distances (SPXY) method. The proposed technique can therefore serve as an advantageous alternative to existing data-splitting methods for developing ANN models. Care should be taken with large datasets, since they may increase the computational load of MDKS due to its variance-covariance matrix calculations.
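
The idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual Kennard-Stone max-min selection rule (start from the two most distant samples, then repeatedly add the sample whose nearest selected neighbour is farthest away) and simply swaps the Euclidean distance for the Mahalanobis distance computed from the inverse variance-covariance matrix. The function name `mahalanobis_kennard_stone` and the parameter `n_train` are hypothetical, and the pseudo-inverse is an added safeguard for singular covariance matrices.

```python
import numpy as np


def mahalanobis_kennard_stone(X, n_train):
    """Split row indices of X into (train, test) with a Kennard-Stone style rule.

    Hypothetical helper for illustration only; distances are Mahalanobis
    distances based on the sample variance-covariance matrix of X.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of samples")

    # Inverse variance-covariance matrix (assumption: pseudo-inverse, so a
    # singular covariance matrix does not raise an error).
    vi = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))

    # All pairwise squared Mahalanobis distances. This needs O(n^2) memory,
    # which is where large datasets become expensive.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, vi, diff)

    # Step 1: start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]

    # Step 2: repeatedly add the remaining sample whose distance to its
    # nearest already-selected sample is largest.
    while len(selected) < n_train:
        nearest = d2[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)

    return np.array(selected, dtype=int), np.array(remaining, dtype=int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(20, 3))  # synthetic data, for demonstration only
    train_idx, test_idx = mahalanobis_kennard_stone(data, n_train=14)
    print("train:", train_idx)
    print("test:", test_idx)
```

Because the sketch builds the full pairwise distance matrix, both memory and run time grow quadratically with the number of samples, which is consistent with the summary's caution about large datasets.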