Machine learning and statistical approaches to classification – a case study

The advent of information technology has led to the proliferation of data in disparate databases. Organisations have become data rich but knowledge poor. Users need efficient analysis tools to help them understand their data, predict future trends and relationships and generalise to new situations i...

Bibliographic Details
Main Authors: Eyoh, Imo, John, Robert
Format: Conference or Workshop Item
Published: 2017
Online Access: https://eprints.nottingham.ac.uk/51551/
_version_ 1848798521459736576
author Eyoh, Imo
John, Robert
author_facet Eyoh, Imo
John, Robert
author_sort Eyoh, Imo
building Nottingham Research Data Repository
collection Online Access
description The advent of information technology has led to the proliferation of data in disparate databases. Organisations have become data rich but knowledge poor. Users need efficient analysis tools to help them understand their data, predict future trends and relationships, and generalise to new situations in order to make proactive, knowledge-driven decisions in a competitive business world. Thus, there is an urgent need for techniques and tools that intelligently and automatically transform these data into useful information and knowledge for effective decision making. Data mining is considered to be the most appropriate technology for addressing this need. Data mining is the process of extracting or “mining” knowledge from large amounts of data. Regression analysis and classification are two data mining tasks used to predict future trends. In this study, we investigate the behaviour of a statistical model and three machine learning models (artificial neural network, decision tree and support vector machine) on a large electricity dataset. We evaluate their predictive abilities based on this dataset. Results show that machine learning models, for this real-world dataset, outperform statistical regression, while the artificial neural network outperforms the support vector machine and decision tree in the classification task. In terms of comprehensibility, the decision tree is the best choice. Although not definitive, this research indicates that these machine learning methods are an alternative to regression with certain datasets.
first_indexed 2025-11-14T20:21:06Z
format Conference or Workshop Item
id nottingham-51551
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T20:21:06Z
publishDate 2017
recordtype eprints
repository_type Digital Repository
spelling nottingham-51551 2020-05-04T19:04:56Z https://eprints.nottingham.ac.uk/51551/ Machine learning and statistical approaches to classification – a case study Eyoh, Imo John, Robert 2017-09-07 Conference or Workshop Item PeerReviewed Eyoh, Imo and John, Robert (2017) Machine learning and statistical approaches to classification – a case study. In: 15th UK Workshop on Computational Intelligence (UKCI 2015), 7-9 Sep 2015, Exeter, UK.
spellingShingle Eyoh, Imo
John, Robert
Machine learning and statistical approaches to classification – a case study
title Machine learning and statistical approaches to classification – a case study
title_full Machine learning and statistical approaches to classification – a case study
title_fullStr Machine learning and statistical approaches to classification – a case study
title_full_unstemmed Machine learning and statistical approaches to classification – a case study
title_short Machine learning and statistical approaches to classification – a case study
title_sort machine learning and statistical approaches to classification – a case study
url https://eprints.nottingham.ac.uk/51551/