Learning in imbalanced relational data

Bibliographic Details
Main Authors: Ghanem, Amal; Venkatesh, Svetha; West, Geoff
Other Authors: M. Ejiri
Format: Conference Paper
Published: IAPR 2008
Online Access: http://hdl.handle.net/20.500.11937/2826
Description
Summary: Traditional learning techniques learn from flat data files and assume that each class has a similar number of examples. However, most real-world data are stored in relational systems with imbalanced class distributions, where one class is over-represented compared with the others. We propose extending a relational learning technique, Probabilistic Relational Models (PRMs), to handle the class imbalance problem. We address learning from imbalanced relational data with an ensemble of PRMs and propose a new model, PRMs-IM. We demonstrate the performance of PRMs-IM on a real university relational database, where the task is to identify students at risk.
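
The summary does not say how the ensemble is constructed, so the following is only a rough illustration of one common way to apply an ensemble to imbalanced classes, not the PRMs-IM procedure itself. It assumes the majority class is split into chunks about the size of the minority class, each chunk is paired with all minority examples, one model is trained per balanced subset, and predictions are combined by majority vote. The nearest-centroid learner is a stand-in for a PRM, and every function name, label, and data value here is hypothetical.

```python
import random
from collections import Counter


def balanced_subsets(majority, minority, seed=0):
    """Pair every minority example with one minority-sized chunk of the majority class.

    Assumption: this splitting scheme is illustrative; the record does not
    describe how PRMs-IM forms its training sets. The last chunk may be
    smaller when the class sizes do not divide evenly.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = max(1, len(minority))
    chunks = [shuffled[i:i + size] for i in range(0, len(shuffled), size)]
    return [chunk + list(minority) for chunk in chunks]


def train_centroid(examples):
    """Stand-in learner (NOT a PRM): store the mean feature value of each class."""
    sums, counts = {}, Counter()
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, x):
    """Return the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def ensemble_predict(models, x):
    """Combine the per-subset models with an unweighted majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 90 "ok" examples and 10 "at_risk" examples,
# each a (feature, label) pair with a single numeric feature.
majority = [(float(i % 5), "ok") for i in range(90)]
minority = [(10.0 + i, "at_risk") for i in range(10)]

subsets = balanced_subsets(majority, minority)
models = [train_centroid(s) for s in subsets]
prediction_high = ensemble_predict(models, 13.0)
prediction_low = ensemble_predict(models, 1.0)
```

Because each subset contains equal numbers of both classes, no single model is dominated by the over-represented class, while the ensemble as a whole still sees every majority example. A relational learner would additionally need the linked tables for each example, which this flat sketch omits.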