The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions


Bibliographic Details
Main Authors: Goulding, James, Smith, Gavin, Engelmann, Gregor
Format: Conference or Workshop Item
Language: English
Published: 2018
Subjects: M-Money; M-Pesa; Poverty prediction; CDR
Online Access: https://eprints.nottingham.ac.uk/55720/
_version_ 1848799203236511744
author Goulding, James
Smith, Gavin
Engelmann, Gregor
building Nottingham Research Data Repository
collection Online Access
description Emerging economies around the world are often characterized by governments and institutions struggling to keep key demographic data streams up to date. A demographic of particular interest, closely linked to social vulnerability, is that of poverty and socioeconomic status. The combination of mass call detail records (CDR) data with machine learning has recently been proposed as a way to obtain this data without the expense required by traditional census and household survey methods. Based on a sample of 330k mobile phone subscribers resident in Dar es Salaam, Tanzania (7.6m M-Money records, 450.2m call and SMS event logs), this paper demonstrates the improvements that can be made via an alternative data stream: M-Money transaction records. An alternative to traditional banking services, used particularly by citizens unable to obtain a bank account, M-Money transactions provide a currently unexplored but potentially more powerful data set held by the same telecommunication companies. Compared directly with CDR as used in prior work, the results show that M-Money increases socio-demographic classification accuracy (average F1 score) from 65.9% (0.63) to 71.3% (0.7) at much finer-grained spatial regions than previously examined. Notably, the combined use of M-Money and CDR data increases prediction accuracy (average F1 score) only from 71.3% (0.7) to 72.3% (0.71), providing evidence that M-Money informationally subsumes CDR data. The reasons for this and the importance and contributions of individual features are subsequently investigated.
first_indexed 2025-11-14T20:31:56Z
format Conference or Workshop Item
id nottingham-55720
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T20:31:56Z
publishDate 2018
recordtype eprints
repository_type Digital Repository
spelling nottingham-55720 2018-12-13T09:54:14Z https://eprints.nottingham.ac.uk/55720/
2018-12-14 Conference or Workshop Item PeerReviewed application/pdf en https://eprints.nottingham.ac.uk/55720/1/IEEE_Big_Data___The_Unbanked_and_Poverty.pdf Goulding, James, Smith, Gavin and Engelmann, Gregor (2018) The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions. In: 2018 IEEE International Conference on Big Data, 10-13 Dec 2018, Seattle, USA. M-Money M-Pesa Poverty prediction CDR
title The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions
topic M-Money
M-Pesa
Poverty prediction
CDR
url https://eprints.nottingham.ac.uk/55720/