The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions
Emerging economies around the world are often characterized by governments and institutions struggling to keep key demographic data streams up to date. A demographic of interest particularly linked to social vulnerability is that of poverty and socioeconomic status. The combination of mass call deta...
| Main Authors: | Goulding, James; Smith, Gavin; Engelmann, Gregor |
|---|---|
| Format: | Conference or Workshop Item |
| Language: | English |
| Published: | 2018 |
| Subjects: | M-Money; M-Pesa; Poverty prediction; CDR |
| Online Access: | https://eprints.nottingham.ac.uk/55720/ |
| _version_ | 1848799203236511744 |
|---|---|
| author | Goulding, James; Smith, Gavin; Engelmann, Gregor |
| building | Nottingham Research Data Repository |
| collection | Online Access |
| description | Emerging economies around the world are often characterized by governments and institutions that struggle to keep key demographic data streams up to date. One demographic of particular interest, closely linked to social vulnerability, is poverty and socioeconomic status. Combining mass call detail record (CDR) data with machine learning has recently been proposed as a way to obtain this data without the expense of traditional census and household survey methods. Based on a sample of 330k mobile phone subscribers resident in Dar es Salaam, Tanzania (7.6m M-Money records, 450.2m call and SMS event logs), this paper demonstrates the improvements that can be made via an alternative data stream: M-Money transaction records. An alternative to traditional banking services, used particularly by citizens unable to obtain a bank account, M-Money transactions provide a currently unexplored but potentially more powerful data set held by the same telecommunication companies. Compared directly with CDR as used in prior work, M-Money increases socio-demographic classification accuracy (average F1 score) from 65.9% (0.63) to 71.3% (0.7), at much finer-grained spatial regions than previously examined. Notably, combining M-Money and CDR data increases prediction accuracy (average F1 score) only from 71.3% (0.7) to 72.3% (0.71), providing evidence that M-Money informationally subsumes CDR data. The reasons for this and the contributions of individual features are subsequently investigated. |
| first_indexed | 2025-11-14T20:31:56Z |
| format | Conference or Workshop Item |
| id | nottingham-55720 |
| institution | University of Nottingham Malaysia Campus |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-14T20:31:56Z |
| publishDate | 2018 |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | nottingham-55720 2018-12-13T09:54:14Z https://eprints.nottingham.ac.uk/55720/ The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions Goulding, James; Smith, Gavin; Engelmann, Gregor 2018-12-14 Conference or Workshop Item PeerReviewed application/pdf en https://eprints.nottingham.ac.uk/55720/1/IEEE_Big_Data___The_Unbanked_and_Poverty.pdf Goulding, James, Smith, Gavin and Engelmann, Gregor (2018) The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions. In: 2018 IEEE International Conference on Big Data, 10-13 Dec 2018, Seattle, USA. M-Money; M-Pesa; Poverty prediction; CDR |
| title | The unbanked and poverty: predicting area-level socio-economic vulnerability from M-Money transactions |
| topic | M-Money; M-Pesa; Poverty prediction; CDR |
| url | https://eprints.nottingham.ac.uk/55720/ |