Unsupervised record matching with noisy and incomplete data
We consider the problem of duplicate detection in noisy and incomplete data: given a large data set in which each record has multiple entries (attributes), detect which distinct records refer to the same real world entity. This task is complicated by noise (such as misspellings) and missing data, wh...
| Main Authors: | van Gennip, Yves, Hunter, Blake, Ma, Anna, Moyer, Dan, de Vera, Ryan, Bertozzi, Andrea L. |
|---|---|
| Format: | Article |
| Published: | Springer 2018 |
| Subjects: | |
| Online Access: | https://eprints.nottingham.ac.uk/51471/ |
Similar Items
The effect of data cleaning on record linkage quality
by: Randall, Sean, et al.
Published: (2013)
A transparent and transportable methodology for evaluating Data Linkage software
by: Ferrante, Anna, et al.
Published: (2011)
Matching disparate geospatial datasets and validating matches using spatial logic
by: Du, Heshan
Published: (2015)
Use of graph theory measures to identify errors in record linkage
by: Randall, Sean, et al.
Published: (2014)
An evaluation framework for comparing geocoding systems
by: Goldberg, D., et al.
Published: (2013)
Burn injury, gender and cancer risk: population-based cohort study using data from Scotland and Western Australia
by: Duke, Janine, et al.
Published: (2014)
Technical challenges of providing record linkage services for research
by: Boyd, James, et al.
Published: (2014)
Intelligent imputation method for mix data-type missing values to improve data quality
by: Alabadla, Mustafa R. A.
Published: (2024)
Privacy-preserving record linkage on large real world datasets
by: Randall, Sean, et al.
Published: (2014)
Completeness of primary intracranial tumour recording in the Scottish Cancer Registry 2011-12
by: Morling, Joanne R., et al.
Published: (2016)
Localised Gross-error Detection in the Australian Land Gravity Database
by: Sproule, David, et al.
Published: (2006)
Long-term trends and outcomes of anterior vitrectomy in Western Australia
by: Clark, Antony, et al.
Published: (2015)
Data linkage infrastructure for cross-jurisdictional health-related research in Australia
by: Boyd, James, et al.
Published: (2012)
Risk for Retinal Detachment After Phacoemulsification: A Whole-Population Study of Cataract Surgery Outcomes
by: Clark, Antony, et al.
Published: (2012)
Accuracy and completeness of patient pathways – the benefits of national data linkage in Australia
by: Boyd, James, et al.
Published: (2015)
Psychiatric comorbidity in a cohort of heroin and amphetamine users in Perth Western Australia
by: Bartu, Anne, et al.
Published: (2003)
Cross-border hospital use: analysis using data linkage across four Australian states
by: Spilsbury, Katrina, et al.
Published: (2015)
What is the impact of missing Indigenous status on mortality estimates?: an assessment using record linkage in Western Australia
by: Draper, G., et al.
Published: (2009)
Heavy prenatal alcohol exposure and increased risk of stillbirth
by: O'Leary, Colleen, et al.
Published: (2012)
Prevalence of blindness in children
by: Crewe, Julie, et al.
Published: (2012)
Unsupervised Iterative Manifold Alignment via Local Feature Histograms
by: Fan, Ke, et al.
Published: (2014)
International Health Data Linkage Network
by: Smith, M., et al.
Published: (2011)
Duplicate bug report detection using clustering
by: Gopalan, Raj, et al.
Published: (2014)
Impact of population ageing on the cost of hospitalisations for cardiovascular disease: a population-based data linkage study
by: Ha, N., et al.
Published: (2014)
Warehousing of object oriented petroleum data for knowledge mapping
by: Nimmagadda, Shastri, et al.
Published: (2005)
Privacy preserving for electronic health record systems
by: Yussuf, Zakariye Mohamed
Published: (2019)
Visualisation of electronic health records
by: Wang, Qiru
Published: (2025)
Development of CO2 snow cleaning for in situ cleaning of µCMM stylus tips
by: Feng, Xiaobing, et al.
Published: (2016)
Blockchain-based electronic health record system
by: Saleh Habtor, Saleh Abdulaziz
Published: (2019)
A web service service for the dynamic linkage and visualisation of multivariate spatiotemporal information
by: Moncrieff, Simon, et al.
Published: (2013)
The development of a snow cleaning system for micro-CMM stylus tips
by: Feng, Xiaobing, et al.
Published: (2015)
Towards the use of Semi-structured Annotators for Automated Essay Grading
by: Lam, Hon, et al.
Published: (2010)
Can the characteristics of emergency department attendances predict poor hospital outcomes in patients with sepsis?
by: Ibrahim, I., et al.
Published: (2013)
Morbidity associated with amphetamine-related presentations to an emergency department: A record linkage study
by: Fatovich, D., et al.
Published: (2012)
Evolution of the TGF-beta superfamily with emphasis on Nodal
by: Shen, Yuan
Published: (2014)
Processing skyline queries in centralised and distributed incomplete databases
by: Alwan, Ali Amer
Published: (2013)
Discovering Concept Mappings by Similarity Propagation among Substructures
by: Pan, Qi, et al.
Published: (2010)
Enhancing statistical education by using role-plays of consultation
by: Taplin, Ross
Published: (2007)
The effectiveness of e-Government services: a study on the e-Procurement system at the Ministry of Health (MOH) / Suhaida Mohd Kamirazaman
by: Mohd Kamirazaman, Suhaida
Published: (2019)
The satisfaction of government servant on e-Government at selected government agencies in Dungun Terengganu / Nor Aqma Nadira Abu Baker et al.
by: Abu Baker, Nor Aqma Nadira, et al.
Published: (2011)