A transparent and transportable methodology for evaluating Data Linkage software

There has been substantial growth in Data Linkage (DL) activities in recent years. This reflects growth in both the demand for, and the supply of, linked or linkable data. Increased utilisation of DL “services” has brought with it increased need for impartial information about the suitability and per...

Bibliographic Details
Main Authors: Ferrante, Anna, Boyd, James
Format: Journal Article
Published: Elsevier 2011
Subjects: Software evaluation; Linkage quality; Medical record linkage; Data matching
Online Access: http://hdl.handle.net/20.500.11937/4208
_version_ 1848744451128688640
author Ferrante, Anna
Boyd, James
author_facet Ferrante, Anna
Boyd, James
author_sort Ferrante, Anna
building Curtin Institutional Repository
collection Online Access
description There has been substantial growth in Data Linkage (DL) activities in recent years. This reflects growth in both the demand for, and the supply of, linked or linkable data. Increased utilisation of DL “services” has brought with it increased need for impartial information about the suitability and performance capabilities of DL software programs and packages. Although evaluations of DL software exist, most have been restricted to the comparison of two or three packages. Evaluations of a large number of packages are rare because of the time and resource burden placed on the evaluators and the need for a suitable “gold standard” evaluation dataset. In this paper we present an evaluation methodology that overcomes a number of these difficulties. Our approach involves the generation and use of representative synthetic data; the execution of a series of linkages using a pre-defined linkage strategy; and the use of standard linkage quality metrics to assess performance. The methodology is both transparent and transportable, producing genuinely comparable results. The methodology was used by the Centre for Data Linkage (CDL) at Curtin University in an evaluation of ten DL software packages. It is also being used to evaluate larger linkage systems (not just packages). The methodology provides a unique opportunity to benchmark the quality of linkages in different operational environments.
first_indexed 2025-11-14T06:01:40Z
format Journal Article
id curtin-20.500.11937-4208
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T06:01:40Z
publishDate 2011
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-4208 2019-02-19T05:34:53Z A transparent and transportable methodology for evaluating Data Linkage software Ferrante, Anna Boyd, James Software evaluation Linkage quality Medical record linkage Data matching There has been substantial growth in Data Linkage (DL) activities in recent years. This reflects growth in both the demand for, and the supply of, linked or linkable data. Increased utilisation of DL “services” has brought with it increased need for impartial information about the suitability and performance capabilities of DL software programs and packages. Although evaluations of DL software exist, most have been restricted to the comparison of two or three packages. Evaluations of a large number of packages are rare because of the time and resource burden placed on the evaluators and the need for a suitable “gold standard” evaluation dataset. In this paper we present an evaluation methodology that overcomes a number of these difficulties. Our approach involves the generation and use of representative synthetic data; the execution of a series of linkages using a pre-defined linkage strategy; and the use of standard linkage quality metrics to assess performance. The methodology is both transparent and transportable, producing genuinely comparable results. The methodology was used by the Centre for Data Linkage (CDL) at Curtin University in an evaluation of ten DL software packages. It is also being used to evaluate larger linkage systems (not just packages). The methodology provides a unique opportunity to benchmark the quality of linkages in different operational environments. 2011 Journal Article http://hdl.handle.net/20.500.11937/4208 10.1016/j.jbi.2011.10.006 Elsevier fulltext
spellingShingle Software evaluation
Linkage quality
Medical record linkage
Data matching
Ferrante, Anna
Boyd, James
A transparent and transportable methodology for evaluating Data Linkage software
title A transparent and transportable methodology for evaluating Data Linkage software
title_full A transparent and transportable methodology for evaluating Data Linkage software
title_fullStr A transparent and transportable methodology for evaluating Data Linkage software
title_full_unstemmed A transparent and transportable methodology for evaluating Data Linkage software
title_short A transparent and transportable methodology for evaluating Data Linkage software
title_sort transparent and transportable methodology for evaluating data linkage software
topic Software evaluation
Linkage quality
Medical record linkage
Data matching
url http://hdl.handle.net/20.500.11937/4208