Using volunteered geographic information (VGI) in design-based statistical inference for area estimation and accuracy assessment of land cover

Volunteered Geographic Information (VGI) offers a potentially inexpensive source of reference data for estimating area and assessing map accuracy in the context of remote-sensing based land-cover monitoring. The quality of observations from VGI and the typical lack of an underlying probability sampl...

Bibliographic Details
Main Authors: Stehman, Stephen V., Fonte, Cidália C., Foody, Giles M., See, Linda
Format: Article
Published: Elsevier 2018
Subjects:
Online Access: https://eprints.nottingham.ac.uk/51447/
_version_ 1848798498890186752
author Stehman, Stephen V.
Fonte, Cidália C.
Foody, Giles M.
See, Linda
building Nottingham Research Data Repository
collection Online Access
description Volunteered Geographic Information (VGI) offers a potentially inexpensive source of reference data for estimating area and assessing map accuracy in the context of remote-sensing-based land-cover monitoring. The quality of observations from VGI and the typical lack of an underlying probability sampling design raise concerns about using VGI in widely applied design-based statistical inference. This article focuses on the fundamental issue of the sampling design used to acquire VGI. Design-based inference requires the sample data to be obtained via a probability sampling design. Options for incorporating VGI within design-based inference include: 1) directing volunteers to obtain data for locations selected by a probability sampling design; 2) treating VGI data as a “certainty stratum” and augmenting the VGI with data obtained from a probability sample; and 3) using VGI to create an auxiliary variable that is then used in a model-assisted estimator to reduce the standard error of an estimate produced from a probability sample. The latter two options can be implemented using VGI data that were obtained from a non-probability sampling design, but they require additional sample data to be acquired via a probability sampling design. If the only data available are VGI obtained from a non-probability sample, properties of design-based inference that are ensured by probability sampling must be replaced by assumptions that may be difficult to verify. For example, pseudo-estimation weights can be constructed that mimic weights used in stratified sampling estimators. However, accuracy and area estimates produced using these pseudo-weights still require the VGI data to be representative of the full population, a property known as “external validity”.
Because design-based inference requires a probability sampling design, directing volunteers to locations specified by a probability sampling design is the most straightforward option for use of VGI in design-based inference. The certainty stratum approach and the model-assisted approach, both of which combine VGI from a non-probability sample with data from a probability sample, are viable alternatives that meet the conditions required for design-based inference and use the VGI data to reduce standard errors.
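As an illustration of option 2 in the abstract above, the following is a minimal sketch of a certainty stratum estimator for the proportion of area in one land-cover class. It is not taken from the article. It assumes equal-area units, VGI labels that are taken as correct, and a simple random sample drawn without replacement from the units not covered by VGI. The function and parameter names are hypothetical.

```python
import math


def certainty_stratum_proportion(vgi_labels, sample_labels, n_population):
    """Estimate the proportion of units in a class and its standard error.

    vgi_labels    -- 0/1 class indicators for every VGI unit (certainty
                     stratum: each unit has inclusion probability 1)
    sample_labels -- 0/1 indicators for a simple random sample, assumed
                     drawn without replacement from the non-VGI units
    n_population  -- total number of units (VGI plus non-VGI)
    """
    n_vgi = len(vgi_labels)
    n_rest = n_population - n_vgi          # size of the sampled stratum
    n = len(sample_labels)
    if n < 2 or n > n_rest:
        raise ValueError("need 2 <= sample size <= number of non-VGI units")

    # The certainty stratum contributes its observed total with no variance.
    vgi_total = sum(vgi_labels)

    # Sample mean and sample variance in the non-VGI stratum.
    ybar = sum(sample_labels) / n
    s2 = sum((y - ybar) ** 2 for y in sample_labels) / (n - 1)

    # Stratified estimator: known VGI total plus expanded sample mean.
    p_hat = (vgi_total + n_rest * ybar) / n_population

    # Only the sampled stratum adds variance; (1 - n / n_rest) is the
    # finite population correction.
    variance = (n_rest / n_population) ** 2 * (1 - n / n_rest) * s2 / n
    return p_hat, math.sqrt(variance)


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    estimate, std_err = certainty_stratum_proportion(
        vgi_labels=[1, 1, 0, 1],
        sample_labels=[1, 0, 0, 1, 0],
        n_population=104,
    )
    print(f"estimated proportion = {estimate:.4f}, SE = {std_err:.4f}")
```

Because the VGI units enter with weight 1, they add no sampling variance, so the standard error shrinks as the VGI covers a larger share of the population. Any labelling error in the VGI would still bias the estimate.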
first_indexed 2025-11-14T20:20:44Z
format Article
id nottingham-51447
institution University of Nottingham Malaysia Campus
institution_category Local University
last_indexed 2025-11-14T20:20:44Z
publishDate 2018
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling nottingham-51447 2020-05-04T19:44:16Z https://eprints.nottingham.ac.uk/51447/ Elsevier 2018-06-30 Article PeerReviewed Stehman, Stephen V., Fonte, Cidália C., Foody, Giles M. and See, Linda (2018) Using volunteered geographic information (VGI) in design-based statistical inference for area estimation and accuracy assessment of land cover. Remote Sensing of Environment, 212, pp. 47-59. ISSN 0034-4257 https://www.sciencedirect.com/science/article/pii/S0034425718301627 doi:10.1016/j.rse.2018.04.014
title Using volunteered geographic information (VGI) in design-based statistical inference for area estimation and accuracy assessment of land cover
topic Probability sampling; External validity; Pseudo-weights; Data quality; Model-based inference; Volunteered geographic information (VGI); Crowdsourcing
url https://eprints.nottingham.ac.uk/51447/