Accounting for selection and correlation in the analysis of two-stage genome-wide association studies

The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been propo...

Bibliographic Details
Main Authors: Robertson, David S., Prevost, A. Toby, Bowden, Jack
Format: Online
Language: English
Published: Oxford University Press 2016
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031943/
id pubmed-5031943
recordtype oai_dc
spelling pubmed-5031943 2016-09-23 Accounting for selection and correlation in the analysis of two-stage genome-wide association studies Robertson, David S. Prevost, A. Toby Bowden, Jack Articles Oxford University Press 2016-10 2016-03-18 /pmc/articles/PMC5031943/ /pubmed/26993061 http://dx.doi.org/10.1093/biostatistics/kxw012 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Robertson, David S.
Prevost, A. Toby
Bowden, Jack
title Accounting for selection and correlation in the analysis of two-stage genome-wide association studies
description The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39(7), 830–832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account.
publisher Oxford University Press
publishDate 2016
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031943/