Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor

Despite important advances from Genome Wide Association Studies (GWAS), for most complex human traits and diseases, a sizable proportion of genetic variance remains unexplained and prediction accuracy (PA) is usually low. Evidence suggests that PA can be improved using Whole-Genome Regression (WGR)...


Bibliographic Details
Main Authors:	de los Campos, Gustavo; Vazquez, Ana I.; Fernando, Rohan; Klimentidis, Yann C.; Sorensen, Daniel
Format:	Online
Language:	English
Published:	Public Library of Science 2013
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3708840/

id	pubmed-3708840
recordtype	oai_dc
spelling	pubmed-3708840 2013-07-19 Research Article Public Library of Science 2013-07-11 /pmc/articles/PMC3708840/ /pubmed/23874214 http://dx.doi.org/10.1371/journal.pgen.1003608 Text en © 2013 de los Campos et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
repository_type	Open Access Journal
institution_category	Foreign Institution
institution	US National Center for Biotechnology Information
building	NCBI PubMed
collection	Online Access
language	English
format	Online
author	de los Campos, Gustavo; Vazquez, Ana I.; Fernando, Rohan; Klimentidis, Yann C.; Sorensen, Daniel
title	Prediction of Complex Human Traits Using the Genomic Best Linear Unbiased Predictor
description	Despite important advances from Genome-Wide Association Studies (GWAS), for most complex human traits and diseases a sizable proportion of genetic variance remains unexplained and prediction accuracy (PA) is usually low. Evidence suggests that PA can be improved using Whole-Genome Regression (WGR) models, in which phenotypes are regressed on hundreds of thousands of variants simultaneously. The Genomic Best Linear Unbiased Prediction (G-BLUP, a ridge-regression type method) is a commonly used WGR method and has shown good predictive performance when applied to plant and animal breeding populations. However, breeding and human populations differ greatly in a number of factors that can affect the predictive performance of G-BLUP. Using theory, simulations, and real data analysis, we study the performance of G-BLUP when applied to data from related and unrelated human subjects. Under perfect linkage disequilibrium (LD) between markers and QTL, the prediction R-squared (R²) of G-BLUP asymptotically reaches the trait heritability. However, under imperfect LD between markers and QTL, prediction R² based on G-BLUP has a much lower upper bound. We show that the minimum decrease in prediction accuracy caused by imperfect LD between markers and QTL is given by (1−b)², where b is the regression of marker-derived genomic relationships on those realized at causal loci. For pairs of related individuals, due to within-family disequilibrium, the patterns of realized genomic similarity are similar across the genome; therefore b is close to one, inducing only a small decrease in R². However, with distantly related individuals b reaches very low values, imposing a very low upper bound on prediction R². Our simulations suggest that for the analysis of data from unrelated individuals, the asymptotic upper bound on R² may be of the order of 20% of the trait heritability. We show how PA can be enhanced with the use of variable selection or differential shrinkage of estimates of marker effects.
publisher	Public Library of Science
publishDate	2013
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3708840/
_version_	1611994166091317248
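
The description above characterizes G-BLUP as a ridge-regression type method. A minimal Python sketch of that idea follows; it is not the authors' code. It assumes NumPy, a marker matrix coded as allele counts, a genomic relationship matrix built from standardized markers, and a heritability value h2 supplied by the user rather than estimated from the data. The function name gblup_predict and all variable names are hypothetical.

```python
import numpy as np


def gblup_predict(X_train, y_train, X_test, h2):
    """Sketch of G-BLUP prediction for test individuals.

    X_train, X_test: marker matrices (individuals x markers), e.g. allele counts.
    y_train: phenotypes of the training individuals.
    h2: assumed trait heritability in (0, 1); an input here, not an estimate.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    # Standardize markers with training-set means and standard deviations.
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0)
    sd[sd == 0] = 1.0  # avoid dividing by zero for monomorphic markers
    Z_train = (X_train - mu) / sd
    Z_test = (X_test - mu) / sd

    # Marker-derived genomic relationships: train-train and test-train blocks.
    p = X_train.shape[1]
    G_tt = Z_train @ Z_train.T / p
    G_st = Z_test @ Z_train.T / p

    # Shrinkage parameter: ratio of residual to genomic variance implied by h2.
    lam = (1.0 - h2) / h2

    # BLUP of genomic values: G_st (G_tt + lam I)^{-1} (y - mean).
    y_mean = y_train.mean()
    n = y_train.shape[0]
    alpha = np.linalg.solve(G_tt + lam * np.eye(n), y_train - y_mean)
    return y_mean + G_st @ alpha
```

With these definitions, the same predictions are obtained from ridge regression of the centered phenotypes on the standardized markers with penalty p(1 − h2)/h2, where p is the number of markers; this is the sense in which G-BLUP is a ridge-type method.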

Similar Items