Using the Pareto principle in genome-wide breeding value estimation

Genome-wide breeding value (GWEBV) estimation methods can be classified based on the prior distribution assumptions of marker effects. Genome-wide BLUP methods assume a normal prior distribution for all markers with a constant variance, and are computationally fast. In Bayesian methods, more flexibl...

Bibliographic Details
Main Authors:	Yu, Xijiang; Meuwissen, Theo HE
Format:	Online
Language:	English
Published:	BioMed Central 2011
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3354342/

id	pubmed-3354342
recordtype	oai_dc
spelling	pubmed-3354342 2012-05-18 Using the Pareto principle in genome-wide breeding value estimation Yu, Xijiang Meuwissen, Theo HE Research BioMed Central 2011-11-01 /pmc/articles/PMC3354342/ /pubmed/22044555 http://dx.doi.org/10.1186/1297-9686-43-35 Text en Copyright ©2011 Yu and Meuwissen; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
repository_type	Open Access Journal
institution_category	Foreign Institution
institution	US National Center for Biotechnology Information
building	NCBI PubMed
collection	Online Access
language	English
format	Online
author	Yu, Xijiang; Meuwissen, Theo HE
author_sort	Yu, Xijiang
title	Using the Pareto principle in genome-wide breeding value estimation
title_sort	using the pareto principle in genome-wide breeding value estimation
description	Genome-wide breeding value (GWEBV) estimation methods can be classified by the prior distribution they assume for marker effects. Genome-wide BLUP methods assume a normal prior distribution with a constant variance for all markers, and are computationally fast. Bayesian methods apply more flexible prior distributions of SNP effects, which allow for very large SNP effects although most are small or even zero, but these methods are often computationally demanding because they rely on Markov chain Monte Carlo (MCMC) sampling. In this study, we adopted the Pareto principle to weight available marker loci, i.e., we consider that x% of the loci explain (100 - x)% of the total genetic variance. Assuming this principle, it is also possible to define the variances of the prior distribution of the 'big' and 'small' SNP. The relatively few large SNP explain a large proportion of the genetic variance, and the majority of the SNP show small effects and explain a minor proportion of the genetic variance. We name this method MixP, where the prior distribution is a mixture of two normal distributions, i.e., one with a big variance and one with a small variance. Simulation results, using a real Norwegian Red cattle pedigree, show that MixP is at least as accurate as the other methods in all studied cases. This method also reduces the number of hyper-parameters of the prior distribution from 2 (proportion and variance of SNP with big effects) to 1 (proportion of SNP with big effects), assuming the overall genetic variance is known. The mixture-of-normals prior made it possible to solve the equations iteratively, which reduced the computational load by two orders of magnitude. In the era of marker densities reaching millions and whole-genome sequence data, MixP provides a computationally feasible Bayesian method of analysis.
publisher	BioMed Central
publishDate	2011
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3354342/
_version_	1611530465827618816
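
The description states that, under the Pareto principle, x% of the loci explain (100 - x)% of the total genetic variance, and that this fixes the variances of the 'big' and 'small' components of the mixture prior. The following is a minimal sketch of that arithmetic, not the authors' implementation. It assumes the total genetic variance is known, that every marker contributes its prior variance directly to the total (no scaling by allele frequency), and that the function and parameter names are hypothetical.

```python
def pareto_mixture_variances(total_genetic_variance, n_markers, big_fraction):
    """Per-SNP prior variances for a two-component normal mixture.

    Assumption (illustrative, not taken from the paper's equations):
    a fraction `big_fraction` of the markers explains
    (1 - big_fraction) of the total genetic variance, and the remaining
    markers explain the rest. Allele-frequency scaling is ignored.
    """
    if not 0.0 < big_fraction < 1.0:
        raise ValueError("big_fraction must lie strictly between 0 and 1")
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    if total_genetic_variance < 0.0:
        raise ValueError("total_genetic_variance must be non-negative")

    # Expected number of markers in each mixture component.
    n_big = big_fraction * n_markers
    n_small = (1.0 - big_fraction) * n_markers

    # Share of the genetic variance assigned to each component,
    # divided equally among the markers of that component.
    var_big = (1.0 - big_fraction) * total_genetic_variance / n_big
    var_small = big_fraction * total_genetic_variance / n_small
    return var_big, var_small


if __name__ == "__main__":
    # Hypothetical example: 1000 markers, 20% of them 'big'.
    var_big, var_small = pareto_mixture_variances(1.0, 1000, 0.2)
    print("variance of a 'big' SNP effect:  ", var_big)
    print("variance of a 'small' SNP effect:", var_small)
```

Under these assumptions the only free hyper-parameter is the proportion of 'big' SNP, which matches the description's statement that the method needs one hyper-parameter once the overall genetic variance is known.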

Similar Items