Further Improvements to Linear Mixed Models for Genome-Wide Association Studies
We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals...
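Main Authors: | Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David |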
---|---|
Format: | Online |
Language: | English |
Published: | Nature Publishing Group, 2014 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4230738/ |
id |
pubmed-4230738 |
---|---|
recordtype |
oai_dc |
spelling |
pubmed-4230738 2014-11-17 Further Improvements to Linear Mixed Models for Genome-Wide Association Studies Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David Article We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science. Nature Publishing Group 2014-11-12 /pmc/articles/PMC4230738/ /pubmed/25387525 http://dx.doi.org/10.1038/srep06874 Text en Copyright © 2014, Macmillan Publishers Limited. All rights reserved http://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ |
repository_type |
Open Access Journal |
institution_category |
Foreign Institution |
institution |
US National Center for Biotechnology Information |
building |
NCBI PubMed |
collection |
Online Access |
language |
English |
format |
Online |
author |
Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David |
spellingShingle |
Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
author_facet |
Widmer, Christian; Lippert, Christoph; Weissbrod, Omer; Fusi, Nicolo; Kadie, Carl; Davidson, Robert; Listgarten, Jennifer; Heckerman, David |
author_sort |
Widmer, Christian |
title |
Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
title_short |
Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
title_full |
Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
title_fullStr |
Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
title_full_unstemmed |
Further Improvements to Linear Mixed Models for Genome-Wide Association Studies |
title_sort |
further improvements to linear mixed models for genome-wide association studies |
description |
We examine improvements to the linear mixed model (LMM) that better correct for population
structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the
estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity
between every two individuals in a cohort. These similarities are estimated from single
nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs
are used to estimate the GSM. In empirical studies across a wide range of synthetic and real
data, we find that modifications to this approach improve GWAS performance as measured by
type I error control and power. Specifically, when only population structure is present, a
GSM constructed from SNPs that well predict the phenotype in combination with principal
components as covariates controls type I error and yields more power than the traditional
LMM. In any setting, with or without population structure or family relatedness, a GSM
consisting of a mixture of two component GSMs, one constructed from all SNPs and another
constructed from SNPs that well predict the phenotype again controls type I error and yields
more power than the traditional LMM. Software implementing these improvements and the
experimental comparisons are available at http://microsoft.com/science. |
publisher |
Nature Publishing Group |
publishDate |
2014 |
url |
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4230738/ |
_version_ |
1613156191896076288 |