Learning Gene Networks under SNP Perturbations Using eQTL Datasets

The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible t...


Bibliographic Details
Main Authors: Zhang, Lingxue, Kim, Seyoung
Format: Online
Language: English
Published: Public Library of Science 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3937098/
id pubmed-3937098
recordtype oai_dc
spelling pubmed-3937098 2014-03-04 Research Article Public Library of Science 2014-02-27 /pmc/articles/PMC3937098/ /pubmed/24586125 http://dx.doi.org/10.1371/journal.pcbi.1003420 Text en © 2014 Zhang, Kim http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Zhang, Lingxue
Kim, Seyoung
title Learning Gene Networks under SNP Perturbations Using eQTL Datasets
description The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems such as gene knock-out experiments, followed by a genome-wide profiling of differential gene expressions. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulations of the differentially-expressed genes. As an alternative, genetical genomics study has been proposed to treat naturally-occurring genetic variants as potential perturbants of gene regulatory system and to recover gene networks via analysis of population gene-expression and genotype data. Despite many advantages of genetical genomics data analysis, the computational challenge that the effects of multifactorial genetic perturbations should be decoded simultaneously from data has prevented a widespread application of genetical genomics analysis. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of gene regulatory system by a large number of SNPs to identify a gene network along with expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of gene network, by performing inference on the probabilistic graphical model, we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. 
In particular, the yeast gene network identified computationally by our method under SNP perturbations is well supported by the results from experimental perturbation studies related to DNA replication stress response.
publisher Public Library of Science
publishDate 2014
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3937098/