Gene Network Reconstruction by Integration of Prior Biological Knowledge

With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network constructi...


Bibliographic Details
Main Authors: Li, Yupeng; Jackson, Scott A.
Format: Online
Language: English
Published: Genetics Society of America 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4478538/
id pubmed-4478538
recordtype oai_dc
spelling pubmed-4478538 2015-06-29 Gene Network Reconstruction by Integration of Prior Biological Knowledge Li, Yupeng Jackson, Scott A. Investigations With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network construction based on gene expression data that integrates a variety of biological resources. Assuming gene expression data are from a multivariate Gaussian distribution, a graphical lasso (glasso) algorithm is able to estimate the sparse inverse covariance matrix with a lasso (L1) penalty. The entries of the inverse covariance matrix can be interpreted as direct (partial) correlations between gene pairs in the gene association network. In our work, instead of using a single penalty, different penalty values were applied for gene pairs based on a priori knowledge as to whether the two genes should be connected. The a priori information can be calculated or retrieved from other biological data, e.g., Gene Ontology similarity, protein-protein interactions, or gene regulatory networks. By incorporating prior knowledge, the weighted graphical lasso (wglasso) outperforms the original glasso both on simulations and on data from Arabidopsis. Simulation studies show that even when some prior knowledge is not correct, the overall quality of the wglasso network is still greater than when that information is not incorporated, i.e., with glasso.
Genetics Society of America 2015-03-30 /pmc/articles/PMC4478538/ /pubmed/25823587 http://dx.doi.org/10.1534/g3.115.018127 Text en Copyright © 2015 Li and Jackson http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Li, Yupeng
Jackson, Scott A.
title Gene Network Reconstruction by Integration of Prior Biological Knowledge
publisher Genetics Society of America
publishDate 2015
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4478538/
_version_ 1613239489183875072
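
The weighted-penalty idea described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the solver is a generic ADMM scheme for the graphical lasso objective with an element-wise penalty matrix, and the names weighted_glasso and prior_to_penalty, the rho, n_iter, and tol parameters, and the linear mapping from prior confidence to penalty are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(a, t):
    """Element-wise soft-thresholding; t may be a scalar or a matrix."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def prior_to_penalty(prior, base_penalty):
    """Hypothetical mapping from prior confidence to penalty.

    prior[i, j] in [0, 1] is the assumed confidence that genes i and j
    are connected. Strong priors get a small penalty, weak priors get a
    penalty close to base_penalty. The diagonal is left unpenalized.
    """
    penalty = base_penalty * (1.0 - np.asarray(prior, dtype=float))
    np.fill_diagonal(penalty, 0.0)
    return penalty


def weighted_glasso(S, penalty, rho=1.0, n_iter=500, tol=1e-8):
    """Estimate a sparse precision matrix Theta by minimizing

        -log det(Theta) + trace(S @ Theta) + sum_ij penalty[i, j] * |Theta[i, j]|

    with ADMM. S is the sample covariance matrix and penalty is a
    symmetric matrix of non-negative pairwise penalties.
    """
    p = S.shape[0]
    Z = np.eye(p)          # sparse copy of Theta
    U = np.zeros((p, p))   # scaled dual variable
    for _ in range(n_iter):
        # Theta-update: closed form via eigendecomposition.
        eigval, eigvec = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigval + np.sqrt(eigval ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (eigvec * theta_eig) @ eigvec.T
        # Z-update: pair-specific soft-thresholding.
        Z_old = Z
        Z = soft_threshold(Theta + U, penalty / rho)
        # Dual update.
        U = U + Theta - Z
        if np.linalg.norm(Z - Z_old) < tol and np.linalg.norm(Theta - Z) < tol:
            break
    return Z


# Toy usage: a 5-gene chain network (gene k linked to gene k + 1).
p = 5
true_precision = np.eye(p)
for k in range(p - 1):
    true_precision[k, k + 1] = true_precision[k + 1, k] = -0.4
S = np.linalg.inv(true_precision)   # population covariance, for illustration

# Assumed prior: high confidence for chain edges, none for other pairs.
prior = np.zeros((p, p))
for k in range(p - 1):
    prior[k, k + 1] = prior[k + 1, k] = 0.9

theta_hat = weighted_glasso(S, prior_to_penalty(prior, base_penalty=0.5))
edges = np.abs(theta_hat) > 0       # nonzero off-diagonal entries are edges
```

Setting every off-diagonal entry of the penalty matrix to the same value reduces this sketch to the ordinary single-penalty glasso objective, which makes it convenient for comparing the two settings on the same data.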