Gene Network Reconstruction by Integration of Prior Biological Knowledge
With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network constructi...
Main Authors: | Li, Yupeng; Jackson, Scott A. |
---|---|
Format: | Online |
Language: | English |
Published: | Genetics Society of America 2015 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4478538/ |
id |
pubmed-4478538 |
---|---|
recordtype |
oai_dc |
spelling |
pubmed-4478538 2015-06-29. Gene Network Reconstruction by Integration of Prior Biological Knowledge. Li, Yupeng; Jackson, Scott A. Investigations. With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network construction, based on gene expression data, that integrates a variety of biological resources. Assuming gene expression data are from a multivariate Gaussian distribution, the graphical lasso (glasso) algorithm estimates a sparse inverse covariance matrix using a lasso (L1) penalty. The entries of the inverse covariance matrix can be interpreted as direct correlations between gene pairs in the gene association network. In our work, instead of using a single penalty, different penalty values were applied to gene pairs based on a priori knowledge as to whether the two genes should be connected. The a priori information can be calculated or retrieved from other biological data, e.g., Gene Ontology similarity, protein-protein interactions, and gene regulatory networks. By incorporating prior knowledge, the weighted graphical lasso (wglasso) outperforms the original glasso both on simulations and on data from Arabidopsis. Simulation studies show that even when some of the prior knowledge is incorrect, the overall quality of the wglasso network is still higher than when that information is not incorporated, i.e., with glasso.
Genetics Society of America. 2015-03-30. /pmc/articles/PMC4478538/ /pubmed/25823587 http://dx.doi.org/10.1534/g3.115.018127 Text en. Copyright © 2015 Li and Jackson. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
repository_type |
Open Access Journal |
institution_category |
Foreign Institution |
institution |
US National Center for Biotechnology Information |
building |
NCBI PubMed |
collection |
Online Access |
language |
English |
format |
Online |
author |
Li, Yupeng; Jackson, Scott A. |
title |
Gene Network Reconstruction by Integration of Prior Biological Knowledge |
description |
With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network construction, based on gene expression data, that integrates a variety of biological resources. Assuming gene expression data are from a multivariate Gaussian distribution, the graphical lasso (glasso) algorithm estimates a sparse inverse covariance matrix using a lasso (L1) penalty. The entries of the inverse covariance matrix can be interpreted as direct correlations between gene pairs in the gene association network. In our work, instead of using a single penalty, different penalty values were applied to gene pairs based on a priori knowledge as to whether the two genes should be connected. The a priori information can be calculated or retrieved from other biological data, e.g., Gene Ontology similarity, protein-protein interactions, and gene regulatory networks. By incorporating prior knowledge, the weighted graphical lasso (wglasso) outperforms the original glasso both on simulations and on data from Arabidopsis. Simulation studies show that even when some of the prior knowledge is incorrect, the overall quality of the wglasso network is still higher than when that information is not incorporated, i.e., with glasso. |
publisher |
Genetics Society of America |
publishDate |
2015 |
url |
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4478538/ |
_version_ |
1613239489183875072 |
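
Illustrative sketch (not part of the catalog record): the description above says wglasso replaces the single glasso penalty with pair-specific penalties derived from prior knowledge. The Python below is one way to picture that idea, assuming a standard ADMM solver for the L1-penalized Gaussian log-likelihood with an elementwise penalty matrix. It is not the authors' implementation; the function names (`soft_threshold`, `weighted_glasso_admm`), the default `rho`, the simulated chain network, and the penalty values 0.02 and 0.3 are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(a, t):
    """Elementwise soft-thresholding: shrink each entry of a toward zero by t."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def weighted_glasso_admm(S, penalty, rho=1.0, n_iter=1000, tol=1e-6):
    """Sketch: minimize -logdet(Theta) + tr(S Theta) + sum_ij penalty_ij * |Theta_ij|.

    S       : (p, p) sample covariance matrix
    penalty : (p, p) symmetric matrix of non-negative pair-specific penalties
    rho     : ADMM step parameter (hypothetical default, not tuned)
    Returns the sparse estimate Z of the inverse covariance matrix.
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    for _ in range(n_iter):
        # Theta-update: closed form from the eigendecomposition of rho*(Z - U) - S
        eigvals, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigvals + np.sqrt(eigvals ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Z-update: each gene pair is thresholded by its own penalty
        Z_new = soft_threshold(Theta + U, penalty / rho)
        # Scaled dual update
        U = U + Theta - Z_new
        done = (np.linalg.norm(Theta - Z_new) < tol
                and np.linalg.norm(Z_new - Z) < tol)
        Z = Z_new
        if done:
            break
    return Z


# Simulated example: 5 "genes" whose true network is a chain 0-1-2-3-4.
rng = np.random.default_rng(0)
p = 5
true_precision = np.eye(p) - 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(true_precision), size=500)
S = np.cov(X, rowvar=False)

# Hypothetical prior knowledge: small penalty where an edge is expected,
# larger penalty elsewhere; the diagonal is left unpenalized.
penalty = np.full((p, p), 0.3)
for i in range(p - 1):
    penalty[i, i + 1] = penalty[i + 1, i] = 0.02
np.fill_diagonal(penalty, 0.0)

precision_est = weighted_glasso_admm(S, penalty)
edges = np.abs(precision_est) > 1e-8  # inferred adjacency pattern
```

Setting every off-diagonal entry of `penalty` to the same value reduces this sketch to the single-penalty glasso objective, which is the comparison the description draws.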