Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data

We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We ident...


Bibliographic Details
Main Authors: Cheng, Chao; Yan, Koon-Kiu; Hwang, Woochang; Qian, Jiang; Bhardwaj, Nitin; Rozowsky, Joel; Lu, Zhi John; Niu, Wei; Alves, Pedro; Kato, Masaomi; Snyder, Michael; Gerstein, Mark
Format: Online
Language: English
Published: Public Library of Science 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3219617/
id pubmed-3219617
recordtype oai_dc
spelling pubmed-3219617 2011-11-28 Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data Cheng, Chao; Yan, Koon-Kiu; Hwang, Woochang; Qian, Jiang; Bhardwaj, Nitin; Rozowsky, Joel; Lu, Zhi John; Niu, Wei; Alves, Pedro; Kato, Masaomi; Snyder, Michael; Gerstein, Mark Research Article
Public Library of Science 2011-11-17 /pmc/articles/PMC3219617/ /pubmed/22125477 http://dx.doi.org/10.1371/journal.pcbi.1002190 Text en Cheng et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Cheng, Chao
Yan, Koon-Kiu
Hwang, Woochang
Qian, Jiang
Bhardwaj, Nitin
Rozowsky, Joel
Lu, Zhi John
Niu, Wei
Alves, Pedro
Kato, Masaomi
Snyder, Michael
Gerstein, Mark
author_sort Cheng, Chao
title Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data
description We present a network framework for analyzing multi-level regulation in higher eukaryotes based on the systematic integration of various high-throughput datasets. The network, which we call the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on ChIP-Seq binding profiles, and predicted the targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign to each regulatory interaction. Other types of edges, such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes, were further incorporated. We examined the topological structure of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream in the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used C. elegans data from the modENCODE project as a primary model to illustrate our framework, and further verified the results using two other data sets. As more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications.
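A minimal sketch of the network structure the description refers to, assuming a plain edge list with the three regulation types and a sign per edge. The node names (tfA, mirM, geneX and so on), the Edge type, and the mirna_ffls helper are hypothetical and introduced only for illustration; they are not the authors' code or data. The sketch enumerates the miRNA-mediated feed-forward loop, in which a miRNA targets both a TF and a gene regulated by that TF.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    target: str
    kind: str  # "TF->gene", "TF->miRNA", or "miRNA->gene"
    sign: int  # +1 for activation, -1 for repression


# Toy edges with made-up node names; not data from the paper.
EDGES = [
    Edge("tfA", "geneX", "TF->gene", +1),
    Edge("tfA", "geneY", "TF->gene", +1),
    Edge("tfB", "mirM", "TF->miRNA", +1),
    Edge("mirM", "tfA", "miRNA->gene", -1),  # the miRNA represses a TF
    Edge("mirM", "geneX", "miRNA->gene", -1),
    Edge("mirM", "geneZ", "miRNA->gene", -1),
]


def build_adjacency(edges):
    """Map each regulator to the set of nodes it targets."""
    adjacency = defaultdict(set)
    for edge in edges:
        adjacency[edge.source].add(edge.target)
    return adjacency


def mirna_ffls(edges):
    """List (miRNA, TF, gene) triples where the miRNA targets the TF
    and also targets a gene that the TF regulates."""
    adjacency = build_adjacency(edges)
    tfs = {e.source for e in edges if e.kind.startswith("TF->")}
    mirnas = {e.source for e in edges if e.kind == "miRNA->gene"}
    loops = []
    for mirna in sorted(mirnas):
        for tf in sorted(adjacency[mirna] & tfs):
            for gene in sorted(adjacency[mirna] & adjacency[tf]):
                loops.append((mirna, tf, gene))
    return loops


print(mirna_ffls(EDGES))
```

Counting such triples is only the first half of a motif analysis. Judging over-representation would additionally require comparing the count against randomized networks, which this sketch does not attempt.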
publisher Public Library of Science
publishDate 2011
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3219617/