Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation

Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative da...

Bibliographic Details
Main Authors: Szatkiewicz, Jin P., Wang, WeiBo, Sullivan, Patrick F., Wang, Wei, Sun, Wei
Format: Online
Language: English
Published: Oxford University Press 2013
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561969/
id pubmed-3561969
recordtype oai_dc
spelling pubmed-3561969 2013-02-01 Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation Szatkiewicz, Jin P. Wang, WeiBo Sullivan, Patrick F. Wang, Wei Sun, Wei Computational Biology Oxford University Press 2013-02 2012-12-26 /pmc/articles/PMC3561969/ /pubmed/23275535 http://dx.doi.org/10.1093/nar/gks1363 Text en © The Author(s) 2012. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Szatkiewicz, Jin P.
Wang, WeiBo
Sullivan, Patrick F.
Wang, Wei
Sun, Wei
title Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation
description Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
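The description above names two ingredients: a hidden Markov model over discrete copy-number states and a negative binomial model of read depth that accounts for confounders. The following is a minimal sketch of that general idea only, not the GENSENG implementation. Every name and value in it is an assumption made for illustration: the functions `nb_logpmf`, `expected_depth`, and `viterbi_copy_number`, the single GC-content covariate, the baseline depth of 50, the dispersion of 20, and the transition probability of 0.99. The covariate coefficient is fixed by hand here, whereas a regression framework would estimate it from data.

```python
import math


def nb_logpmf(k, mu, size):
    """Log pmf of a negative binomial with mean mu and dispersion parameter size."""
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(size / (size + mu))
            + k * math.log(mu / (size + mu)))


def expected_depth(copy_number, gc, base_mu=50.0, gc_coef=1.0):
    """Toy log-linear mean: depth scales with copy number and a GC covariate.

    base_mu and gc_coef are hypothetical, hand-picked values.
    """
    # Copy number 0 gets a small positive factor so the mean stays above zero.
    cn_factor = max(copy_number / 2.0, 0.01)
    return base_mu * cn_factor * math.exp(gc_coef * (gc - 0.5))


def viterbi_copy_number(depths, gcs, states=(0, 1, 2, 3, 4),
                        stay_prob=0.99, size=20.0):
    """Most likely copy-number path for per-window read depths (Viterbi)."""
    n_states = len(states)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n_states - 1))

    def trans(prev, cur):
        return log_stay if prev == cur else log_move

    def emit(t, s):
        return nb_logpmf(depths[t], expected_depth(states[s], gcs[t]), size)

    # Uniform prior over states for the first window.
    score = [-math.log(n_states) + emit(0, s) for s in range(n_states)]
    back = []
    for t in range(1, len(depths)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + trans(p, s))
            new_score.append(score[best_prev] + trans(best_prev, s) + emit(t, s))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[s] for s in path]


# Thirty windows at neutral GC content with a ten-window drop in depth.
depths = [50] * 10 + [25] * 10 + [50] * 10
gcs = [0.5] * 30
calls = viterbi_copy_number(depths, gcs)
```

Because the covariate enters the expected depth rather than being removed in a separate preprocessing step, each window's likelihood is evaluated against a depth already adjusted for its GC content, which is one way to read "simultaneous" bias correction and segmentation.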
publisher Oxford University Press
publishDate 2013
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561969/