An ensemble approach to accurately detect somatic mutations using SomaticSeq

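SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated.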

Bibliographic Details
Main Authors: Fang, Li Tai, Afshar, Pegah Tootoonchi, Chhibber, Aparna, Mohiyuddin, Marghoob, Fan, Yu, Mu, John C., Gibeling, Greg, Barr, Sharon, Asadi, Narges Bani, Gerstein, Mark B., Koboldt, Daniel C., Wang, Wenyi, Wong, Wing H., Lam, Hugo YK
Format: Online
Language: English
Published: BioMed Central 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4574535/
id pubmed-4574535
recordtype oai_dc
spelling pubmed-45745352015-09-19 An ensemble approach to accurately detect somatic mutations using SomaticSeq Fang, Li Tai Afshar, Pegah Tootoonchi Chhibber, Aparna Mohiyuddin, Marghoob Fan, Yu Mu, John C. Gibeling, Greg Barr, Sharon Asadi, Narges Bani Gerstein, Mark B. Koboldt, Daniel C. Wang, Wenyi Wong, Wing H. Lam, Hugo YK Software SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated. BioMed Central 2015-09-17 2015 /pmc/articles/PMC4574535/ /pubmed/26381235 http://dx.doi.org/10.1186/s13059-015-0758-2 Text en © Fang et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Fang, Li Tai
Afshar, Pegah Tootoonchi
Chhibber, Aparna
Mohiyuddin, Marghoob
Fan, Yu
Mu, John C.
Gibeling, Greg
Barr, Sharon
Asadi, Narges Bani
Gerstein, Mark B.
Koboldt, Daniel C.
Wang, Wenyi
Wong, Wing H.
Lam, Hugo YK
author_sort Fang, Li Tai
title An ensemble approach to accurately detect somatic mutations using SomaticSeq
description SomaticSeq is an accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses. We validate our results with both synthetic and real data. We report that SomaticSeq is able to achieve better overall accuracy than any individual tool incorporated.
publisher BioMed Central
publishDate 2015
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4574535/
_version_ 1613477090893496320
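
The description above says that a training set of per-site features is given to an adaptively boosted decision tree learner, which then classifies candidate sites. The following is a minimal sketch of that general idea, assuming one-level decision trees (stumps) as the weak learner and a plain AdaBoost loop. It is not SomaticSeq's code: the three features (number of callers reporting the site, tumor variant allele fraction, mean mapping quality), the toy values, and all function names are hypothetical, and the toy data are deliberately easy so the behaviour is simple to follow.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] >= thr else -polarity

def adaboost_fit(X, y, rounds=5):
    """Fit a list of (alpha, stump) pairs with the standard AdaBoost updates."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so alpha stays finite when a stump is perfect.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified sites, down-weight correct ones, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical training sites: [callers reporting the site, tumor VAF, mean MAPQ]
X_train = [
    [5, 0.30, 60], [4, 0.25, 58], [3, 0.20, 55], [1, 0.22, 59],  # true somatic
    [1, 0.03, 30], [2, 0.04, 25], [3, 0.02, 20], [1, 0.05, 40],  # false calls
]
y_train = [1, 1, 1, 1, -1, -1, -1, -1]

model = adaboost_fit(X_train, y_train)
train_preds = [adaboost_predict(model, row) for row in X_train]
new_calls = [adaboost_predict(model, [4, 0.28, 57]),
             adaboost_predict(model, [2, 0.03, 28])]
```

Note that the number of callers alone does not separate these toy sites (a three-caller site can be false and a one-caller site true), which is the motivation for learning from several features rather than taking a simple vote among callers.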