Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm

Bibliographic Details
Main Authors: Löytynoja, Ari; Vilella, Albert J.; Goldman, Nick
Format: Online
Language: English
Published: Oxford University Press 2012
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3381962/
Description
Summary: Motivation: Accurate alignment of large numbers of sequences is demanding, and the computational burden is further increased by downstream analyses that depend on these alignments. With the abundance of sequence data, an integrative approach that adds new sequences to existing alignments without full re-computation, while maintaining the relative matching of existing sequences, is an attractive option. Another current challenge is the extension of reference alignments with fragmented sequences, such as those coming from next-generation metagenomics, which contain relatively little information. Widely used methods for alignment extension are based on profile representations of the reference sequences. These do not incorporate or use phylogenetic information and are affected by the composition of the reference alignment and the phylogenetic positions of the query sequences.