Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps

We prove that maximum likelihood phylogenetic inference is consistent on gapped multiple sequence alignments (MSAs) as long as substitution rates across each edge are greater than zero, under mild assumptions on the structure of the alignment. Under these assumptions, maximum likelihood will asympto...

Full description

Bibliographic Details
Main Authors: Truszkowski, Jakub, Goldman, Nick
Format: Online
Language:English
Published: Oxford University Press 2016
Online Access:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4748752/
id pubmed-4748752
recordtype oai_dc
spelling pubmed-47487522016-02-11 Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps Truszkowski, Jakub Goldman, Nick Regular Articles We prove that maximum likelihood phylogenetic inference is consistent on gapped multiple sequence alignments (MSAs) as long as substitution rates across each edge are greater than zero, under mild assumptions on the structure of the alignment. Under these assumptions, maximum likelihood will asymptotically recover the tree with edge lengths corresponding to the mean number of substitutions per site on each edge. This refutes Warnow's recent suggestion (Warnow 2012) that maximum likelihood phylogenetic inference might be statistically inconsistent when gaps are treated as missing data, even if the MSA is correct. We also derive a simple new proof of maximum likelihood consistency of ungapped alignments. Oxford University Press 2016-03 2015-11-28 /pmc/articles/PMC4748752/ /pubmed/26615177 http://dx.doi.org/10.1093/sysbio/syv089 Text en © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Truszkowski, Jakub
Goldman, Nick
spellingShingle Truszkowski, Jakub
Goldman, Nick
Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
author_facet Truszkowski, Jakub
Goldman, Nick
author_sort Truszkowski, Jakub
title Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
title_short Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
title_full Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
title_fullStr Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
title_full_unstemmed Maximum Likelihood Phylogenetic Inference is Consistent on Multiple Sequence Alignments, with or without Gaps
title_sort maximum likelihood phylogenetic inference is consistent on multiple sequence alignments, with or without gaps
description We prove that maximum likelihood phylogenetic inference is consistent on gapped multiple sequence alignments (MSAs) as long as substitution rates across each edge are greater than zero, under mild assumptions on the structure of the alignment. Under these assumptions, maximum likelihood will asymptotically recover the tree with edge lengths corresponding to the mean number of substitutions per site on each edge. This refutes Warnow's recent suggestion (Warnow 2012) that maximum likelihood phylogenetic inference might be statistically inconsistent when gaps are treated as missing data, even if the MSA is correct. We also derive a simple new proof of maximum likelihood consistency of ungapped alignments.
publisher Oxford University Press
publishDate 2016
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4748752/
_version_ 1613536749038862336