Molecular phylogenetics before sequences: Oligonucleotide catalogs as k-mer spectra

From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus.


Bibliographic Details
Main Authors: Ragan, Mark A; Bernard, Guillaume; Chan, Cheong Xin
Format: Online
Language: English
Published: Landes Bioscience 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008546/
id pubmed-4008546
recordtype oai_dc
spelling pubmed-4008546 2015-03-01 Review Landes Bioscience 2014-03-01 2014-01-14 /pmc/articles/PMC4008546/ /pubmed/24572375 http://dx.doi.org/10.4161/rna.27505 Text en Copyright © 2014 Landes Bioscience http://creativecommons.org/licenses/by-nc/3.0/ This is an open-access article licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License. The article may be redistributed, reproduced, and reused for non-commercial purposes, provided the original source is properly cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Ragan, Mark A
Bernard, Guillaume
Chan, Cheong Xin
title Molecular phylogenetics before sequences: Oligonucleotide catalogs as k-mer spectra
description From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus. They recognized a third domain of life, revealed the phylogenetic backbone of bacteria (and its limitations), delineated taxa, and explored the tempo and mode of microbial evolution. For these discoveries to have stood the test of time, oligonucleotide catalogs must carry significant phylogenetic signal; they thus bear re-examination in view of the current interest in alignment-free phylogenetics based on k-mers. Here we consider the aims, successes, and limitations of this early phase of molecular phylogenetics. We computationally generate oligonucleotide sets (e-catalogs) from 16S/18S rRNA sequences, calculate pairwise distances between them based on D2 statistics, compute distance trees, and compare their performance against alignment-based and k-mer trees. Although the catalogs themselves were superseded by full-length sequences, this stage in the development of computational molecular biology remains instructive for us today.
publisher Landes Bioscience
publishDate 2014
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008546/