RGFA: powerful and convenient handling of assembly graphs

The “Graphical Fragment Assembly” (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to...

Bibliographic Details
Main Authors: Gonnella, Giorgio; Kurtz, Stefan
Format: Online
Language: English
Published: PeerJ Inc. 2016
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/
id pubmed-5103826
recordtype oai_dc
spelling pubmed-5103826 2016-11-14 RGFA: powerful and convenient handling of assembly graphs Gonnella, Giorgio Kurtz, Stefan Bioinformatics The “Graphical Fragment Assembly” (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA. Furthermore, we show how the API provided by RGFA can be employed to design complex graph editing algorithms. As an example, we developed a detection algorithm for CRISPRs in a de Bruijn graph. Finally, RGFA can be used for comparing assembly graphs, e.g., to document the changes in a graph after applying a GUI editor. A program, GFAdiff, is provided, which compares the information in two graphs and generates a report or a Ruby script documenting the transformation steps between the graphs. PeerJ Inc. 2016-11-08 /pmc/articles/PMC5103826/ /pubmed/27843717 http://dx.doi.org/10.7717/peerj.2681 Text en ©2016 Gonnella and Kurtz http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
repository_type Open Access Journal
institution_category Foreign Institution
institution US National Center for Biotechnology Information
building NCBI PubMed
collection Online Access
language English
format Online
author Gonnella, Giorgio
Kurtz, Stefan
title RGFA: powerful and convenient handling of assembly graphs
description The “Graphical Fragment Assembly” (GFA) is an emerging format for the representation of sequence assembly graphs, which can be adopted by both de Bruijn graph- and string graph-based assemblers. Here we present RGFA, an implementation of the proposed GFA specification in Ruby. It allows the user to conveniently parse, edit and write GFA files. Complex operations such as the separation of the implicit instances of repeats and the merging of linear paths can be performed. A typical application of RGFA is the editing of a graph, to finish the assembly of a sequence, using information not available to the assembler. We illustrate a use case, in which the assembly of a repetitive metagenomic fosmid insert was completed using a script based on RGFA. Furthermore, we show how the API provided by RGFA can be employed to design complex graph editing algorithms. As an example, we developed a detection algorithm for CRISPRs in a de Bruijn graph. Finally, RGFA can be used for comparing assembly graphs, e.g., to document the changes in a graph after applying a GUI editor. A program, GFAdiff, is provided, which compares the information in two graphs and generates a report or a Ruby script documenting the transformation steps between the graphs.
publisher PeerJ Inc.
publishDate 2016
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103826/