Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2

Long-read sequencing technologies have improved significantly since their emergence. Their read lengths, potentially spanning entire transcripts, are advantageous for reconstructing transcriptomes. Existing long-read transcriptome assembly methods are primarily reference-based and, to date, there is l...

Bibliographic Details
Main Authors: Nip, K.M., Hafezqorani, S., Gagalova, Kristina, Chiu, R., Yang, C., Warren, R.L., Birol, I.
Format: Journal Article
Language: English
Published: 2023
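Subjects: Transcriptome; RNA; High-Throughput Nucleotide Sequencing; Gene Expression Profiling; Sequence Analysis, RNA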
Online Access: http://hdl.handle.net/20.500.11937/96867
_version_ 1848766201449152512
author Nip, K.M.
Hafezqorani, S.
Gagalova, Kristina
Chiu, R.
Yang, C.
Warren, R.L.
Birol, I.
author_sort Nip, K.M.
building Curtin Institutional Repository
collection Online Access
description Long-read sequencing technologies have improved significantly since their emergence. Their read lengths, potentially spanning entire transcripts, are advantageous for reconstructing transcriptomes. Existing long-read transcriptome assembly methods are primarily reference-based and, to date, there has been little focus on reference-free transcriptome assembly. We introduce RNA-Bloom2 (https://github.com/bcgsc/RNA-Bloom), a reference-free assembly method for long-read transcriptome sequencing data. Using simulated datasets and spike-in control data, we show that the transcriptome assembly quality of RNA-Bloom2 is competitive with that of reference-based methods. Furthermore, we find that RNA-Bloom2 requires 27.0 to 80.6% of the peak memory and 3.6 to 10.8% of the total wall-clock runtime of a competing reference-free method. Finally, we showcase RNA-Bloom2 in assembling a transcriptome sample of Picea sitchensis (Sitka spruce). Since our method does not rely on a reference, it further sets the groundwork for large-scale comparative transcriptomics where high-quality draft genome assemblies are not readily available.
first_indexed 2025-11-14T11:47:23Z
format Journal Article
id curtin-20.500.11937-96867
institution Curtin University Malaysia
institution_category Local University
language eng
last_indexed 2025-11-14T11:47:23Z
publishDate 2023
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-96867 2025-02-13T00:58:13Z 2023 Journal Article http://hdl.handle.net/20.500.11937/96867 10.1038/s41467-023-38553-y eng http://creativecommons.org/licenses/by/4.0/ fulltext
title Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2
topic Transcriptome
RNA
High-Throughput Nucleotide Sequencing
Gene Expression Profiling
Sequence Analysis, RNA
url http://hdl.handle.net/20.500.11937/96867