Faster algorithms for computing maximal multirepeats in multiple sequences

A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least mmin times (mmin greater than/equal to 2) in e...

Full description

Bibliographic Details
Main Authors: Iliopoulos, Costas, Smyth, Bill, Yusufu, M.
Format: Journal Article
Published: IOS Press 2009
Subjects:
Online Access:http://hdl.handle.net/20.500.11937/22370
_version_ 1848750852181852160
author Iliopoulos, Costas
Smyth, Bill
Yusufu, M.
author_facet Iliopoulos, Costas
Smyth, Bill
Yusufu, M.
author_sort Iliopoulos, Costas
building Curtin Institutional Repository
collection Online Access
description A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least mmin times (mmin greater than/equal to 2) in each of at least q greater than/equal to 1 strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string.
first_indexed 2025-11-14T07:43:25Z
format Journal Article
id curtin-20.500.11937-22370
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T07:43:25Z
publishDate 2009
publisher IOS Press
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-223702017-01-30T12:30:58Z Faster algorithms for computing maximal multirepeats in multiple sequences Iliopoulos, Costas Smyth, Bill Yusufu, M. gaps maximal multirepeats biological sequences repeats suffix arrays A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least mmin times (mmin greater than/equal to 2) in each of at least q greater than/equal to 1 strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string. 2009 Journal Article http://hdl.handle.net/20.500.11937/22370 IOS Press fulltext
spellingShingle gaps
maximal multirepeats
biological sequences
repeats
suffix arrays
Iliopoulos, Costas
Smyth, Bill
Yusufu, M.
Faster algorithms for computing maximal multirepeats in multiple sequences
title Faster algorithms for computing maximal multirepeats in multiple sequences
title_full Faster algorithms for computing maximal multirepeats in multiple sequences
title_fullStr Faster algorithms for computing maximal multirepeats in multiple sequences
title_full_unstemmed Faster algorithms for computing maximal multirepeats in multiple sequences
title_short Faster algorithms for computing maximal multirepeats in multiple sequences
title_sort faster algorithms for computing maximal multirepeats in multiple sequences
topic gaps
maximal multirepeats
biological sequences
repeats
suffix arrays
url http://hdl.handle.net/20.500.11937/22370