Fast optimal algorithms for computing all the repeats in a string
Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an o...
| Main Authors: | , , |
|---|---|
| Other Authors: | |
| Format: | Conference Paper |
| Published: |
PSC
2008
|
| Online Access: | http://www.stringology.org/event/2008/p15.html http://hdl.handle.net/20.500.11937/35833 |
| _version_ | 1848754603632361472 |
|---|---|
| author | Puglisi, Simon Smyth, Bill Yusufu, M. |
| author2 | Jan Holub |
| author_facet | Jan Holub Puglisi, Simon Smyth, Bill Yusufu, M. |
| author_sort | Puglisi, Simon |
| building | Curtin Institutional Repository |
| collection | Online Access |
| description | Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an order of magnitude faster than the two other algorithms previously proposed for this problem. Second, we describe a new fast algorithm PSY2 for computing all complete supernonextendible repeats in x that also executes in Θ(n) time independent of alphabet size, thus asymptotically faster than methods previously proposed. Both algorithms require 9n bytes of storage, including preprocessing (with a minor caveat for PSY1). We conclude with a brief discussion of applications to bioinformatics and data compression. |
| first_indexed | 2025-11-14T08:43:02Z |
| format | Conference Paper |
| id | curtin-20.500.11937-35833 |
| institution | Curtin University Malaysia |
| institution_category | Local University |
| last_indexed | 2025-11-14T08:43:02Z |
| publishDate | 2008 |
| publisher | PSC |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | curtin-20.500.11937-358332022-12-07T06:50:49Z Fast optimal algorithms for computing all the repeats in a string Puglisi, Simon Smyth, Bill Yusufu, M. Jan Holub Jan Zdarek Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an order of magnitude faster than the two other algorithms previously proposed for this problem. Second, we describe a new fast algorithm PSY2 for computing all complete supernonextendible repeats in x that also executes in Θ(n) time independent of alphabet size, thus asymptotically faster than methods previously proposed. Both algorithms require 9n bytes of storage, including preprocessing (with a minor caveat for PSY1). We conclude with a brief discussion of applications to bioinformatics and data compression. 2008 Conference Paper http://hdl.handle.net/20.500.11937/35833 http://www.stringology.org/event/2008/p15.html PSC restricted |
| spellingShingle | Puglisi, Simon Smyth, Bill Yusufu, M. Fast optimal algorithms for computing all the repeats in a string |
| title | Fast optimal algorithms for computing all the repeats in a string |
| title_full | Fast optimal algorithms for computing all the repeats in a string |
| title_fullStr | Fast optimal algorithms for computing all the repeats in a string |
| title_full_unstemmed | Fast optimal algorithms for computing all the repeats in a string |
| title_short | Fast optimal algorithms for computing all the repeats in a string |
| title_sort | fast optimal algorithms for computing all the repeats in a string |
| url | http://www.stringology.org/event/2008/p15.html http://hdl.handle.net/20.500.11937/35833 |