Fast optimal algorithms for computing all the repeats in a string

Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an o...

Full description

Bibliographic Details
Main Authors:	Puglisi, Simon, Smyth, Bill, Yusufu, M.
Other Authors:	Jan Holub
Format:	Conference Paper
Published:	PSC 2008
Online Access:	http://www.stringology.org/event/2008/p15.html http://hdl.handle.net/20.500.11937/35833

_version_	1848754603632361472
author	Puglisi, Simon Smyth, Bill Yusufu, M.
author2	Jan Holub
author_facet	Jan Holub Puglisi, Simon Smyth, Bill Yusufu, M.
author_sort	Puglisi, Simon
building	Curtin Institutional Repository
collection	Online Access
description	Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an order of magnitude faster than the two other algorithms previously proposed for this problem. Second, we describe a new fast algorithm PSY2 for computing all complete supernonextendible repeats in x that also executes in Θ(n) time independent of alphabet size, thus asymptotically faster than methods previously proposed. Both algorithms require 9n bytes of storage, including preprocessing (with a minor caveat for PSY1). We conclude with a brief discussion of applications to bioinformatics and data compression.
first_indexed	2025-11-14T08:43:02Z
format	Conference Paper
id	curtin-20.500.11937-35833
institution	Curtin University Malaysia
institution_category	Local University
last_indexed	2025-11-14T08:43:02Z
publishDate	2008
publisher	PSC
recordtype	eprints
repository_type	Digital Repository
spelling	curtin-20.500.11937-358332022-12-07T06:50:49Z Fast optimal algorithms for computing all the repeats in a string Puglisi, Simon Smyth, Bill Yusufu, M. Jan Holub Jan Zdarek Given a string x = x[1..n] on an alphabet of size a, and a threshold pmin = 1, we first describe a new algorithm PSY1 that, based on suffix array construction, computes all the complete nonextendible repeats in x of length p = pmin. PSY1 executes in Θ(n) time independent of alphabet size and is an order of magnitude faster than the two other algorithms previously proposed for this problem. Second, we describe a new fast algorithm PSY2 for computing all complete supernonextendible repeats in x that also executes in Θ(n) time independent of alphabet size, thus asymptotically faster than methods previously proposed. Both algorithms require 9n bytes of storage, including preprocessing (with a minor caveat for PSY1). We conclude with a brief discussion of applications to bioinformatics and data compression. 2008 Conference Paper http://hdl.handle.net/20.500.11937/35833 http://www.stringology.org/event/2008/p15.html PSC restricted
spellingShingle	Puglisi, Simon Smyth, Bill Yusufu, M. Fast optimal algorithms for computing all the repeats in a string
title	Fast optimal algorithms for computing all the repeats in a string
title_full	Fast optimal algorithms for computing all the repeats in a string
title_fullStr	Fast optimal algorithms for computing all the repeats in a string
title_full_unstemmed	Fast optimal algorithms for computing all the repeats in a string
title_short	Fast optimal algorithms for computing all the repeats in a string
title_sort	fast optimal algorithms for computing all the repeats in a string
url	http://www.stringology.org/event/2008/p15.html http://hdl.handle.net/20.500.11937/35833

Fast optimal algorithms for computing all the repeats in a string

Similar Items