Lempel-Ziv factorization using less time & space

For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time...

Full description

Bibliographic Details
Main Authors: Chen, G., Puglisi, Simon, Smyth, B.
Format: Journal Article
Published: Birkhauser 2008
Subjects:
Online Access:http://hdl.handle.net/20.500.11937/45235
_version_ 1848757226609573888
author Chen, G.
Puglisi, Simon
Smyth, B.
author_facet Chen, G.
Puglisi, Simon
Smyth, B.
author_sort Chen, G.
building Curtin Institutional Repository
collection Online Access
description For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel-Ziv factorization algorithm based on an 'enhanced' suffix array - that is, a suffix array SA x together with supporting data structures, principally an 'interval tree'. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true T(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.
first_indexed 2025-11-14T09:24:44Z
format Journal Article
id curtin-20.500.11937-45235
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T09:24:44Z
publishDate 2008
publisher Birkhauser
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-452352017-09-13T15:58:41Z Lempel-Ziv factorization using less time & space Chen, G. Puglisi, Simon Smyth, B. LZ factorization suffix tree suffix array Lempel-Ziv factorization For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel-Ziv factorization algorithm based on an 'enhanced' suffix array - that is, a suffix array SA x together with supporting data structures, principally an 'interval tree'. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true T(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether. 2008 Journal Article http://hdl.handle.net/20.500.11937/45235 10.1007/s11786-007-0024-4 Birkhauser restricted
spellingShingle LZ factorization
suffix tree
suffix array
Lempel-Ziv factorization
Chen, G.
Puglisi, Simon
Smyth, B.
Lempel-Ziv factorization using less time & space
title Lempel-Ziv factorization using less time & space
title_full Lempel-Ziv factorization using less time & space
title_fullStr Lempel-Ziv factorization using less time & space
title_full_unstemmed Lempel-Ziv factorization using less time & space
title_short Lempel-Ziv factorization using less time & space
title_sort lempel-ziv factorization using less time & space
topic LZ factorization
suffix tree
suffix array
Lempel-Ziv factorization
url http://hdl.handle.net/20.500.11937/45235