Lempel-Ziv factorization using less time & space

For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time...

Full description

Bibliographic Details
Main Authors:	Chen, G., Puglisi, Simon, Smyth, B.
Format:	Journal Article
Published:	Birkhauser 2008
Subjects:	LZ factorization suffix tree suffix array Lempel-Ziv factorization
Online Access:	http://hdl.handle.net/20.500.11937/45235

Description
Summary:	For 30 years the Lempel-Ziv factorization LZ x of a string x = x[1..n] has been a fundamental data structure of string processing, especially valuable for string compression and for computing all the repetitions (runs) in x. Traditionally the standard method for computing LZ x was based on T(n)-time (or, depending on the measure used, O(n log n)-time) processing of the suffix tree ST x of x. Recently Abouelhoda et al. proposed an efficient Lempel-Ziv factorization algorithm based on an 'enhanced' suffix array - that is, a suffix array SA x together with supporting data structures, principally an 'interval tree'. In this paper we introduce a collection of fast space-efficient algorithms for LZ factorization, also based on suffix arrays, that in theory as well as in many practical circumstances are superior to those previously proposed; one family out of this collection achieves true T(n)-time alphabet-independent processing in the worst case by avoiding tree structures altogether.

Lempel-Ziv factorization using less time & space

Similar Items