Abstract
We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes then the problems can be solved deterministically in polynomial time in the case when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms for compressed texts without decompressing. Denote by LZ(ω) the version of a string ω produced by Lempel-Ziv encoding algorithm. For given compressed strings LZ(T), LZ(P) we give the first known deterministic polynomial time algorithms to compute compressed representations of the set of all occurrences of the pattern P in T, all periods of T, all palindromes of T, and all squares of T. Then we consider several classical language recognition problems:
-
regular language recognition: given LZ(T) and a language L described by a regular expression, test if T ε L,
-
extended regular language recognition: given LZ(T) and a language L described by a LZ-compressed regular expression, test if T ε L, the alphabet is unary,
-
context-free language recognition: given LZ(T) and a language L described by a context-free grammar, test if T ε L, the alphabet is unary.
We show that the first recognition problem has a polynomial time algorithm and the other two problems are NP-hard.
We show also that the LZ encoding can be computed on-line in polynomial time delay and small space (i.e. proportional to the size of the compressed text). Also the compressed representation of a patternmatching automaton for the compressed pattern is computed in polynomial time.
On leave from Institute of Informatics, Warsaw University, ul. Banacha 2, 02097, Warszawa, Poland.
This research was partially supported by the DFG Grant KA 673/4-1, and by the ESPRIT BR Grant 7097 and the ECUS 030.
Supported partially by the grant KBN 8T11C01208.
Supported partially by the grant KBN 8T11C01208.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
A. Amir, G. Benson and M. Farach, Let sleeping files lie: pattern-matching in Z-compressed files, in SODA '94.
A. Amir, G. Benson, Efficient two dimensional compressed matching, Proc. of the 2nd IEEE Data Compression Conference 279–288 (1992).
A. Amir, G. Benson and M. Farach, Optimal two-dimensional compressed matching, in ICALP'94.
A. Apostolico, D. Breslauer, Z. Galil, Optimal parallel algorithms for periods, palindromes and squares, in ICALP'92, pp. 296–307.
M. Crochemore and W. Rytter, Text Algorithms, Oxford University Press, New York (1994).
M. Farach and M. Thorup, String matching in Lempel-Ziv compressed strings, in STOC'95, pp. 703–712.
M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman (1979).
L. Gasieniec, M. Karpiński, W. Plandowski and W. Rytter, Randomized Efficient Algorithms for Compressed Strings: the Finger-Print Approach, to appear in proceedings of the 7th Combinatorial Pattern Matching, Laguna Beach (1996).
M. Karpinski, W. Rytter and A. Shinohara, Pattern-matching for strings with short description, in Combinatorial Pattern Matching, 1995.
D. Knuth, The Art of Computing, Vol. II: Seminumerical Algorithms. Second edition. Addison-Wesley, 1981.
A. Lempel and J. Ziv, On the complexity of finite sequences, IEEE Trans. on Inf. Theory 22, 75–81 (1976).
W. Plandowski, Testing equivalence of morphisms on context-free languages, ESA'94, Lecture Notes in Computer Science 855, Springer-Verlag, 460–470 (1994).
J. Storer, Data compression: methods and theory, Computer Science Press, Rockville, Maryland, 1988.
J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. on Inf. Theory vo. IT-23(3), 337–343, 1977.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gasieniec, L., Karpinski, M., Plandowski, W., Rytter, W. (1996). Efficient algorithms for Lempel-Ziv encoding. In: Karlsson, R., Lingas, A. (eds) Algorithm Theory — SWAT'96. SWAT 1996. Lecture Notes in Computer Science, vol 1097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61422-2_148
Download citation
DOI: https://doi.org/10.1007/3-540-61422-2_148
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61422-7
Online ISBN: 978-3-540-68529-6
eBook Packages: Springer Book Archive