Abstract
The basic string matching problem is to determine the occurrences of a short pattern P=p 1 p 2 ... p m in a large text T=t 1 t 2 ... t n , over an alphabet of size σ. Indexes are structures built on the text to speed up searches, but they used to take up much space. In recent years, succinct text indexes have appeared. A prominent example is the FM-index [2], which takes little space (close to that of the compressed text) and replaces the text, as it can search the text (in optimal O(m) time) and reproduce any text substring without accessing it. The main problem of the FM-index is that its space usage depends exponentially on σ, that is, 5H k n + σ σ o(n) for any k, H k being the k-th order entropy of T.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Burrows, M., Wheeler, D.J.: A block-sorting lossless data compression algorithm. DEC SRC Research Report 124 (1994)
Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: Proc. FOCS 2000, pp. 390–398 (2000)
Grabowski, S., Mäkinen, V., Navarro, G.: First Huffman, then Burrows-Wheeler: A Simple Alphabet-Independent FM-Index. Technical Report TR/DCC-2004-4, Dept. of Computer Science, University of Chile, ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/huffbwt.ps.gz
Munro, I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37–42. Springer, Heidelberg (1996)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grabowski, S., Mäkinen, V., Navarro, G. (2004). First Huffman, Then Burrows-Wheeler: A Simple Alphabet-Independent FM-Index. In: Apostolico, A., Melucci, M. (eds) String Processing and Information Retrieval. SPIRE 2004. Lecture Notes in Computer Science, vol 3246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30213-1_29
Download citation
DOI: https://doi.org/10.1007/978-3-540-30213-1_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23210-0
Online ISBN: 978-3-540-30213-1
eBook Packages: Springer Book Archive