The Burrows-Wheeler Transform is a text transformation scheme that has found applications in different aspects of the data explosion problem, from data compression to index structures and search. The BWT belongs to a new class of compression algorithms, distinguished by its ability to perform compression by sorted contexts. More recently, the BWT has also found various applications in addition to text data compression, such as in lossless and lossy image compression, tree-source identification, bioinformatics, machine translation, shape matching, and test data compression. This book will serve as a reference for seasoned professionals or researchers in the area, while providing a gentle introduction, making it accessible for senior undergraduate students or first year graduate students embarking upon research in compression, pattern matching, full text retrieval, compressed index structures, or other areas related to the BWT. Key Features Comprehensive resource for information related to different aspects of the Burrows-Wheeler Transform including: Gentle introduction to the BWT History of the development of the BWT Detailed theoretical analysis of algorithmic issues and performance limits Searching on BWT compressed data Hardware architectures for the BWT Explores non-traditional applications of the BWT in areas such as: Bioinformatics Joint source-channel coding Modern information retrieval Machine translation Test data compression for systems-on-chip Teaching materials ideal for classroom use on courses in: Data Compression and Source Coding Modern Information Retrieval Information Science Digital Libraries
Cited By
- Clare A, Daykin J, Mills T and Zarges C Evolutionary search techniques for the Lyndon factorization of biosequences Proceedings of the Genetic and Evolutionary Computation Conference Companion, (1543-1550)
- Mondal U (2019). Achieving lossless compression of audio by encoding its constituted components (LCAEC), Innovations in Systems and Software Engineering, 15:1, (75-85), Online publication date: 1-Mar-2019.
- Adjeroh D, Allaga M, Tan J, Lin J, Jiang Y, Abbasi A and Zhou X String-Based Models for Predicting RNA-Protein Interaction Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology,and Health Informatics, (661-666)
- Daykin J, Groult R, Guesnet Y, Lecroq T, Lefebvre A, Léonard M and Prieur-Gaston É (2016). Binary block order Rouen Transform, Theoretical Computer Science, 656:PB, (118-134), Online publication date: 20-Dec-2016.
- Beal R and Adjeroh D (2016). Compressed parameterized pattern matching, Theoretical Computer Science, 609:P1, (129-142), Online publication date: 4-Jan-2016.
- Martinez H, Tarraga J, Medina I, Barrachina S, Castillo M, Dopazo J and Quintana-Orti E (2015). Concurrent and Accurate Short Read Mapping on Multicore Processors, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 12:5, (995-1007), Online publication date: 1-Sep-2015.
- Crochemore M, Grossi R, Kärkkäinen J and Landau G (2015). Computing the Burrows-Wheeler transform in place and in small space, Journal of Discrete Algorithms, 32:C, (44-52), Online publication date: 1-May-2015.
- Kalajdzic K, Ali S and Patel A (2015). Rapid lossless compression of short text messages, Computer Standards & Interfaces, 37:C, (53-59), Online publication date: 1-Jan-2015.
- Beal R and Adjeroh D (2015). Efficient pattern matching for RNA secondary structures, Theoretical Computer Science, 592:C, (59-71), Online publication date: 9-Aug-2015.
- Chikhi R, Limasset A, Jackman S, Simpson J and Medvedev P On the Representation of de Bruijn Graphs Proceedings of the 18th Annual International Conference on Research in Computational Molecular Biology - Volume 8394, (35-55)
- Smyth W (2013). Computing regularities in strings, European Journal of Combinatorics, 34:1, (3-14), Online publication date: 1-Jan-2013.
- Tan J and Adjeroh D Predicting Protein Families using Protein Shape Context Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics, (733-734)
- Beal R, Adjeroh D and Abbasi A The Forward Stem Matrix Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics, (575-584)
- Martínez H, Tárraga J, Medina I, Barrachina S, Castillo M, Dopazo J and Quintana-Ortí E A dynamic pipeline for RNA sequencing on multicore processors Proceedings of the 20th European MPI Users' Group Meeting, (235-240)
- Alatabbi A, Crochemore M, Daykin J and Mouchard L Lyndon fountains and the Burrows-Wheeler transform Proceedings of the CUBE International Information Technology Conference, (441-446)
- Kärkkäinen J, Mikkola P and Kempa D Grammar precompression speeds up burrows---wheeler compression Proceedings of the 19th international conference on String Processing and Information Retrieval, (330-335)
- Cox A, Jakobi T, Rosone G and Schulz-Trieglaff O Comparing DNA sequence collections by direct comparison of compressed text indexes Proceedings of the 12th international conference on Algorithms in Bioinformatics, (214-224)
- Kimura K, Koike A and Nakai K Seed-set construction by equi-entropy partitioning for efficient and sensitive short-read mapping Proceedings of the 11th international conference on Algorithms in bioinformatics, (151-162)
- Bauer M, Cox A and Rosone G Lightweight BWT construction for very large string collections Proceedings of the 22nd annual conference on Combinatorial pattern matching, (219-231)
- Nong G, Zhang S and Chan W (2011). Computing the inverse sort transform in linear time, ACM Transactions on Algorithms, 7:2, (1-16), Online publication date: 1-Mar-2011.
- Beal R and Adjeroh D p-Suffix sorting as arithmetic coding Proceedings of the 22nd international conference on Combinatorial Algorithms, (44-56)
- Beal R and Adjeroh D Parameterized longest previous factor Proceedings of the 22nd international conference on Combinatorial Algorithms, (31-43)
- Kärkkäinen J and Puglisi S Medium-space algorithms for inverse BWT Proceedings of the 18th annual European conference on Algorithms: Part I, (451-462)
- Ferragina P Data structures Proceedings of the 18th annual European conference on Algorithms: Part II, (1-16)
- Yokoo H Extension and faster implementation of the GRP transform for lossless compression Proceedings of the 21st annual conference on Combinatorial pattern matching, (338-347)
- Abel J (2010). Post BWT stages of the Burrows–Wheeler compression algorithm, Software—Practice & Experience, 40:9, (751-777), Online publication date: 1-Aug-2010.
Index Terms
- The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching
Recommendations
Universal lossless source coding with the Burrows Wheeler transform
The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the ...
Waveform and image compression using the Burrows Wheeler transform and the wavelet transform
ICIP '97: Proceedings of the 1997 International Conference on Image Processing (ICIP '97) 3-Volume Set-Volume 1 - Volume 1Recently, the Burrows Wheeler transform (BWT) has been proposed for text compression. The BWT based compression algorithm performs close to the best algorithm known today on a set of standard text files, and it has less complexity. In this paper we ...