Abstract
This paper compares the difference in performance of the Prediction by Partial Matching family of compressors (PPM* and the original PPMk) and the Lempel-Ziv 78 (LZ) algorithm. We construct an infinite binary sequence whose worst-case compression ratio for PPM* is 0, while PPMk’s and LZ’s best-case compression ratios are at least 1/2 and 1 respectively. This sequence is normal and is an enumeration of all binary strings in order of length, i.e. all strings of length 1 followed by all strings of length 2 and so on. The sequence is built using repetitions of de Bruijn strings of increasing order.
Supported by the Government of Ireland Postgraduate Scholarship Programme managed by the Irish Research Council.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Martin proves that this occurs when \(|x| = 2^n + n -1\).
- 2.
We note that post submission, we developed a proof that for all k, \(\rho _{\mathrm {PPM}_{k}}(S) = 1\) for any normal S. This will appear in a later journal version.
References
Becher, V., Carton, O., Heiber, P.A.: Normality and automata. J. Comput. Syst. Sci. 81(8), 1592–1613 (2015). https://doi.org/10.1016/j.jcss.2015.04.007
Becher, V., Heiber, P.A.: Normal numbers and finite automata. Theor. Comput. Sci. 477, 109–116 (2013). https://doi.org/10.1016/j.tcs.2013.01.019
Bell, T.C., Cleary, J.G., Witten, I.H.: Text Compression. Prentice-Hall Inc., Upper Saddle River, NJ, USA (1990)
Bennett, C.H.: Logical depth and physical complexity. The Universal Turing Machine, A Half-Century Survey, pp. 227–257 (1988)
Émile Borel, M.: Les probabilités dénombrables et leurs applications arithmétiques. Rendiconti del Circolo Matematico di Palermo (1884–1940) 27(1), 247–271 (1909). https://doi.org/10.1007/BF03019651
Bruijn, N.G.D.: A combinatorial problem. Proc. Koninklijke Nederlandse Academie van Wetenschappen 49, 758–764 (1946)
Carton, O., Heiber, P.A.: Normality and two-way automata. Inf. Comput. 241, 264–276 (2015). https://doi.org/10.1016/j.ic.2015.02.001
Cleary, J.G., Teahan, W.J.: Unbounded length contexts for PPM. Comput. J. 40(2/3), 67–75 (1997). https://doi.org/10.1109/DCC.1995.515495
Cleary, J.G., Witten, I.H.: Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. 32(4), 396–402 (1984). https://doi.org/10.1109/TCOM.1984.1096090
Dai, J., Lathrop, J., Lutz, J., Mayordomo, E.: Finite-state dimension. Theoretical Comput. Sci. 310, 1–33 (2004). https://doi.org/10.1016/S0304-3975(03)00244-5
Doty, D., Moser, P.: Feasible depth. In: Cooper, S.B., Löwe, B., Sorbi, A. (eds.) CiE 2007. LNCS, vol. 4497, pp. 228–237. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73001-9_24
Jordon, L., Moser, P.: On the difference between finite-state and pushdown depth. In: Chatzigeorgiou, A., et al. (eds.) SOFSEM 2020. LNCS, vol. 12011, pp. 187–198. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-38919-2_16
Lathrop, J.I., Strauss, M.J.: A universal upper bound on the performance of the Lempel-Ziv algorithm on maliciously-constructed data. In: Proceedings of the Compression and Complexity of Sequences, vol. 1997, pp. 123–135 (1997). https://doi.org/10.1109/SEQUEN.1997.666909
Martin, M.H.: A problem in arrangements. Bull. Amer. Math. Soc. 40(12), 859–864 (1934). https://doi.org/10.1090/S0002-9904-1934-05988-3
Moffat, A.: Implementing the PPM data compression scheme. IEEE Trans. Commun. 38(11), 1917–1921 (1990). https://doi.org/10.1109/26.61469
Pierce, L.A., Shields, P.C.: Sequences incompressible by slz (lzw), yet fully compressible by ulz. In: Numbers, Information and Complexity, pp. 385–390. Springer (2000). https://doi.org/10.1007/978-1-4757-6048-4_32
Schnorr, C., Stimm, H.: Endliche Automaten und Zufallsfolgen. Acta Inf. 1, 345–359 (1972). https://doi.org/10.1007/BF00289514
Witten, I.H., Neal, R.M., Cleary, J.G.: Arithmetic coding for data compression. Commun. ACM 30(6), 520–540 (1987). https://doi.org/10.1145/214762.214771
Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978). https://doi.org/10.1109/TIT.1978.1055934
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jordon, L., Moser, P. (2021). A Normal Sequence Compressed by PPM* But Not by Lempel-Ziv 78. In: Bureš, T., et al. SOFSEM 2021: Theory and Practice of Computer Science. SOFSEM 2021. Lecture Notes in Computer Science(), vol 12607. Springer, Cham. https://doi.org/10.1007/978-3-030-67731-2_28
Download citation
DOI: https://doi.org/10.1007/978-3-030-67731-2_28
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67730-5
Online ISBN: 978-3-030-67731-2
eBook Packages: Computer ScienceComputer Science (R0)