Abstract
A factorization f1,…,fm of a string w is called a repetition factorization of w if each factor fi is a repetition, namely, \(f_{i} = x^{k}x^{\prime }\) for some non-empty string x, an integer k ≥ 2, and \(x^{\prime }\) being a proper prefix of x. Dumitran et al. (Proc. SPIRE 2015) proposed an algorithm which computes an arbitrary repetition factorization of a given string w in O(n) time, where n is the length of w. The number of factors (i.e. repetitions) contained in the output of their algorithm is not known or guaranteed. In this paper, we propose two algorithms for computing smallest/largest repetition factorizations in \(O(n \log n)\) time, which respectively consist of the smallest/largest number of factors. The first algorithm is a simple \(O(n \log n)\)-space algorithm based on a reduction of the problem to the shortest/longest path problem on the DAG of size \(O(n \log n)\). The second one simulates the dynamic programming algorithm for shortest/longest path problem within O(n) space based on the idea of the first algorithm. Moreover, we discuss combinatorial structures of smallest/largest repetition factorizations of Fibonacci strings.
Similar content being viewed by others
Notes
A preliminary version of this paper appeared as [16].
References
Badkobeh, G., Bannai, H., Goto, K., I, T., Iliopoulos, C.S., Inenaga, S., Puglisi, S.J., Sugimoto, S.: Closed factorization. Discret. Appl. Math. 212, 23–29 (2016)
Bannai, H., I, T., Inenaga, S., Nakashima, Y., Takeda, M., Tsuruta, K.: The “runs” theorem. SIAM J. Comput. 46(5), 1501–1514 (2017)
Bannai, H., Gagie, T., Inenaga, S., Kärkkäinen, J., Kempa, D., Piatkowski, M., Sugimoto, S.: Diverse palindromic factorization is NP-complete. Int. J. Found. Comput. Sci. 29(2), 143–164 (2018)
Borozdin, K., Kosolobov, D., Rubinchik, M., Shur, A.M.: Palindromic Length in Linear Time. In: CPM 2017, pp. 23:1–23:12 (2017)
Chen, K.T., Fox, R.H., Lyndon, R.C.: Free differential calculus. iv. The quotient groups of the lower central series. Ann. Math. 68(1), 81–95 (1958)
Crochemore, M., Ilie, L.: Computing longest previous factor in linear time and applications. Inf. Process. Lett. 106(2), 75–80 (2008). https://doi.org/10.1016/j.ipl.2007.10.006
Crochemore, M., Rytter, W.: Sqares, cubes, and time-space efficient string searching. Algorithmica 13(5), 405–425 (1995)
Crochemore, M., Ilie, L., Smyth, W.F.: A simple algorithm for computing the Lempel Ziv factorization. In: Proceedings of DCC 2008, pp. 482–488 (2008)
Cummings, L.J., Moore, D., Karhumäki, J.: Borders of Fibonacci strings. J. Comb. Math. Comb. Comput. 20, 81–88 (1996)
Dumitran, M., Manea, F., Nowotka, D.: On prefix/suffix-square free words. In: Proceedings of SPIRE, pp. 54–66 (2015)
Duval, J.: Factorizing words over an ordered alphabet. J. Algorithms 4(4), 363–381 (1983)
Ellert, J., Fischer, J.: Linear time runs over general ordered alphabets. In: Bansal, N., Merelli, E., Worrell, J. (eds.) 48th International Colloquium on Automata, Languages, and Programming, ICALP 2021, July 12–16, 2021, Glasgow, Scotland (Virtual Conference), LIPIcs, vol. 198, pp. 63:1–63:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)
Fici, G., Gagie, T., Kärkkäinen, J., Kempa, D.: A subquadratic algorithm for minimum palindromic factorization. J. Discrete Algorithms 28, 41–48 (2014)
Fine, N.J., Wilf, H.S.: Uniqueness theorems for periodic functions. Proc. Am. Math. Soc. 16(1), 109–114 (1965)
Fraenkel, A.S., Simpson, J.: The exact number of squares in fibonacci words. Theor. Comput. Sci. 218(1), 95–106 (1999)
Inoue, H., Matsuoka, Y., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Computing smallest and largest repetition factorizations in O(n log n) time. In: Holub, J., Zdárek, J. (eds.) Proceedings of the Prague Stringology Conference 2016, Prague, Czech Republic, August 29–31, 2016, pp. 135–145 (2016)
Kolpakov, R.M., Kucherov, G.: Finding maximal repetitions in a word in linear time. In: Proceedings of FOCS 1999, pp. 596–604 (1999)
Kolpakov, R., Podolskiy, M., Posypkin, M., Khrapov, N.: Searching of gapped repeats and subrepetitions in a word. J. Discrete Algorithms 46–47, 1–15 (2017)
Kufleitner, M.: On bijective variants of the Burrows-Wheeler transform. In: Proceedings of PSC 2009, pp. 65–79 (2009)
Lothaire, M.: Combinatorics on Words. Addison-Wesley, Reading (1983)
Matsuoka, Y., Inenaga, S., Bannai, H., Takeda, M., Manea, F.: Factorizing a string into squares in linear time. In: Proceedings of CPM 2016, pp. 27:1–27:12 (2016)
Nakashima, Y., Takagi, T., Inenaga, S., Bannai, H., Takeda, M.: Constructing LZ78 tries and position heaps in linear time for large alphabets. Inf. Process. Lett. 115(9), 655–659 (2015)
Storer, J., Szymanski, T.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)
Tanimura, Y., Fujishige, Y., I, T., Inenaga, S., Bannai, H., Takeda, M.: A faster algorithm for computing maximal α-gapped repeats in a string. In: Proceedings of SPIRE 2015, pp. 124–136 (2015)
I, T., Sugimoto, S., Inenaga, S., Bannai, H., Takeda, M.: Computing palindromic factorizations and palindromic covers on-line. In: Proceedings of CPM 2014, pp. 150–161 (2014)
Welch, T.A.: A technique for high performance data compression. IEEE Comput. 17, 8–19 (1984)
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory IT-23(3), 337–349 (1977)
Ziv, J., Lempel, A.: Compression of individual sequences via variable-length coding. IEEE Trans. Inf. Theory 24(5), 530–536 (1978)
Acknowledgments
This work was supported by JSPS KAKENHI Grant Numbers 25240003, 26280003, JP16H02783, JP17H01697, JP18K18002, JP18H04098, and by JST PRESTO Grant Number JPMJPR1922.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Inoue, H., Matsuoka, Y., Nakashima, Y. et al. Factorizing Strings into Repetitions. Theory Comput Syst 66, 484–501 (2022). https://doi.org/10.1007/s00224-022-10070-3
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00224-022-10070-3