Abstract
Extended BNF grammars (EBNF) allow regular expressions in the right parts of their rules. They are widely used to define languages, and can be represented by recursive Transition Networks (TN) consisting of a set of finite-state machines. We present a novel direct construction of efficient shift-reduce ELR(1) parsers for TNs. We show that such a parser works deterministically if the TN is free from the classical shift-reduce and reduce–reduce conflicts of the LR(1) parsers, and from a new conflict type called convergence conflict. Such a novel condition for determinism is proved correct and is more general than those proposed in the past for EBNF grammars or TNs. Such ELR(1) parsers perform fewer shift moves than the equivalent LR(1) parsers. A simple optimization of the reduction moves is described.
Similar content being viewed by others
Notes
We reserve the term “automata” to the push-down machines of the parsers.
This function is analogous to the “GOTO” component of the canonical parsing table described in the classical textbooks on compilers such as [1].
It would be interesting to supplement the above analysis with experimental measurements for realistic grammars. A few results are mentioned in [3], which point to a greater compactness of the ELR(1) graphs. Just an example: for the Oracle EBNF grammar of the Java language, the TN has 532 machine states and 599 machine transitions. The ELR(1) graph has 1937 p-states and 25,231 transitions, while the standard LR(1) graph has 2946 p-states and 25,837 transitions.
It is well known that the terminal and nonterminal symbols, typically represented in the stack when the LR(1) techniques are taught, are unnecessary.
References
Aho, A., Lam, M., Sethi, R., Ullman, J.: Compilers: Principles, Techniques and Tools. Prentice-Hall, Englewoof Cliffs (2006)
Beatty, J.: On the relationship between the LL(1) and LR(1) grammars. JACM 29(4), 1007–1022 (1982)
Borsotti, A., Breveglieri, L., Crespi Reghizzi, S., Morzenti, A.: Complexity of extended vs classic LR parsers. In: Descriptional Complexity of Formal Systems (DCFS), LNCS, vol. 8614, pp. 77–89. Springer (2014)
Breveglieri, L., Crespi Reghizzi, S., Morzenti, A.: Shift-reduce parsers for transition networks. In: Language and Automata Theory and Applications (LATA), LNCS, vol. 8370, pp. 222–235. Springer (2014)
Celentano, A.: LR parsing technique for extended context-free grammars. Comput. Lang. 6(2), 95–107 (1981)
Chapman, N.: LALR(1,1) parser generation for regular right part grammars. Acta Inform. 21, 29–45 (1984). https://doi.org/10.1007/BF00289138
Conway, M.: Design of a separable transition-diagram compiler. Commun. ACM 6(7), 396–408 (1963)
Crespi Reghizzi, S., Breveglieri, L., Morzenti, A.: Formal Languages and Compilation, 2nd edn. Springer, London (2013)
Engelfriet, J.: Iterating iterated substitution. Theor. Comput. Sci. 5(1), 85–100 (1977). https://doi.org/10.1016/0304-3975(77)90043-3
Gálvez, J.: A note on a proposed LALR parser for extended context-free grammars. Inf. Process. Lett. 50(6), 303–305 (1994). https://doi.org/10.1016/0020-0190(94)00051-4
Grune, D., Jacobs, C.: Parsing Techniques: A Practical Guide, 2nd edn. Springer, London (2009)
Gruska, J.: On a classification of context-free languages. Kybernetika 3(1), 22–29 (1967). http://www.kybernetika.cz/content/1967/1/22
Heilbrunner, S.: On the definition of ELR(k) and ELL(k) grammars. Acta Inform. 11, 169–176 (1979)
Hemerik, K.: Towards a taxonomy for ECFG and RRPG parsing. In: Language and Automata Theory and Applications (LATA), LNCS, vol. 5457, pp. 410–421. Springer (2009). https://doi.org/10.1007/978-3-642-00982-2
Jensen, K., Wirth, N.: Pascal User Manual and Report. LNCS, vol. 18, 2nd edn. Springer, Berlin (1975). https://doi.org/10.1007/3-540-06950-X
Kannapinn, S.: Reconstructing LR theory to eliminate redundance, with an application to the construction of ELR parsers. Ph.D. thesis, Technical University of Berlin (2001). (in German)
Knuth, D.: On the translation of languages from left to right. Inform. Control 8, 607–639 (1965)
Knuth, D.: Top-down syntax analysis. Acta Inform. 1, 79–110 (1971)
LaLonde, W.: Constructing LR parsers for regular right part grammars. Acta Inform. 11, 177–193 (1979). https://doi.org/10.1007/BF00264024
Lee, G., Kim, D.: Characterization of extended LR(k) grammars. Inf. Process. Lett. 64(2), 75–82 (1997). https://doi.org/10.1016/S0020-0190(97)00152-X
Lewi, J., De Vlaminck, K., Huens, J., Huybrechts, M.: A Programming Methodology in Compiler Construction: Part I and II. North-Holland, Amsterdam (1979)
Morimoto, S., Sassa, M.: Yet another generation of LALR parsers for regular right part grammars. Acta Inform. 37, 671–697 (2001)
Rosenkrantz, D., Stearns, R.: Properties of deterministic top-down grammars. Inform. Control 17(3), 226–256 (1970)
Sassa, M., Nakata, I.: A simple realization of LR-parsers for regular right part grammars. Inf. Process. Lett. 24(2), 113–120 (1987). https://doi.org/10.1016/0020-0190(87)90104-9
Wirth, N.: Algorithms \(+\) Data Structures \(=\) Programs. Prentice-Hall, Englewood Cliffs (1975)
Author information
Authors and Affiliations
Corresponding author
Additional information
A preliminary account of this research is in [4].
Rights and permissions
About this article
Cite this article
Borsotti, A., Breveglieri, L., Crespi Reghizzi, S. et al. Fast deterministic parsers for transition networks. Acta Informatica 55, 547–574 (2018). https://doi.org/10.1007/s00236-017-0308-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00236-017-0308-3