[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

On the Disambiguation of Weighted Automata

  • Conference paper
  • First Online:
Implementation and Application of Automata (CIAA 2015)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9223))

Included in the following conference series:

Abstract

We present a disambiguation algorithm for weighted automata. The algorithm admits two main stages: a pre-disambiguation stage followed by a transition removal stage. We give a detailed description of the algorithm and the proof of its correctness. The algorithm is not applicable to all weighted automata but we prove sufficient conditions for its applicability in the case of the tropical semiring by introducing the weak twins property. In particular, the algorithm can be used with all acyclic weighted automata and more generally any determinizable weighted automata. While disambiguation can sometimes be achieved using determinization, our disambiguation algorithm in some cases can return a result that is exponentially smaller than any equivalent deterministic automaton. We also present some empirical evidence of the space benefits of disambiguation over determinization in speech recognition and machine translation applications.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 35.99
Price includes VAT (United Kingdom)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    The removal of ambiguous transitions requires the following key property which is guaranteed by our \(\mathsf R\)-pre-disambiguation algorithm: after removal of ambiguous transitions, the weight of a remaining path must be precisely the same as the weight assigned to the string labeling that path by the original automaton. Let us also emphasize that the procedure of [14] is not a special instance of our algorithm and in particular does not benefit from the crucial use of the relation \(\mathsf R^*\).

  2. 2.

    Our algorithms can be straightforwardly extended to the case of weakly left divisible left semirings [3].

  3. 3.

    This condition can in fact be relaxed: it suffices that there exists a co-reachable state \((q_i, s_i)\) with \(i < j\) since it can be shown that in that case, there exists necessarily such a state with a a-transition to \((q_0, s_0)\).

  4. 4.

    The lemma is stated as processing one list, but from the proof it is clear it applies to multiple lists.

  5. 5.

    In [3], the authors use instead the terminology of cycle-unambiguous weighted automata, which coincides with that of polynomially ambiguous weighted automata.

References

  1. Albert, J., Kari, J.: Digital image compression. In: Handbook of Weighted Automata. Springer, Heidelberg (2009)

    Google Scholar 

  2. Allauzen, C., Benson, E., Chelba, C., Riley, M., Schalkwyk, J.: Voice query refinement. In: Interspeech (2012)

    Google Scholar 

  3. Allauzen, C., Mohri, M.: Efficient algorithms for testing the twins property. J. Automata, Lang. Comb. 8(2), 117–144 (2003)

    MathSciNet  MATH  Google Scholar 

  4. Allauzen, C., Riley, M., Schalkwyk, J., Skut, W., Mohri, M.: OpenFst Library (2007). http://www.openfst.org

  5. Breuel, T.M.: The OCRopus open source OCR system. In: Proceedings of IS&T/SPIE 20th Annual Symposium (2008)

    Google Scholar 

  6. Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G.J.: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Camb. Univ. Press, Cambridge (1998)

    Book  MATH  Google Scholar 

  7. Eilenberg, S.: Automata, Languages and Machines. Academic Press, New York (1974)

    MATH  Google Scholar 

  8. Eppstein, D.: Finding the \(k\) shortest paths. SIAM J. Comp. 28(2), 652–673 (1998)

    Article  MathSciNet  MATH  Google Scholar 

  9. Iglesias, G., Allauzen, C., Byrne, W., de Gispert, A., Riley, M.: Hierarchical phrase-based translation representations. In: Proceedings of EMNLP, pp. 1373–1383 (2011)

    Google Scholar 

  10. Kaplan, R.M., Kay, M.: Regular models of phonological rule systems. Comput. Linguist. 20(3), 331–378 (1994)

    Google Scholar 

  11. Kirsten, D.: A Burnside approach to the termination of Mohri’s algorithm for polynomially ambiguous min-plus-automata. ITA 42(3), 553–581 (2008)

    MathSciNet  MATH  Google Scholar 

  12. Kirsten, D.: Decidability, undecidability, and pspace-completeness of the twins property in the tropical semiring. Theor. Comput. Sci. 420, 56–63 (2012)

    Article  MathSciNet  Google Scholar 

  13. Kirsten, D., Lombardy, S.: Deciding unambiguity and sequentiality of polynomially ambiguous min-plus automata. In: STACS, pp. 589–600 (2009)

    Google Scholar 

  14. Klimann, I., Lombardy, S., Mairesse, J., Prieur, C.: Deciding unambiguity and sequentiality from a finitely ambiguous max-plus automaton. Theor. Comput. Sci. 327(3), 349–373 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  15. Kuich, W., Salomaa, A.: Semirings, Automata, Languages. EATCS Monographs on Theoretical Computer Science, vol. 5. Springer, Germany (1986)

    Google Scholar 

  16. Mohri, M.: Finite-state transducers in language and speech processing. Comput. Linguist. 23(2), 269–311 (1997)

    MathSciNet  Google Scholar 

  17. Mohri, M.: On the disambiguation of finite automata and functional transducers. Int. J. Found. Comput. Sci. 24(6), 847–862 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  18. Mohri, M., Pereira, F.C.N., Riley, M.: Speech recognition with weighted finite-state transducers. In: Handbook on Speech Proc. and Speech Comm. Springer, Heidelberg (2008)

    Google Scholar 

  19. Mohri, M., Riley, M.: An efficient algorithm for the n-best-strings problem. In Interspeech (2002)

    Google Scholar 

  20. Mohri, M., Riley, M.D.: On the disambiguation of weighted automata. ArXiv 1405.0500, May 2014

    Google Scholar 

  21. Schalkwyk, J., Beeferman, D., Beaufays, F., Byrne, B., Chelba, C., Cohen, M., Kamvar, M., Strope, B.: Your word is my command: Google search by voice: A case study. In: Advances in Speech Recognition, pp. 61–90. Springer, Heidelberg (2010)

    Google Scholar 

  22. Schmidt, E.M.: Succinctness of description of context-free, regular and unambiguous languages. Ph.D. thesis, Dept. of Comp. Sci., University of Aarhus (1978)

    Google Scholar 

Download references

Acknowledgments

We thank Cyril Allauzen for discussions about the topic of this research. This work was partly funded by the NSF award IIS-1117591.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michael D. Riley .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mohri, M., Riley, M.D. (2015). On the Disambiguation of Weighted Automata. In: Drewes, F. (eds) Implementation and Application of Automata. CIAA 2015. Lecture Notes in Computer Science(), vol 9223. Springer, Cham. https://doi.org/10.1007/978-3-319-22360-5_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-22360-5_22

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22359-9

  • Online ISBN: 978-3-319-22360-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics