[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

Asymptotic Log-Loss of Prequential Maximum Likelihood Codes

  • Conference paper
Learning Theory (COLT 2005)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3559))

Included in the following conference series:

Abstract

We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models \({\mathcal M}\). If data are i.i.d. according to an (essentially) arbitraryP, then the redundancy grows at rate \({\frac{1}{2}} {\rm c} {\rm ln} n\). We show that c = σ \(_{\rm 1}^{\rm 2}\)/ σ \(_{\rm 2}^{\rm 2}\), where σ \(_{\rm 1}^{\rm 2}\) is the variance of P, and σ \(_{\rm 2}^{\rm 2}\) is the variance of the distribution \(M^{*} \in {\mathcal M}\) that is closest to P in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the 2-part MDL, Shtarkov and Bayes codes, for which c = 1. This behavior is undesirable in an MDL model selection setting.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Azoury, K., Warmuth, M.: Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine Learning 43(3) (2001)

    Google Scholar 

  2. Barron, A., Rissanen, J., Yu, B.: The minimum description length principle in coding and modeling. IEEE Trans. Inf. Theory 44(6), 2743–2760 (1998)

    Article  MATH  MathSciNet  Google Scholar 

  3. Cesa-Bianchi, N., Lugosi, G.: Worst-case bounds for the logarithmic loss of predictors. Journal of Machine Learning 43(3), 247–264 (2001)

    Article  MATH  Google Scholar 

  4. Clarke, B.S., Barron, A.R.: Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inf. Theory IT-36(3), 453–471 (1990)

    Article  MATH  MathSciNet  Google Scholar 

  5. Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Chichester (1991)

    Book  MATH  Google Scholar 

  6. Dawid, A.P.: Present position and potential developments: Some personal views, statistical theory, the prequential approach. Journal of the Royal Statistical Society, Series A 147(2), 278–292 (1984)

    Article  MATH  MathSciNet  Google Scholar 

  7. de Rooij, S., Grünwald, P.: An empirical study of MDL model selection with infinite parametric complexity. Available at the CoRR arXiv (2005), at http://xxx.lanl.gov/abs/cs.LG/0501028abs.cs.LG/0501028

  8. Freund, Y.: Predicting a binary sequence almost as well as the optimal biased coin. In: Proc. Ninth Annual Conf. on Comp. Learning Theory, COLT 1996 (1996)

    Google Scholar 

  9. Grünwald, P.: MDL tutorial. In: Grünwald, P., Myung, J., Pitt, M. (eds.) Advances in Minimum Description Length, MIT Press, Cambridge (2005)

    Google Scholar 

  10. Grünwald, P., de Rooij, S.: Asymptotic log–loss of prequential maximum likelihood codes. Available at the CoRR arXiv (2005), at http://xxx.lanl.gov/

  11. Hemerly, E.M., Davis, M.H.A.: Strong consistency of the PLS criterion for order determination of autoregressive processes. Ann. Statist. 17(2), 941–946 (1989)

    Article  MATH  MathSciNet  Google Scholar 

  12. Kass, R., Vos, P.: Geometric Foundations of Asymptotic Inference. Wiley, Chichester (1997)

    Google Scholar 

  13. Li, L., Yu, B.: Iterated logarithmic expansions of the pathwise code lengths for exponential families. IEEE Trans. Inf. Theory 46(7), 2683–2689 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  14. Rissanen, J.: Universal coding, information, prediction and estimation. IEEE Trans. Inf. Theory 30, 629–636 (1984)

    Article  MATH  MathSciNet  Google Scholar 

  15. Rissanen, J.: A predictive least squares principle. IMA Journal of Mathematical Control and Information 3, 211–222 (1986)

    Article  MATH  Google Scholar 

  16. Rissanen, J.: Stochastic Complexity in Statistical Inquiry. World Scientific, Singapore (1989)

    MATH  Google Scholar 

  17. Wei, C.Z.: On predictive least squares principles. Ann. Statist 20(1), 1–42 (1990)

    Article  Google Scholar 

  18. Whittle, P.: Bounds for the moments of linear and quadratic forms in independent variables. Theory of Probability and its Applications (3) (1960)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grünwald, P., de Rooij, S. (2005). Asymptotic Log-Loss of Prequential Maximum Likelihood Codes. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science(), vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_44

Download citation

  • DOI: https://doi.org/10.1007/11503415_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26556-6

  • Online ISBN: 978-3-540-31892-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics