[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/2034117.2034142guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Minimum neighbor distance estimators of intrinsic dimension

Published: 05 September 2011 Publication History

Abstract

Most of the machine learning techniques suffer the "curse of dimensionality" effect when applied to high dimensional data. To face this limitation, a common preprocessing step consists in employing a dimensionality reduction technique. In literature, a great deal of research work has been devoted to the development of algorithms performing this task. Often, these techniques require as parameter the number of dimensions to be retained; to this aim, they need to estimate the "intrinsic dimensionality" of the given dataset, which refers to the minimum number of degrees of freedom needed to capture all the information carried by the data. Although many estimation techniques have been proposed, most of them fail in case of noisy data or when the intrinsic dimensionality is too high. In this paper we present a family of estimators based on the probability density function of the normalized nearest neighbor distance. We evaluate the proposed techniques on both synthetic and real datasets comparing their performances with those obtained by state of the art algorithms; the achieved results prove that the proposed methods are promising.

References

[1]
Camastra, F., Vinciarelli, A.: Estimating the intrinsic dimension of data with a fractal-based method. IEEE Trans. Pattern Anal. Mach. Intell. 24, 1404-1407 (2002).
[2]
Coleman, T.F., Li, Y.: An interior, trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization 6, 418-445 (1996).
[3]
Costa, J., Hero, A.O.: Learning intrinsic dimension and entropy of shapes. In: Krim, H., Yezzi, T. (eds.) Statistics and analysis of shapes. Birkhäuser, Basel (2005).
[4]
Costa, J.A., Hero, A.O.: Geodesic entropic graphs for dimension and entropy estimation in manifold learning. IEEE Transaction on Signal Processing 52(8), 2210- 2221 (2004).
[5]
Costa, J.A., Hero, A.O.: Learning intrinsic dimension and entropy of highdimensional shape spaces. In: EUSIPCO (2004).
[6]
Eckmann, J.-P., Ruelle, D.: Fundamental limitations for estimating dimensions and lyapunov exponents in dynamical systems. Physica D: Nonlinear Phenomena 56(2- 3), 185-187 (1992).
[7]
Edward, O.: Chaos in Dynamical Systems. Cambridge University Press, Cambridge (1993).
[8]
Farahmand, A.M., Szepesvari, C., Audibert, J.Y.: Manifold-adaptive dimension estimation. In: Proc. of ICML (2007).
[9]
Fishman, G.S.: Monte Carlo: Concepts, Algorithms, and Applications. Springer Series in Operations Research. Springer, New York (1996).
[10]
Fukunaga, K.: An algorithm for finding intrinsic dimensionality of data. IEEE Trans. on Computers 20, 176-183 (1971).
[11]
Fukunaga, K.: Intrinsic Dimensionality Extraction. In: Krishnaiah, P.R., Kanal, L.N. (eds.) Classification, Pattern Recognition and Reduction of Dimensionality. North-Holland, Amsterdam (1982).
[12]
Grassberger, P., Procaccia, I.: Measuring the strangeness of strange attractors. Physica D: Nonlinear Phenomena 9, 189-208 (1983).
[13]
Hein, M.: Intrinsic dimensionality estimation of submanifolds in euclidean space. In: ICML, pp. 289-296 (2005).
[14]
Jollife, I.T.: Principal Component Analysis. Springer Series in Statistics. Springer, New York (1986).
[15]
Kégl, B.: Intrinsic dimension estimation using packing numbers. In: Becker, S., Thrun, S., Obermayer, K. (eds.) NIPS, pp. 681-688. MIT Press, Cambridge (2002).
[16]
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE 86(11), 2278-2324 (1998).
[17]
Levina, E., Bickel, P.J.: Maximum likelihood estimation of intrinsic dimension. Ann. Arbor. MI 1, 777-784 (2005).
[18]
MacKay, D.J.C., Ghahramani, Z.: Comments on 'maximum likelihood estimation of intrinsic dimension' by E. Levina and P. Bickel (2005), http://www.inference.phy.cam.ac.uk/mackay/dimension/
[19]
Mordohai, P., Medioni, G.: Dimensionality estimation, manifold learning and function approximation using tensor voting. J. Mach. Learn. Res. 11, 411-450 (2010).
[20]
Pettis, K., Bailey, T., Jain, A., Dubes, R.: An intrinsic dimensionality estimator from near-neighbor information (1979).
[21]
Pineda, F.J., Sommerer, J.C.: Estimating generalized dimensions and choosing time delays: A fast algorithm. In: Forecasting the Future and Understanding the Past. Time Series Prediction, pp. 367-385 (1994).
[22]
Roweis, S.T., Saul, L.K.: Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 290, 2323-2326 (2000).
[23]
Tenenbaum, J.B., Silva, V., Langford, J.C.: A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290, 2319-2323 (2000).
[24]
Verveer, P.J., Duin, R.P.W.: An evaluation of intrinsic dimensionality estimators. IEEE Transactions on Pattern Analysis and Machine Intelligence 17, 81-86 (1995).
[25]
Wang, Q., Kulkarni, S.R., Verd, S.: A nearest-neighbor approach to estimating divergence between continuous random vector. In: IEEE Int. Symp. Information Theory (ISIT 2006), pp. 242-246 (2006).

Cited By

View all
  • (2012)A Novel Intrinsic Dimensionality Estimator Based on Rank-Order StatisticsRevised Selected Papers of the First International Workshop on Clustering High--Dimensional Data - Volume 762710.1007/978-3-662-48577-4_7(102-117)Online publication date: 15-May-2012
  • (2012)Data Dimensionality EstimationRevised Selected Papers of the First International Workshop on Clustering High--Dimensional Data - Volume 762710.1007/978-3-662-48577-4_6(87-101)Online publication date: 15-May-2012

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ECML PKDD'11: Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part II
September 2011
676 pages
ISBN:9783642237829

Sponsors

  • Pascal2 Network: Pascal2 Network
  • Xerox
  • Google Inc.
  • COST/MOVE: COST/MOVE
  • Yahoo! Labs

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 05 September 2011

Author Tags

  1. dimensionality reduction
  2. intrinsic dimensionality estimation
  3. manifold learning

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2012)A Novel Intrinsic Dimensionality Estimator Based on Rank-Order StatisticsRevised Selected Papers of the First International Workshop on Clustering High--Dimensional Data - Volume 762710.1007/978-3-662-48577-4_7(102-117)Online publication date: 15-May-2012
  • (2012)Data Dimensionality EstimationRevised Selected Papers of the First International Workshop on Clustering High--Dimensional Data - Volume 762710.1007/978-3-662-48577-4_6(87-101)Online publication date: 15-May-2012

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media