Efficient Approximate Indexing in High-Dimensional Feature Spaces

Simone Santini

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8199)

Included in the following conference series:

International Conference on Similarity Search and Applications

Abstract

In this paper we present a fast approximate indexing method for high-dimensional feature spaces that takes the error probability as an independent variable.

The idea of the algorithm is to define a low-dimensional feature space in which a significant portion of the inter-distance variance is concentrated, to search for the nearest neighborhood of the query in this space, and then to extend the search by a factor ζ to include a number of objects “near” this nearest neighborhood. We shall show that, under reasonable hypotheses on the distribution of items in the feature space, it is possible to derive a relation between the value ζ and the error probability.

We study the error probability and the complexity of the algorithm, validate the model using a data set of images, and show how the results can be used to design indexing schemes.
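
As a rough illustration of the search procedure outlined in the abstract, the Python sketch below projects the data onto a few high-variance directions, finds the nearest distance in that reduced space, widens the search by a factor ζ, and re-ranks the resulting candidates in the full space. It is not the paper's implementation. The use of PCA (via SVD) to build the low-dimensional space, the rule that the search radius is multiplied by (1 + ζ), the linear scan in the reduced space, and the names `fit_projection` and `approximate_nn` are all assumptions made for the example.

```python
import numpy as np


def fit_projection(data, m):
    """Return the data mean and an orthonormal basis of the top-m
    principal directions (assumed stand-in for the low-dimensional space)."""
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:m].T


def approximate_nn(query, data, reduced, mean, basis, zeta):
    """Approximate nearest neighbor of `query` among the rows of `data`.

    `reduced` holds the rows of `data` projected with (mean, basis).
    `zeta` >= 0 widens the search around the reduced-space nearest neighbor.
    """
    q_low = (query - mean) @ basis
    d_low = np.linalg.norm(reduced - q_low, axis=1)
    r = d_low.min()  # nearest distance in the reduced space
    # Assumed extension rule: keep every item within (1 + zeta) * r.
    candidates = np.flatnonzero(d_low <= (1.0 + zeta) * r)
    # Re-rank the candidates with the full-dimensional distance.
    d_full = np.linalg.norm(data[candidates] - query, axis=1)
    return int(candidates[np.argmin(d_full)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data whose variance is concentrated in the first coordinates.
    data = rng.normal(size=(1000, 32)) * np.linspace(4.0, 0.2, 32)
    mean, basis = fit_projection(data, m=4)
    reduced = (data - mean) @ basis
    query = rng.normal(size=32) * np.linspace(4.0, 0.2, 32)
    exact = int(np.argmin(np.linalg.norm(data - query, axis=1)))
    for zeta in (0.0, 0.5, 2.0, 8.0):
        found = approximate_nn(query, data, reduced, mean, basis, zeta)
        print(f"zeta={zeta}: index {found}, matches exact: {found == exact}")
```

Because an orthogonal projection never increases distances, a larger ζ can only add candidates, so the answer approaches the exact nearest neighbor at the cost of more full-dimensional distance computations. Counting mismatches over many queries would give an empirical estimate of the error probability for a given ζ under these assumptions.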

Author information

Authors and Affiliations

Escuela Politécnica Superior, Universidad Autónoma de Madrid, Spain
Simone Santini

Editor information

Editors and Affiliations

Database Laboratory, Universidade da Coruña, Spain
Nieves Brisaboa & Oscar Pedreira
Faculty of Informatics, Masaryk University, Brno, Czech Republic
Pavel Zezula

About this paper

Cite this paper

Santini, S. (2013). Efficient Approximate Indexing in High-Dimensional Feature Spaces. In: Brisaboa, N., Pedreira, O., Zezula, P. (eds) Similarity Search and Applications. SISAP 2013. Lecture Notes in Computer Science, vol 8199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41062-8_20

DOI: https://doi.org/10.1007/978-3-642-41062-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41061-1
Online ISBN: 978-3-642-41062-8
eBook Packages: Computer Science, Computer Science (R0)
