[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3528114.3528123acmotherconferencesArticle/Chapter ViewAbstractPublication PagesdsdeConference Proceedingsconference-collections
research-article

Learning to Prune: General and Efficient Approximate Nearest Neighbor Search with Direction Navigating Graph

Published: 24 June 2022 Publication History

Abstract

The approximate k-Nearest Neighbor (kNN) graph has been proved extremely useful in approximate nearest neighbor search (ANNS). However, under the high-dimensional and large-scale circumstance, the performance of nearest neighbor search is challenging due to its high costs of distance calculation. To solve the problem, most approaches focus on pruning the edges during the phase of graph construction. In our observation, searching a given query in an even well pruned approximate kNN graph, a large proportion of edges can be cut-off during the search procedure. In this paper we propose the concept of Direction Navigating Graph (DNG) to offer an online pruning strategy combined with a heuristic search algorithm. To build the DNG we develop a neural network that project queries and vertices into a low-dimensional direction vector space. With the DNG, we can apply online edge pruning with almost all those state- of-the-art graph-based approaches. In a number of experiments on public datasets we demonstrate that our approach outperforms those original methods by a significant margin. In addition, the approach has been integrated into Alibaba's (Alibaba Group) billion-scale search applications for its superior performance.

References

[1]
AndoniandP.Indyk.2006.Near-optimalhashingalgorithmsfornearneighbor problem in high dimension. FOCS 1 (2006), 459–468. https://doi.org/10.1109/ FOCS.2006.49
[2]
P. Kumar Arora, S. Sinha and A. Bhattacharya. 2018. Hd-index: Pushing the scalability-accuracy boundary for approximate knn search in highdimensional spaces. Proceedings of the VLDB Endowment (2018). https://doi.org/10.14778/ 3204028.3204034
[3]
EB Baum and F Wilczek. 1988. Supervised learning of probability distributions by neural networks. Proceedings of the 1987 International Conference on Neural Information Processing SystemsJanuary (1988), 52–61.
[4]
M. Belkin and P. Niyogi. 2001. Laplacian eigenmaps and spectral techniques for embedding and clustering. (2001).
[5]
J. L. Bentley. 1975. Multidimensional binary search trees used for associative searching. Commun. ACM (1975). https://doi.org/10.1145/361002.361007
[6]
N. Gonzales D. Dearholt and G. Kurup. 1988. Monotonic search networks for computer vision databases. (1988). https://doi.org/10.1109/ACSSC.1988.754602
[7]
S. Dasgupta and Y. Freund. 2008. Random projection trees and low dimensional manifolds. Proceedings of the fortieth annual ACM symposium on Theory of computing (2008), 537–546. https://doi.org/10.1145/1374376.1374452
[8]
S. Dasgupta and K. Sinha. 2013. Randomized partition trees for exact nearest neighbor search. Proceedings of the 26th Annual Conference on Learning Theory (2013), 317–337.
[9]
Michael I. Jordan Eric P. Xing, Andrew Y.Ng and Stuart J. Russell. 2002. Distance metric, learning with application to clustering with side-information. Proceedings of the Conference on Advances in Neural Information Processing Systems (2002).
[10]
Andoni 2015. Practical and optimal LSH for angular distance. Proceedings of the 28th International Conference on Neural Information Processing Systems 1 (2015), 1225–1233.
[11]
Burges 2005. Learning to rank using gradient descent. Proceedings of the 22nd international conference on Machine learning (2005), 89–96. https: //doi.org/10.1145/1102351.1102363
[12]
Cong Fu .2019.Fast approximate nearest neighbor search with the navigating spreading-out graph. (2019). https://doi.org/10.14778/3303753.3303754
[13]
Datar 2004. Locality-sensitive hashing scheme based on p-stable distributions. Proceedings of the twentieth annual symposium on Computational geometry, ACM (2004).
[14]
Dmitry Baranchuk 2019. Learning to Route in Similarity Graphs. (2019). https://arxiv.org/abs/1905.10987
[15]
Fabien 2015. Cache locality is not enough: high-performance nearest neighbor search with product quantization fast scan. Proc. International Conference on Very Large DataBases (2015), 288–299. https://doi.org/10.14778/2856318.2856324
[16]
J.B. Tenenbaum 2000. A global geometric framework for nonlinear dimensionality reduction. Science290(2000),2319–2323. https://doi.org/10.1126/science.290.5500.2319
[17]
Jie Zhou 2018. Graph neural networks: A review of methods and applications. (2018). https://arxiv.org/abs/1812.08434
[18]
L. He 2018. Neurally-Guided Semantic Navigation in Knowledge Graph. IEEE Trans (2018),1–1. https://doi.org/10.1109/TBDATA.2018.2805363
[19]
McCartin-Lim 2012. Approximate principal direction trees. 29th International Conference on Machine Learning (2012).
[20]
Prosenjit Bose 2012. Proximity graphs. International Journal of Computational Geometry and Applications (2012), 439–469. https://doi.org/10.1142/S0218195912500112
[21]
Jonathan Ho and Stefano Ermon.2016. Generative adversarial imitation learning. Neural Information Processing Systems (2016), 4565–4573.
[22]
P. Indyk and Motwani. 1998. Approximate nearest neighbors: Towards removing the curse of dimensionality. Proceedings of the thirtieth annual ACM symposium on Theory of computing (1998),604–613. https://doi.org/10.1145/276698.276876
[23]
J. Song J. Wang, H.T. Shen and J.Ji.2014. Hashing for similarity search: A survey. (2014). https://arxiv.org/abs/1408.2927
[24]
V. de Silva J.B. Tenenbaum and J.C. Langford. 2000. A global geometric framework for nonlinear dimensionality reduction. Science (2000),2319–2323. https://doi.org/10.1126/science.290.5500.2319
[25]
Richard Socher Jeffrey Pennington and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation. In Empirical Methods in Natural Language Processing (EMNLP)(2014),1532–1543. https://doi.org/10.3115/v1/D14- 1162
[26]
Diederik P. Kingma and Jimmy Lei Ba. 2015. Adam: A method for stochastic optimization. (2015). https://arxiv.org/abs/1412.6980
[27]
J. M. Kleinberg. 2000. Navigation in a small world. Nature 406 (2000), 845–845.
[28]
Eric O. Postma Laurens J.P. van der Maaten and H. Jaap van den Herik. 2009. Dimensionality reduction: a comparative review. jmlr (2009), 2579–2605.
[29]
David G. Lowe. 2004. Distinctive Image Features from Scale-Invariant Keypoints. InternationalJournalofComputerVision60(2004),91–110. https://www.cs.ubc.ca/∼lowe/papers/ijcv04.pdf
[30]
D. Krioukov M. Boguna and K. C. Claffy. 2009. Navigation in a small world. Nature Physics 5 (2009), 74–80.
[31]
Vinod Nair and Geoffrey E. Hinton. 2010. Rectified linear units improve restricted Boltzmann machines. Proceedings of the 27th International Conference on International Conference on Machine Learning (2010), 807–814.
[32]
Aude Oliva and Antonio Torralba. 2001. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision 42 (2001), 145–175.
[33]
J.W. Sammon. 1969. A nonlinear mapping for data structure analysis. (1969), 401–499. https://doi.org/10.1109/T-C.1969.222678
[34]
Stuart C. Shapiro. 1987. Encyclopedia of Artificial Intelligence. Wiley-Interscience.
[35]
R. F. Sproull. 1991. Refinements to nearest-neighbor searching in k-dimensional trees. Algorithmica 6 (1991), 579–589.
[36]
van der Maaten LJ and Hinton GE. 2008. Visualizing data using t-SNE. Mach. Learn. Res (2008), 2579–2605.
[37]
Y. Xiao. 2018. Fast Approximate Nearest Neighbor Search via k-Diverse Nearest Neighbor Graph. Thirty-Second AAAI Conference on Artificial Intelligence (2018).
[38]
D. A. Yashunin Yu. A. Malkov1. 2018. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. (2018). https: //doi.org/10.1109/TPAMI.2018.2889473

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences
DSDE '22: Proceedings of the 2022 5th International Conference on Data Storage and Data Engineering
February 2022
124 pages
ISBN:9781450395724
DOI:10.1145/3528114
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 June 2022

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Direction Navigating Graph
  2. Information Retrieval
  3. Machine Learning
  4. kNN Graph

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DSDE 2022

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 130
    Total Downloads
  • Downloads (Last 12 months)54
  • Downloads (Last 6 weeks)4
Reflects downloads up to 06 Jan 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media