[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1862344.1862353acmotherconferencesArticle/Chapter ViewAbstractPublication PagessisapConference Proceedingsconference-collections
research-article

Indexing inexact proximity search with distance regression in pivot space

Published: 18 September 2010 Publication History

Abstract

We introduce an inexact indexing scheme where, at index building time, training queries drawn from the database are used to fit one linear regression model for each object to be indexed. The response variable is the distance from the object to the query. The predictor variables are the distances from the query to each of a set of pivot objects. At search time, the models can provide distance estimates or probabilities of inclusion in the correct result, either of which can be used to rank the objects for an inexact search where the true distances are calculated in the resulting order, up to a halting point. To reduce storage requirements, the coefficients can be discretized at the cost of some precision in the promise values. We evaluate our scheme on synthetic and real-world data and compare it to a permutation-based scheme that has been reported to outperform other methods in the same experimental setting. We find that, in several of our experiments, the regression-based distance estimates give better query performance than the permutation-based promise values, in some cases even when the pivot set for the regression-based scheme is reduced in order to make its memory size equal to that of the permutation-based index. Limitations of our scheme include high index building cost and vulnerability to deviation from the model assumptions.

References

[1]
}}V. Athitsos, J. Alon, S. Sclaroff, and G. Kollios. BoostMap: An embedding method for efficient nearest neighbor retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(1):89--104, 2008.
[2]
}}E. Chávez and G. Navarro. A probabilistic spell for the curse of dimensionality. In Proc. 3rd Workshop on Algorithm Engineering and Experimentation (ALENEX), 2001.
[3]
}}E. Chávez, K. Figueroa, and G. Navarro. Effective proximity retrieval by ordering permutations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(9):1647--1658, 2008.
[4]
}}E. Chávez and G. Navarro. Probabilistic proximity search: Fighting the curse of dimensionality in metric spaces. Information Processing Letters, 85(1):39--46, 2003.
[5]
}}E. Chávez, G. Navarro, R. A. Baeza-Yates, and J. L. Marroquín. Searching in metric spaces. ACM Computing Surveys, 33(3):273--321, 2001.
[6]
}}J. J. Faraway. Linear Models with R. Chapman & Hall/CRC Press, 2005.
[7]
}}D. Harman. Overview of the third text retrieval conference. In Proc. Third Text REtrieval Conf. (TREC-3), 1995.
[8]
}}M. L. Hetland. The basic principles of metric indexing. In Swarm Intelligence for Multi-objective Problems in Data Mining. Springer-Verlag, 2009.
[9]
}}G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM Transactions on Database Systems, 28(4):517--580, 2003.
[10]
}}A. Marzal and E. Vidal. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(9):926--932, 1993.
[11]
}}M. L. Micó, J. Oncina, and E. Vidal. A new version of the nearest-neighbour approximating and eliminating search algorithm (AESA) with linear preprocessing time and memory requirements. Pattern Recognition Letters, 15(1):9--17, 1994.
[12]
}}R. Paredes and G. Navarro. Optimal incremental sorting. In Proc. 8th Workshop on Algorithm Engineering and Experiments (ALENEX), 2006.
[13]
}}M. Patella and P. Ciaccia. Approximate similarity search: A multi-faceted problem. Journal of Discrete Algorithms, 7(1):36--48, 2009.
[14]
}}P. Phillips, H. Wechsler, J. Huang, and P. Rauss. The FERET database and evaluation procedure for face recognition algorithms. Image and Vision Computing Journal, 16(5):295--306, 1998.
[15]
}}R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2010. ISBN 3-900051-07-0.
[16]
}}T. Skopal and B. Bustos. On nonmetric similarity search problems in complex domains. ACM Computing Surveys, to appear.

Cited By

View all

Index Terms

  1. Indexing inexact proximity search with distance regression in pivot space

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Other conferences
      SISAP '10: Proceedings of the Third International Conference on SImilarity Search and APplications
      September 2010
      130 pages
      ISBN:9781450304207
      DOI:10.1145/1862344
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      • Bilkent University: Bilkent University
      • Mexican Computer Science Society

      In-Cooperation

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 18 September 2010

      Permissions

      Request permissions for this article.

      Check for updates

      Qualifiers

      • Research-article

      Conference

      SISAP '10
      Sponsor:
      • Bilkent University

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)2
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 03 Mar 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2020)Optimal Metric Search Is Equivalent to the Minimum Dominating Set ProblemSimilarity Search and Applications10.1007/978-3-030-60936-8_9(111-125)Online publication date: 30-Sep-2020
      • (2020)Metrics and Ambits and Sprawls, Oh MySimilarity Search and Applications10.1007/978-3-030-60936-8_10(126-139)Online publication date: 30-Sep-2020
      • (2016)PPP-Codes for Large-Scale Similarity SearchingSpecial Issue on Database- and Expert-Systems Applications on Transactions on Large-Scale Data- and Knowledge-Centered Systems XXIV - Volume 951010.1007/978-3-662-49214-7_2(61-87)Online publication date: 1-Jan-2016
      • (2014)A General Framework and Algorithms for Score Level Indexing and Fusion in Biometric IdentificationIEICE Transactions on Information and Systems10.1587/transinf.E97.D.510E97.D:3(510-523)Online publication date: 2014
      • (2013)Learning to prune in metric and non-metric spacesProceedings of the 27th International Conference on Neural Information Processing Systems - Volume 110.5555/2999611.2999787(1574-1582)Online publication date: 5-Dec-2013
      • (2013)Probabilistic enhancement of approximate indexing in metric spacesInformation Systems10.1016/j.is.2012.05.01238:7(1007-1018)Online publication date: Oct-2013
      • (2011)Versatile probability-based indexing for approximate similarity searchProceedings of the Fourth International Conference on SImilarity Search and APplications10.1145/1995412.1995423(51-58)Online publication date: 30-Jun-2011
      • (2011)Multivariate regression for pivot selection: A preliminary study2011 3rd Symposium on Web Society10.1109/SWS.2011.6101281(115-121)Online publication date: Oct-2011
      • (2011)Fast and accurate biometric identification using score level indexing and fusionProceedings of the 2011 International Joint Conference on Biometrics10.1109/IJCB.2011.6117591(1-8)Online publication date: 11-Oct-2011

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media