[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Nearest-Neighbor Searching Under Uncertainty I

Published: 01 October 2017 Publication History

Abstract

Nearest-neighbor queries, which ask for returning the nearest neighbor of a query point in a set of points, are important and widely studied in many fields because of a wide range of applications. In many of these applications, such as sensor databases, location based services, face recognition, and mobile data, the location of data is imprecise. We therefore study nearest-neighbor queries in a probabilistic framework in which the location of each input point and/or query point is specified as a probability density function and the goal is to return the point that minimizes the expected distance, which we refer to as the expected nearest neighbor ($$\mathop {\mathrm {ENN}}$$ENN). We present methods for computing an exact $$\mathop {\mathrm {ENN}}$$ENN or an $$\varepsilon $$ź-approximate $$\mathop {\mathrm {ENN}}$$ENN, for a given error parameter $$0<\varepsilon < 1$$0<ź<1, under different distance functions. These methods build a data structure of near-linear size and answer $$\mathop {\mathrm {ENN}}$$ENN queries in polylogarithmic or sublinear time, depending on the underlying function. As far as we know, these are the first nontrivial methods for answering exact or $$\varepsilon $$ź-approximate $$\mathop {\mathrm {ENN}}$$ENN queries with provable performance guarantees. Moreover, we extend our results to answer exact or $$\varepsilon $$ź-approximate k-$$\mathop {\mathrm {ENN}}$$ENN queries.

References

[1]
Afshani, P., Chan, T.M.: Optimal halfspace range reporting in three dimensions. In: Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'09), pp. 180---186. SIAM, Philadelphia (2009)
[2]
Agarwal, P.K., Aronov, B., Har-Peled, S., Phillips, J.M., Yi, K., Zhang, W.: Nearest neighbor searching under uncertainty II. In: Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS'13), pp. 115---126. ACM, New York (2013)
[3]
Agarwal, P.K., Cheng, S.-W., Yi, K.: Range searching on uncertain data. ACM Trans. Algorithms 8(4), Article No. 43 (2012).
[4]
Agarwal, P.K., Har-Peled, S., Sharir, M., Wang, Y.: Hausdorff distance under translation for points and balls. ACM Trans. Algorithms 6(4), Article No. 71 (2010).
[5]
Agarwal, P.K., Matoušek, J.: Ray shooting and parametric search. SIAM J. Comput. 22(4), 794---806 (1993)
[6]
Agarwal, P.K., Sharir, M.: Arrangements and their applications. In: Sack, J.-R., Urrutia, J. (eds.) Handbook of Computational Geometry, pp. 49---119. North-Holland, Amsterdam (2000)
[7]
Aggarwal, C.C. (ed.): Managing and Mining Uncertain Data. Advances in Database Systems, vol. 35. Springer, Berlin (2009)
[8]
Andoni, A., Indyk, P.: Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51(1), 117---122 (2008)
[9]
Arya, S., Malamatos, T., Mount, D.M.: Space---time tradeoffs for approximate nearest neighbor searching. J. ACM 57(54), Article No. 1 (2009).
[10]
Aurenhammer, F., Klein, R.: Voronoi diagrams. In: Sack, J.-R., Urrutia, J. (eds.) Handbook of Computational Geometry, pp. 201---290. North-Holland, Amsterdam (2000)
[11]
Aurenhammer, F., Klein, R., Lee, D.-T.: Voronoi Diagrams and Delaunay Triangulations. World Scientific, Hackensack (2013)
[12]
Beskales, G., Soliman, M.A., IIyas, I.F.: Efficient search for the top-$$k$$k probable nearest neighbors in uncertain databases. VLDB 1(1), 326---339 (2008)
[13]
Blömer, J.: Computing sums of radicals in polynomial time. In: Proceedings of the 32nd Annual Symposium on Foundations of Computer Science (FOCS'91), pp. 670---677. IEEE, Los Alamitos (1991)
[14]
Cabello, S.: Approximation algorithms for spreading points. J. Algorithms 62(2), 49---73 (2007)
[15]
Cabello, S., van Kreveld, M.J.: Approximation algorithms for aligning points. In: Proceedings of the 19th Annual Symposium on Computational Geometry (SCD'13), pp. 20---28. ACM, New York (2003)
[16]
Chan, T.M.: Low-dimensional linear programming with violations. SIAM J. Comput. 34(4), 879---893 (2005)
[17]
Chazelle, B.: On the convex layers of a planar set. IEEE Trans. Inf. Theory 31(4), 509---517 (1985)
[18]
Chazelle, B., Guibas, L.J.: Fractional cascading. I. A data structuring technique. Algorithmica 1(2), 133---162 (1986)
[19]
Cheng, R., Chen, J., Mokbel, M., Chow, C.: Probabilistic verifiers: evaluating constrained nearest-neighbor queries over uncertain data. In: Proceedings of the IEEE 24th International Conference on Data Engineering (ICDE'08), pp. 973---982. IEEE, Los Alamitos (2008)
[20]
Cheng, R., Chen, L., Chen, J., Xie, X.: Evaluating probability threshold k-nearest-neighbor queries over uncertain data. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT'09), pp. 672---683. ACM, New York (2009)
[21]
Cheng, R., Xie, X., Yiu, M.L., Chen, J., Sun, L.: UV-diagram: a Voronoi diagram for uncertain data. In: Proceedings of the IEEE 26th International Conference on Data Engineering (ICDE'10), pp. 796---807. IEEE, Los Alamitos (2010)
[22]
Cormode, G., Li, F., Yi, K.: Semantics of ranking queries for probabilistic data and expected ranks. In: Proceedings of the IEEE 25th International Conference on Data Engineering (ICDE'09), pp. 305---316. IEEE, Los Alamitos (2009)
[23]
Dalvi, N., Ré, C., Suciu, D.: Probabilistic databases: diamonds in the dirt. Commun. ACM 52(7), 86---94 (2009)
[24]
de Berg, M., Cheong, O., van Kreveld, M., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications, 2nd edn. Springer, Berlin (2000)
[25]
Driscoll, J.R., Sarnak, N., Sleator, D.D., Tarjan, R.E.: Making data structures persistent. J. Comput. Syst. Sci. 38(1), 86---124 (1989)
[26]
Guibas, L., Hershberger, J., Snoeyink, J.: Compact interval trees: a data structure for convex hulls. In: Proceedings of the 1st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'90), pp. 169---178. SIAM, Philadelphia (1990)
[27]
Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data (SIGMOD'84), pp. 47---57. ACM, New York (1984)
[28]
Har-Peled, S.: Geometric Approximation Algorithms. Mathematical Surveys and Monographs, vol. 173. American Mathematical Society, Providence (2011)
[29]
Har-Peled, S., Kumar, N.: Down the rabbit hole: robust proximity search and density estimation in sublinear space. SIAM J. Comput. 43(4), 1486---1511 (2014)
[30]
Hua, M., Pei, J., Zhang, W., Lin, X.: Ranking queries on uncertain data: a probabilistic threshold approach. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD'08), pp. 673---686. ACM, New York (2008)
[31]
Jooyandeh, M., Mohades, A., Mirzakhah, M.: Uncertain Voronoi diagram. Inf. Process. Lett. 109(13), 709---712 (2009)
[32]
Kamousi, P., Chan, T.M., Suri, S.: Closest pair and the post office problem for stochastic points. In: Dehne, F., Iacono, J., Sack, J.-R. (eds.) Algorithms and Data Structures (WADS'11). Lecture Notes in Computer Science, vol. 6844, pp. 548---559. Springer, Heidelberg (2011)
[33]
Kriegel, H., Kunath, P., Renz, M.: Probabilistic nearest-neighbor query on uncertain objects. In: Kotagiri, Ramamohanarao, et al. (eds.) Advances in Databases: Concepts, Systems and Applications (DASFAA'07). Lecture Notes in Computer Science, vol. 4443, pp. 337---348. Springer, Berlin (2007)
[34]
Li, Y., Li, F., Yi, K., Yao, B., Wang, M.: Flexible aggregate similarity search. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (SIGMOD'11), pp. 1009---1020. ACM, New York (2011)
[35]
Li, H., Lu, H., Huang, B., Huang, Z.: Two ellipse-based pruning methods for group nearest neighbor queries. In: Proceedings of the 13th Annual ACM International Workshop on Geographic Information Systems (GIS'05), pp. 192---199. ACM, New York (2005)
[36]
Li, F., Yao, B., Kumar, P.: Group enclosing queries. IEEE Trans. Knowl. Data Eng. 23(10), 1526---1540 (2011)
[37]
Lian, X., Chen, L.: Probabilistic group nearest neighbor queries in uncertain databases. IEEE Trans. Knowl. Data Eng. 20(6), 809---824 (2008)
[38]
Ljosa, V., Singh, A.K.: APLA: indexing arbitrary probability distributions. In: Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE'07), pp. 946---955. IEEE, Los Alamitos (2007)
[39]
Löffler, M., van Kreveld, M.: Largest bounding box, smallest diameter, and related problems on imprecise points. Comput. Geom. 43(4), 419---433 (2010)
[40]
Luo, Y., Chen, H., Furuse, K., Ohbo, N.: Efficient methods in finding aggregate nearest neighbor by projection-based filtering. In: Gervasi, O., Gavrilova, M.L. (eds.) Computational Science and Its Applications (ICCSA'07). Lecture Notes in Computer Science, vol. 4707, pp. 821---833. Springer, Berlin (2007)
[41]
Matoušek, J.: Reporting points in halfspaces. Comput. Geom. 2(3), 169---186 (1992)
[42]
Papadias, D., Shen, Q., Tao, Y., Mouratidis, K.: Group nearest neighbor queries. In: Proceedings of the 20th International Conference on Data Engineering (ICDE'04), pp. 301---312. IEEE, Washington, DC (2004)
[43]
Papadopoulos, A.N., Manolopoulos, Y.: Nearest Neighbor Search: A Database Perspective. Series in Computer Science. Springer, New York (2005)
[44]
Renz, M., Mamoulis, N., Emrich, T., Tang, Y., Cheng, R., Züfle, A., Zhang, P.: Voronoi-based nearest neighbor search for multi-dimensional uncertain databases. In: Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE'13), pp. 158---169. IEEE, Washington, DC (2013)
[45]
Sarnak, N., Tarjan, R.E.: Planar point location using persistent search trees. Commun. ACM 29(7), 669---679 (1986)
[46]
Sember, J., Evans, W.: Guaranteed Voronoi diagrams of uncertain sites. In: Proceedings of the 20th Canadian Conference on Computational Geometry (CCCG'08), pp. 207---210. CCCG (2008)
[47]
Shakhnarovich, G., Darrell, T., Indyk, P.: Nearest-neighbor searching and metric space dimensions. In: Nearest-Neighbor Methods for Learning and Vision: Theory and Practice, pp. 15---59. MIT Press, Cambridge, MA (2006)
[48]
Sharifzadeh, M., Shahabi, C.: VoR-tree: R-trees with Voronoi diagrams for efficient processing of spatial nearest neighbor queries. VLDB 3(1---2), 1231---1242 (2010)
[49]
Sharir, M., Agarwal, P.K.: Davenport---Schinzel Sequences and Their Geometric Applications. Cambridge University Press, Cambridge (1995)
[50]
Soliman, M.A., Ilyas, I.F., Chang, K.C.-.C.: Top-$$k$$k query processing in uncertain databases. In: Proceedings of the IEEE 23rd International Conference on Data Engineering (ICDE'07), pp. 896---905. IEEE, Los Alamitos (2007)
[51]
van Kreveld, M., Löffler, M., Mitchell, J.S.B.: Preprocessing imprecise points and splitting triangulations. SIAM J. Comput. 39(7), 2990---3000 (2010)
[52]
Wang, H., Zhang, W.: On top-$$k$$k weighted sum aggregate nearest and farthest neighbors in the $$L_1$$L1 plane. arXiv:1211.5084 (2012)
[53]
Yiu, M.L., Mamoulis, N., Papadias, D.: Aggregate nearest neighbor queries in road networks. IEEE Trans. Knowl. Data Eng. 17(6), 820---833 (2005)
[54]
Yuen, S.M., Tao, Y., Xiao, X., Pei, J., Zhang, D.: Superseding nearest neighbor search on uncertain spatial databases. IEEE Trans. Knowl. Data Eng. 22(7), 1041---1055 (2010)

Cited By

View all
  • (2023)Fréchet Distance for Uncertain CurvesACM Transactions on Algorithms10.1145/359764019:3(1-47)Online publication date: 14-Jul-2023
  • (2019)One-Dimensional r-Gathering Under UncertaintyAlgorithmic Aspects in Information and Management10.1007/978-3-030-27195-4_1(1-15)Online publication date: 6-Aug-2019
  • (2018)Range-max queries on uncertain dataJournal of Computer and System Sciences10.1016/j.jcss.2017.09.00694:C(118-134)Online publication date: 1-Jun-2018
  1. Nearest-Neighbor Searching Under Uncertainty I

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Discrete & Computational Geometry
    Discrete & Computational Geometry  Volume 58, Issue 3
    October 2017
    250 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Publication History

    Published: 01 October 2017

    Author Tags

    1. 68P05
    2. 68P10
    3. 68P20
    4. Approximate nearest neighbor $$(\mathop {\mathrm {ANN}})$$(ANN)
    5. Nearest-neighbor queries
    6. Queries on uncertain data

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 13 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)Fréchet Distance for Uncertain CurvesACM Transactions on Algorithms10.1145/359764019:3(1-47)Online publication date: 14-Jul-2023
    • (2019)One-Dimensional r-Gathering Under UncertaintyAlgorithmic Aspects in Information and Management10.1007/978-3-030-27195-4_1(1-15)Online publication date: 6-Aug-2019
    • (2018)Range-max queries on uncertain dataJournal of Computer and System Sciences10.1016/j.jcss.2017.09.00694:C(118-134)Online publication date: 1-Jun-2018

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media