[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/2093973.2093992acmconferencesArticle/Chapter ViewAbstractPublication PagesgisConference Proceedingsconference-collections
research-article

Similarity search on a large collection of point sets

Published: 01 November 2011 Publication History

Abstract

Spatial applications often require the ability to perform similarity search over a collection of point sets. For example, given a geographical distribution of a disease outbreak, find k historical outbreaks with similar spatial distributions from a data collection D. In this paper, we study the problem of similarity search over a collection of point sets using the Hausdorff distance, which is a measure commonly used to determine the maximum discrepancy between two point sets. To avoid computing the Hausdorff distance for all point sets S in D, one may compute an optimistic estimate (i.e., lower bound value) of the actual Hausdorff distance HausDist(Q,S) for each S to rule out sets that are obviously dissimilar to Q. In our investigation, we observed that a commonly used method (called BscLB) to compute an estimate may not produce a result which is indicative of the actual Hausdorff distance. Consequently, we propose a method (called EnhLB) which produces a tighter estimate than the existing one. We then formulate a similarity search algorithm which uses a combination of BscLB and EnhLB to find similar point sets efficiently. In addition, we also extend our method to support an outlier-resistant variant of the Hausdorff distance called the modified Hausdorff distance. We compare our proposed algorithm with an algorithm using only BscLB. The results of our experiments show a reduction in computation time of 72% for searches using the Hausdorff distance and a reduction of 53% using the modified Hausdorff distance.

References

[1]
H. Alt, B. Behrends, and J. Blömer. Approximate matching of polygonal shapes. In SCG, pages 186--193, 1991.
[2]
H. Alt, P. Brass, M. Godau, C. Knauer, and C. Wenk. Computing the Hausdorff distance of geometric patterns and shapes. Discrete and Comput. Geometry, 25:65--76, 2003.
[3]
W. G. Aref and H. Samet. Efficient processing of window queries in the pyramid data structure. In PODS, pages 265--272, 1990.
[4]
N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an efficient and robust access method for points and rectangles. In SIGMOD, pages 322--331, 1990.
[5]
M. J. Cafarella, A. Y. Halevy, Y. Zhang, D. Z. Wang, and E. Wu. Uncovering the relational web. In WebDB, 2008.
[6]
A. Corral, Y. Manolopoulos, Y. Theodoridis, and M. Vassilakopoulos. Closest pair queries in spatial databases. In SIGMOD, pages 189--200, 2000.
[7]
M. Guthe, P. Borodin, and R. Klein. Fast and accurate Hausdorff distance calculation between meshes. In WSCG, pages 41--48, 2005.
[8]
A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD, pages 47--57, 1984.
[9]
G. R. Hjaltason and H. Samet. Index-driven similarity search in metric spaces. ACM Trans. Database Syst., 28(4):517--580, 2003.
[10]
G. R. Hjaltason and H. Samet. Incremental distance join algorithms for spatial databases. In SIGMOD, pages 237--248, 1998.
[11]
G. R. Hjaltason and H. Samet. Distance browsing in spatial databases. ACM Trans. Database Syst., 24(2):265--318, 1999.
[12]
M. D. Lieberman, H. Samet, J. Sankaranarayanan, and J. Sperling. STEWARD: architecture of a spatio-textual search engine. In GIS, pages 186--193, 2007.
[13]
M. D. Lieberman, H. Samet, J. Sankaranaryanan, and J. Sperling. Spatio-textual spreadsheets: Geotagging via spatial coherence. In GIS, pages 524--527, 2009.
[14]
R. Lipikorn, A. Shimizu, and H. Kobatake. A modified ex-oskeleton and a Hausdorff distance matching algorithm for shape-based object recognition. In CISST, pages 507--511, 2003.
[15]
S. Nutanong, E. H. Jacox, and H. Samet. An incremental Hausdorff distance calculation algorithm. PVLDB, 4(8):506--517, 2011.
[16]
D. Papadias, Y. Tao, K. Mouratidis, and C. K. Hui. Aggregate nearest neighbor queries in spatial databases. ACM Trans. Database Syst., 30(2):529--576, 2005.
[17]
N. Roussopoulos, S. Kelley, and F. Vincent. Nearest neighbor queries. In SIGMOD, pages 71--79, 1995.
[18]
H. Samet. Foundations of Multidimensional and Metric Data Structures. Morgan-Kaufmann, San Francisco, 2006.
[19]
H. Shin, B. Moon, and S. Lee. Adaptive multi-stage distance join processing. In SIGMOD, pages 343--354, 2000.
[20]
M. Tang, M. Lee, and Y. J. Kim. Interactive Hausdorff distance computation for general polygonal models. In SIGGRAPH, pages 74:1--74:9, 2009.
[21]
B. Teitler, M. D. Lieberman, D. Panozzo, J. Sankaranarayanan, H. Samet, and J. Sperling. NewsStand: A new view on news. In GIS, pages 144--153, 2008.
[22]
C. Xia, H. Lu, B. C. Ooi, and J. Hu. Gorder: An efficient method for KNN join processing. In VLDB, pages 756--767, 2004.

Cited By

View all
  • (2022)Fast dataset search with earth mover's distanceProceedings of the VLDB Endowment10.14778/3551793.355181115:11(2517-2529)Online publication date: 29-Sep-2022
  • (2021)Recommending Geo-semantically Related Classes for Link DiscoveryJournal on Data Semantics10.1007/s13740-020-00117-49:4(151-177)Online publication date: 7-Jan-2021
  • (2019)A Spatial Insight for UGC Apps: Fast Similarity Search on Keyword-Induced Point Groups2019 20th IEEE International Conference on Mobile Data Management (MDM)10.1109/MDM.2019.00-26(371-372)Online publication date: Jun-2019
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
GIS '11: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
November 2011
559 pages
ISBN:9781450310314
DOI:10.1145/2093973
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 November 2011

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. query processing
  2. similarity search
  3. spatial databases

Qualifiers

  • Research-article

Funding Sources

Conference

GIS '11
Sponsor:

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)1
Reflects downloads up to 25 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Fast dataset search with earth mover's distanceProceedings of the VLDB Endowment10.14778/3551793.355181115:11(2517-2529)Online publication date: 29-Sep-2022
  • (2021)Recommending Geo-semantically Related Classes for Link DiscoveryJournal on Data Semantics10.1007/s13740-020-00117-49:4(151-177)Online publication date: 7-Jan-2021
  • (2019)A Spatial Insight for UGC Apps: Fast Similarity Search on Keyword-Induced Point Groups2019 20th IEEE International Conference on Mobile Data Management (MDM)10.1109/MDM.2019.00-26(371-372)Online publication date: Jun-2019
  • (2018)Fast similarity search on keyword-induced point groupsProceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/3274895.3274920(109-118)Online publication date: 6-Nov-2018
  • (2018)Spatio-textual user matching and clustering based on set similarity joinsThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-018-0498-527:3(297-320)Online publication date: 1-Jun-2018
  • (2018)Recommending Spatial Classes for Entity Interlinking in the Web of DataThe Semantic Web: ESWC 2018 Satellite Events10.1007/978-3-319-98192-5_42(225-239)Online publication date: 3-Jun-2018
  • (2017)Location-Based Distance Measures for Geosocial SimilarityACM Transactions on the Web10.1145/305495111:3(1-32)Online publication date: 3-Jul-2017
  • (2016)Automatic user identification method across heterogeneous mobility data sources2016 IEEE 32nd International Conference on Data Engineering (ICDE)10.1109/ICDE.2016.7498306(978-989)Online publication date: May-2016
  • (2012)U2STRAProceedings of the 2012 ACM workshop on City data management workshop10.1145/2390226.2390229(5-12)Online publication date: 29-Oct-2012
  • (2011)Searching web documents as location setsProceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/2093973.2094056(489-492)Online publication date: 1-Nov-2011

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media