[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Skip to main content
Log in

Mining spatial colocation patterns: a different framework

  • Published:
Data Mining and Knowledge Discovery Aims and scope Submit manuscript

Abstract

Recently, there has been considerable interest in mining spatial colocation patterns from large spatial datasets. Spatial colocation patterns represent the subsets of spatial events whose instances are often located in close geographic proximity. Most studies of spatial colocation mining require the specification of two parameter constraints to find interesting colocation patterns. One is a minimum prevalent threshold of colocations, and the other is a distance threshold to define spatial neighborhood. However, it is difficult for users to decide appropriate threshold values without prior knowledge of their task-specific spatial data. In this paper, we propose a different framework for spatial colocation pattern mining. To remove the first constraint, we propose the problem of finding N-most prevalent colocated event sets, where N is the desired number of colocated event sets with the highest interest measure values per each pattern size. We developed two alternative algorithms for mining the N-most patterns. They reduce candidate events effectively and use a filter-and-refine strategy for efficiently finding colocation instances from a spatial dataset. We prove the algorithms are correct and complete in finding the N-most prevalent colocation patterns. For the second constraint, a distance threshold for spatial neighborhood determination, we present various methods to estimate appropriate distance bounds from user input data. The result can help an user to set a distance for a conceptualization of spatial neighborhood. Our experimental results with real and synthetic datasets show that our algorithmic design is computationally effective in finding the N-most prevalent colocation patterns. The discovered patterns were different depending on the distance threshold, which shows that it is important to select appropriate neighbor distances.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Similar content being viewed by others

Explore related subjects

Discover the latest articles, news and stories from top researchers in related subjects.

References

  • Agarwal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the international conference on very large databases (VLDB)

  • Appice A, Ceci M, Lanza A (2003) Discovery of spatial association rules in geo-referenced census data: a relational mining approach. In: Proceedings of the intelligent data analysis

  • Arshad MU, Ayyaz MN (2006) Mining N-most interesting itemsets using support-ordered tries. In: IEEE international conference on computer systems and applications

  • Bailey T, Gaterell A (1995) Interactive spatial data analysis. Longman Scientific & Technical, London

    Google Scholar 

  • Bembenik R, Rybinski H (2008) Mining spatial association rules with no distance parameter. In: Proceedings of the intelligent information processing and web mining, pp 499–508

  • Berg M, Kreveld M, Overmars M, Schwarzkopf O (2000) Computational geometry. Springer, Berlin

    MATH  Google Scholar 

  • Castro V, Murray A (1998) Discovering associations in spatial data—an efficient medoid based approach. In: Proceedings of the international Pacific-Asia conference on knowledge discovery and data mining (PAKDD)

  • Ceci M, Appice A, Malerba D (2004) Spatial associative classification at different levels of granularity: a probabilistic approach. Lecture notes in computer science, vol 3202. Springer, Berlin

  • Chelghoum N, Zeitouni K (2004) Spatial data mining implementation—alternative and performances. In: GeoInfo

  • Cheung Y, Fu AW (2004) Mining frequent itemsets without support threshold: with and without item constraints. IEEE Trans Knowl Data Eng 16(9): 1052–1069

    Article  Google Scholar 

  • Cormen T, Leiserson C, Rivest R, Stein C (2003) Introduction to algorithms. McGraw-Hill Science, New York

    Google Scholar 

  • Cover T (1995) Nearest neighbor pattern classification. Knowl Based Syst 6(8): 373–389

    Google Scholar 

  • Cressie N (1993) Statistics for spatial data. Wiley, New York

    Google Scholar 

  • Crimestat. http://www.icpsr.umich.edu/CrimaStat/

  • Ding W, Jiamthapthaksin1 R, Parmar R, Jiang D, Stepinski TF, Eick CF (2008) Towards region discovery in spatial datasets. In: Proceedings of the international Pacific-Asia conference on knowledge discovery and data mining (PAKDD)

  • Easter MJ (1991) Reasoning about binary topological relations. In: Proceedings of the international symposium on advances in spatial databases

  • Easter M, Kriegel H, Sander J (1999) Knowledge discovery in spatial databases. In: Proceedings of the international conference on artificial intelligence

  • Eick CF, Parmar R, Ding W, Stepinski TF, Nicot J (2008) Finding regional co-location patterns for sets of continuous variables in spatial datasets. In: Proceedings of the 16th ACM SIGSPATIAL international conference on advances in geographic information systems (ACM–GIS)

  • ESRI. Arcgis. http://www.esri.com/software/arcgis/index.html

  • Frank R, Easter M, Knobbe A (2009) A multi-relational approach to spatial classification. In: Proceedings of the ACM international conference on knowledge discovery and data mining (KDD)

  • Fu AW, Kwong RW, Tang J (2000) Mining N-most interesting itemsets. In: International symposium on methodologies for intelligent systems

  • Goodchild M (1986) Spatial autocorrelation. Geo Books, Norwich

    Google Scholar 

  • Griffith D (1987) Spatial autocorrelation: a primier. Resource publications in geography. Association of American Geographers, Washington, DC

    Google Scholar 

  • Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the ACM SIGMOD conference on management of data

  • Hirate Y, Iwahashi E, Yamana H (2004) TF 2 P-growth: an efficient algorithm for mining frequent patterns without any thresholds. In: Proceedings of the IEEE ICDM’04 workshop on alternative techniques for data mining and knowledge discovery

  • Inokuchi A, Washio T, Motoda H (2000) An apriori-based algorithm for mining frequent substructures from graph data. In: Proceedings of the European conference on principles of data mining and knowledge discovery

  • Koperski K, Han J (1995) Discovery of spatial association rules in geographic information databases. In: Proceedings of international symposium on large spatial data bases, Maine, pp 47–66

  • Kuramochi M, Karypis G (2001) Frequent subgraph discovery. In: Proceeding of IEEE international conference on data mining

  • Li F, Cheng D, Hadjieleftheriou M, Kollios G, Teng S (2005) On trip planning queries in spatial databases. In: Proceedings of the 9th international symposium on advances in spatial and temporal databases (SSTD)

  • Miller HJ (2006) Geographic data mining and knowledge discovery. In: Wilson JP, Fotheringham AS (eds) Handbook of geographic information science. Blackwell, Oxford

  • Miller HJ, Han J (2001) Geographic data mining and knowledge discovery. Taylor and Francis, London

    Book  Google Scholar 

  • Morimoto Y (2001) Mining frequent neighboring class sets in spatial databases. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining

  • Papadias D, Theodoridis Y (1997) Spatial relations, minimum bounding rectangles, and spatial data structures. Int J Geogr Inf Sci 11: 111–138

    Article  Google Scholar 

  • R for Statistical Computing. http://www.r-project.org/index.html

  • Ripley B (1976) The second-order analysis of stationary point process. J Appl Probab 13: 255–266

    Article  MATH  MathSciNet  Google Scholar 

  • Roddick J, Spiliopoulou M (1999) A bibliography of temporal, spatial and spatio-temporal data mining research. Proc SIGKDD Explor 1(1): 34–38

    Article  Google Scholar 

  • Santos MY, Amaral LA (2005) Geo-spatial data mining in the analysis of a demographic database. Soft Comput Fusion Found Methodol Appl 9(5): 374–384

    Google Scholar 

  • Schlossberg M (2003) GIS, the US census and neighborhood scale analysis. Plan Pract Res 18(2–3):213–218

    Google Scholar 

  • Shekhar S, Chawla S (2003) Spatial databases: a tour. Prentice Hall, Upper Saddle River

    Google Scholar 

  • Shekhar S, Huang Y (2001) Co-location rules mining: a summary of results. In: Proceedings of the international symposium on spatio and temporal database (SSTD)

  • Shekhar S, Zhang P, Huang Y, Vatsavai R (2004) Trends in spatial data mining. In: Kargupta H, Joshi A, Sivakumar K, Yesha Y (eds) Data mining: next generation challenges and future directions. AAAI/MIT Press, Menlo Park

  • Spatstat. http://www.spatstat.org/spatstat/

  • Tobler W (1970) A computer movie simulating urban growth in the detroit region. Econ Geogr 46: 234–240

    Article  Google Scholar 

  • Toots B, Getis A (1998) Point pattern analysis. Sage Publications, Newbury Park

    Google Scholar 

  • U. G. Survey. http://www.usgs.gov/

  • U.S.E.P. Agency. Environmental Interest Type. http://www.epa.gov/enviro/html/frs_demo/interest_types.pdf

  • U.S. EPA (Environmental Protection Agency) FRS (Facility Registry System) facilities. http://www.epa.gov/enviro/html/frs_demo/geospatial_data/geo_data_state_single.html

  • Wan Y, Zhou J (2008) Knfcom-t: a k-nearest features-based co-location pattern mining algorithm for large spatial data sets by using t-trees. Int J Bus Intell Data Min 3(4): 375–389

    Article  MathSciNet  Google Scholar 

  • Wang J, Han J, Lu Y, Tzvetkov P (2005) TFP: an efficient algorithm for mining top-k frequent closed itemsets. IEEE Trans Knowl Data Eng 17(5): 652–664

    Article  Google Scholar 

  • Xiao X, Xie X, Luo Q, Ma W (2008) Density based co-location pattern discovery. In: Proceedings of the 16th ACM SIGSPATIAL international conference on advances in geographic incormation systems (ACM-GIS)

  • Xiong H, Shekhar S, Huang Y, Kumar V, Ma X, Yoo JS (2004) A framework for discovering co-location patterns in data sets with extended spatial objects. In: Proceedings of the SIAM international conference on data mining (SDM)

  • Yan X, Han J (2001) gSpan: graph-baseed substructure pattern mining. In: Proceedings of IEEE international conference on data mining

  • Yoo JS, Bow M (2009) Finding N-most prevalent colocated event sets. In: Proceedings of the international conference on data warehousing and knowledge discovery (DaWak)

  • Yoo JS, Shekhar S (2004) A partial join approach for mining co-location patterns. In: Proceedings of the ACM international symposium on advances in geographic information systems (ACM-GIS)

  • Yoo JS, Shekhar S (2006) A join-less approach for mining spatial co-location patterns. IEEE Trans Knowl Data Eng 18(10): 1323–1337

    Article  Google Scholar 

  • Zhang X, Mamoulis N, Cheung DW, Shouk Y (2004) Fast mining of spatial collocations. In: Proceedings of the ACM international conference on knowledge discovery and data mining (KDD)

  • Zou L, Chen L, Lu Y (2000) Top-k subgraph matching query in a large graph. In: Proceedings of the conference on information and knowledge management

  • Zou S, Zhao Y, Guan J, Huang J (2005) A neighborhood-based clustering algorithm. In: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jin Soung Yoo.

Additional information

Responsible editor: M.J. Zaki.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Yoo, J.S., Bow, M. Mining spatial colocation patterns: a different framework. Data Min Knowl Disc 24, 159–194 (2012). https://doi.org/10.1007/s10618-011-0223-0

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10618-011-0223-0

Keywords

Navigation