[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

PatZip: Pattern-Preserved Spatial Data Compression

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2005)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3518))

Included in the following conference series:

  • 2632 Accesses

Abstract

This paper presents a compression method, PatZip, to improve the efficiency of spatial pattern mining methods. PatZip can avoid overcompression and stop automatically before pattern is destroyed. Compared with existing compression methods, PatZip is deterministic and its result is reproducible, and original data can be easily recovered. The compression process is data-driven and parameter-free, and requires only O(nlogn) time for n data points.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Bradley, P.S., Fayyad, U., Reina, C.: Scaling clustering algorithms to large databases. In: Proc. of 4th International Conf. on Knowledge Discovery and Data Mining (KDD 1998), pp. 9–15 (1998)

    Google Scholar 

  2. Breunig, M.M., Kriegel, H., Kroger, P., Sander, J.: Data bubbles: quality preserving performance boosting for hierarchical clustering. In: Proc. of Int’l Conf. on Management of Data (SIGMOD 2001), pp. 79–90 (2001)

    Google Scholar 

  3. Clarkson, K.L.: Fast algorithms for the all nearest neighbors problem. In: Proc. of 24th Annual Symposium on Foundations of Computer Science, pp. 226–232 (1983)

    Google Scholar 

  4. DuMouchel, W., Volinsky, C., Johnson, T., Cortez, C., Pregibon, D.: Squashing Flat Files Flatter. In: Proc. of 5th Int’l Conf. on Knowledge Discovery and Data Mining (KDD 1999), pp. 6–15. AAAI press, Menlo Park (1999)

    Chapter  Google Scholar 

  5. Equitz, W.H.: A new vector quantization clustering algorithm. IEEE Trans. on Acoustics, Speech, and Signal Processing 37(10), 1568–1575 (1989)

    Article  Google Scholar 

  6. Franti, P., Kaukoranta, T., Shen, D., Chang, K.: Fast and memory efficient implementation of the exact PNN. IEEE Trans. on Image Processing 9(5), 773–777 (2000)

    Article  Google Scholar 

  7. Han, J., Kamber, M., Tung, A.K.H.: Spatial clustering methods in data mining: a survey. In: Miller, H., Han, J. (eds.) Geographic Data Mining and Knowledge Discovery. Taylor and Francis, Abington (2001)

    Google Scholar 

  8. Karypis, G., Han, E., Kumar, V.: CHAMELEON: a hierarchical clustering algorithm using dynamic modeling. IEEE Computer 32, 68–75 (1999)

    Google Scholar 

  9. Kollios, G., Gunopulos, D., Koudas, N., Berchtold, S.: Efficient biased sampling for approximate clustering and outlier detection in large data sets. IEEE Trans. on Knowledge and Data Eng. 15(5), 1–18 (2003)

    Article  Google Scholar 

  10. Linde, Y., Buzo, A., Gray, R.M.: An algorithm for vector quantizer design. IEEE Trans. on Commun. 28, 84–95 (1980)

    Article  Google Scholar 

  11. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proc. of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297 (1967)

    Google Scholar 

  12. Palmer, C.R., Faloutsos, C.: Density biased sampling: an improved method for data mining and clustering. In: Proc. of Int’l Conf. of Management of Data (SIGMOD 2000), pp. 82–92 (2000)

    Google Scholar 

  13. Qian, Y., Zhang, G., Zhang, K.: FAÇADE: a fast and effective approach to discovery of dense clusters in noisy spatial data (demo abstract). In: Proc. of Int’l Conf on Management of Data (SIGMOD 2004), pp. 921–922 (2004)

    Google Scholar 

  14. Qian, Y., Zhang, K.: GraphZip: A fast and automatic compression method for spatial data clustering. In: Proc. of 19th Annual ACM Symposium on Applied Computing (SAC 2004), pp. 571–575 (2004)

    Google Scholar 

  15. Vaidya, P.M.: An O(nlogn) algorithm for the all-nearest-neighbors problem. Discrete & Computational Geometry 4, 101–115 (1989)

    Article  MATH  MathSciNet  Google Scholar 

  16. Zhang, T., Ramakrishnan, R., Linvy, M.: BIRCH: an efficient data clustering method for very large databases. In: Proc. of Int’l Conf. on Management of Data (SIGMOD 1996), pp. 103–114 (1996)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qian, Y., Zhang, K., Huynh, D.T. (2005). PatZip: Pattern-Preserved Spatial Data Compression. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science(), vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_84

Download citation

  • DOI: https://doi.org/10.1007/11430919_84

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26076-9

  • Online ISBN: 978-3-540-31935-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics