DOI: 10.1145/1562849.1562859
research-article

Heidi matrix: nearest neighbor driven high dimensional data visualization

Published: 28 June 2009

Abstract

Identifying patterns in large, high-dimensional data sets is a challenge. As the number of dimensions increases, patterns tend to be more prominent in subspaces than in the original full-dimensional space. Understanding such data therefore requires a system that can present these subspace-oriented patterns.
Heidi is a high-dimensional data visualization system that captures and visualizes the closeness of points across various subspaces of the dimensions, thereby helping users understand the data. Its core concept is the prominence of patterns in the nearest-neighbor relations between pairs of points across the subspaces.
Given a d-dimensional data set as input, the Heidi system generates a 2-D matrix represented as a color image. This representation gives insight into (i) how the clusters are placed with respect to each other, (ii) how points are placed within a cluster in all the subspaces, and (iii) the characteristics of overlapping clusters in various subspaces.
A sample of results displayed and discussed in this paper illustrates how the Heidi visualization can be interpreted.
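
The abstract does not spell out how the matrix is built, so the following is only a minimal sketch of one plausible reading: each cell (i, j) stores one bit per subspace, set when point j is among the k nearest neighbors of point i in that subspace, and the bit pattern is then mapped to a color. The function names, the choice of k nearest neighbors with squared Euclidean distance, the enumeration of all non-empty subspaces, and the color mapping are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def subspaces(d):
    """Yield every non-empty subset of the d dimension indices."""
    for size in range(1, d + 1):
        yield from combinations(range(d), size)


def knn_in_subspace(points, dims, k):
    """Return, for each point, the index set of its k nearest neighbors,
    using squared Euclidean distance over the dimensions in `dims` only."""
    neighbors = []
    for i, p in enumerate(points):
        dist = []
        for j, q in enumerate(points):
            if i != j:
                dist.append((sum((p[a] - q[a]) ** 2 for a in dims), j))
        dist.sort()  # ties are broken by the smaller point index
        neighbors.append({j for _, j in dist[:k]})
    return neighbors


def heidi_like_matrix(points, k):
    """Build an n x n integer matrix; bit s of cell (i, j) is set when
    point j is a k-nearest neighbor of point i in subspace number s."""
    n = len(points)
    d = len(points[0])
    matrix = [[0] * n for _ in range(n)]
    for s, dims in enumerate(subspaces(d)):
        nn = knn_in_subspace(points, dims, k)
        for i in range(n):
            for j in nn[i]:
                matrix[i][j] |= 1 << s
    return matrix


def to_rgb(value, n_subspaces):
    """Map a bit pattern to an (r, g, b) triple by scaling it onto the
    24-bit color range. This palette is arbitrary and purely illustrative."""
    scaled = value * 0xFFFFFF // ((1 << n_subspaces) - 1)
    return (scaled >> 16) & 0xFF, (scaled >> 8) & 0xFF, scaled & 0xFF


if __name__ == "__main__":
    # Two tight pairs of 2-D points, far apart from each other.
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    m = heidi_like_matrix(data, k=1)
    n_sub = 2 ** len(data[0]) - 1
    for row in m:
        print([to_rgb(v, n_sub) for v in row])
```

If the rows and columns are ordered so that points of the same cluster are adjacent, cells within a cluster form blocks along the diagonal, which is one way such an image could expose the cluster placement described in (i) to (iii). Enumerating all subspaces grows exponentially with d, so a sketch like this is only practical for small d.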

Published In

VAKD '09: Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
June 2009
92 pages
ISBN: 9781605586700
DOI: 10.1145/1562849
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD09

Cited By

  • (2022) Unsupervised DeepView: Global Explainability of Uncertainties for High Dimensional Data. 2022 IEEE International Conference on Knowledge Graph (ICKG), pages 196-202. DOI: 10.1109/ICKG55886.2022.00032. Online publication date: Nov-2022
  • (2022) Unsupervised DeepView: Global Uncertainty Visualization for High Dimensional Data. 2022 IEEE International Conference on Data Mining Workshops (ICDMW), pages 1-8. DOI: 10.1109/ICDMW58026.2022.00086. Online publication date: Nov-2022
  • (2013) Efficient Visualization of Large-Scale Data Tables through Reordering and Entropy Minimization. 2013 IEEE 13th International Conference on Data Mining, pages 121-130. DOI: 10.1109/ICDM.2013.63. Online publication date: Dec-2013
  • (2012) Subspace search and visualization to make sense of alternative clusterings in high-dimensional data. Proceedings of the 2012 IEEE Conference on Visual Analytics Science and Technology (VAST), pages 63-72. DOI: 10.1109/VAST.2012.6400488. Online publication date: 14-Oct-2012
  • (2011) Heidi visualization of R-tree structures over high dimensional data. Proceedings of the 23rd international conference on Scientific and statistical database management, pages 565-567. DOI: 10.5555/2032397.2032447. Online publication date: 20-Jul-2011
  • (2011) Heidi Visualization of R-tree Structures over High Dimensional Data. Scientific and Statistical Database Management, pages 565-567. DOI: 10.1007/978-3-642-22351-8_39. Online publication date: 2011
  • (2010) Visual knowledge representation of conceptual semantic networks. Social Network Analysis and Mining, 1:3, pages 219-229. DOI: 10.1007/s13278-010-0008-2. Online publication date: 25-Nov-2010
  • (2009) BEADS: High dimensional data cluster visualization. 2009 IEEE Symposium on Visual Analytics Science and Technology, pages 235-236. DOI: 10.1109/VAST.2009.5333417. Online publication date: Oct-2009
