
Seeing is Learning in High Dimensions: The Synergy Between Dimensionality Reduction and Machine Learning

Published: 21 February 2024

Abstract

High-dimensional data are a key study object for both machine learning (ML) and information visualization. On the visualization side, dimensionality reduction (DR) methods, also called projections, are the best-suited techniques for the visual exploration of large, high-dimensional datasets. On the ML side, high-dimensional data are generated and processed by classifiers and regressors, and these techniques increasingly require visualization for explanation and exploration. In this paper, we explore how both fields can help each other achieve their respective aims. In more detail, we present examples that show how DR can be used to understand and engineer better ML models (seeing helps learning), as well as applications of deep learning (DL) for improving the computation of direct and inverse projections (learning helps seeing). We also identify existing limitations of DR methods used to assist ML and of ML techniques applied to improve DR. Based on the above, we propose several high-impact directions for future work that exploit the analyzed ML-DR synergy.
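To make the "seeing helps learning" workflow concrete, the sketch below projects a labeled high-dimensional dataset to 2D and inspects its class structure visually. This is a minimal illustration only: the dataset (scikit-learn's digits), the DR method (t-SNE), and the parameter choices are illustrative assumptions, not the specific datasets, projection techniques, or pipelines evaluated in the paper.

```python
# Minimal sketch: visual exploration of high-dimensional labeled data via DR.
# Dataset, DR method (t-SNE), and parameter values are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

X, y = load_digits(return_X_y=True)        # 1797 samples, 64 dimensions each
P = TSNE(n_components=2, perplexity=30,    # 2D projection: one point per sample
         random_state=0).fit_transform(X)

plt.scatter(P[:, 0], P[:, 1], c=y, cmap="tab10", s=8)
plt.title("t-SNE projection of the digits dataset (colored by class)")
plt.show()
```

A projection with visually well-separated clusters is the kind of evidence that, in the workflows discussed in the paper, can guide ML tasks such as labeling, feature engineering, and classifier inspection.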



Published In

SN Computer Science, Volume 5, Issue 3, March 2024, 750 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Received: 04 September 2023
Accepted: 30 December 2023
Published: 21 February 2024

Author Tags

1. Multidimensional projections
2. Visual quality metrics
3. Explainable AI

Qualifiers

• Research-article
