
Seeing is Learning in High Dimensions: The Synergy Between Dimensionality Reduction and Machine Learning

Published: 21 February 2024

Abstract

High-dimensional data are a key study object for both machine learning (ML) and information visualization. On the visualization side, dimensionality reduction (DR) methods, also called projections, are the best-suited techniques for the visual exploration of large, high-dimensional datasets. On the ML side, high-dimensional data are generated and processed by classifiers and regressors, and these techniques increasingly require visualization for explanation and exploration. In this paper, we explore how both fields can help each other achieve their respective aims. In more detail, we present examples that show how DR can be used to understand and engineer better ML models (seeing helps learning), as well as applications of deep learning (DL) for improving the computation of direct and inverse projections (learning helps seeing). We also identify existing limitations of DR methods used to assist ML and of ML techniques applied to improve DR. Based on the above, we propose several high-impact directions for future work that exploit the analyzed ML-DR synergy.
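To make the "seeing helps learning" workflow concrete, the sketch below projects a labeled high-dimensional dataset to 2D and inspects its class structure visually. This is a minimal illustration only: the dataset (scikit-learn's digits), the DR method (t-SNE), and the parameter choices are illustrative assumptions, not the specific datasets, projection techniques, or pipelines evaluated in the paper.

```python
# Minimal sketch: visual exploration of high-dimensional labeled data via DR.
# Dataset, DR method (t-SNE), and parameter values are illustrative assumptions.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

X, y = load_digits(return_X_y=True)        # 1797 samples, 64 dimensions each
P = TSNE(n_components=2, perplexity=30,    # 2D projection: one point per sample
         random_state=0).fit_transform(X)

plt.scatter(P[:, 0], P[:, 1], c=y, cmap="tab10", s=8)
plt.title("t-SNE projection of the digits dataset (colored by class)")
plt.show()
```

A projection with visually well-separated clusters is the kind of evidence that, in the workflows discussed in the paper, can guide ML tasks such as labeling, feature engineering, and classifier inspection.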



Published In

SN Computer Science, Volume 5, Issue 3, March 2024, 750 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Received: 04 September 2023
Accepted: 30 December 2023
Published: 21 February 2024

Author Tags

1. Multidimensional projections
2. Visual quality metrics
3. Explainable AI

Qualifiers

• Research-article
