Abstract
Visualizing multidimensional Big Data is challenging: high dimensionality hinders or even precludes visual inspection. One way to tackle this issue is to use Dimensionality Reduction (DR) techniques, which produce low-dimensional representations of high-dimensional data. Popular DR algorithms (e.g., Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding), albeit helpful, are computationally expensive. Most have \(\mathcal{O}(n^2)\) or \(\mathcal{O}(n^3)\) Asymptotic Time Complexity (ATC) and/or calculate pairwise distances over the entire data set, which can exceed available memory and render Big Data DR time-consuming or impracticable. These issues impede the use of DR in online learning applications, where recurrent, cumulative model updates are routine. The stochastic nature of some approaches likewise obstructs any meaningful inspection of how knowledge is spatially arranged. The recently introduced Polygonal Coordinate System (PCS), an incremental, geometry-based technique with linear ATC, is compelling; however, its restriction to 2-D embeddings entails significant information loss. We propose the Big Data ready, incremental Pyramidal Embedding System (PES), which builds on the strengths of PCS by additionally generating 3-D embeddings through its pyramid-like interspace, mitigating quality degradation. Visual inspections, as well as statistical analyses based on pairwise distances, validate the ability of PES to retain structural arrangements when embedding high- and low-dimensional data while remaining flexible in resource consumption.
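To make the memory argument in the abstract concrete, the following minimal sketch estimates how much storage a full pairwise distance matrix would need as the number of samples grows. It is an illustration only, not part of PES or PCS: the function names, the 8-byte (float64) value size, and the sample counts are assumptions chosen for the example.

```python
def pairwise_matrix_bytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold a dense n x n distance matrix.

    Assumes one float64 (8 bytes) per entry; this is an illustrative default.
    """
    return n_samples * n_samples * bytes_per_value


def condensed_matrix_bytes(n_samples: int, bytes_per_value: int = 8) -> int:
    """Bytes needed when only the n(n-1)/2 unique pairs are stored."""
    return n_samples * (n_samples - 1) // 2 * bytes_per_value


if __name__ == "__main__":
    # Hypothetical data set sizes, chosen only to show the quadratic growth.
    for n in (10_000, 100_000, 1_000_000):
        dense_gib = pairwise_matrix_bytes(n) / 2**30
        condensed_gib = condensed_matrix_bytes(n) / 2**30
        print(f"n={n:>9,}: dense {dense_gib:,.1f} GiB, "
              f"condensed {condensed_gib:,.1f} GiB")
```

Even the condensed form grows quadratically with the number of samples, which is why techniques that avoid pairwise distances altogether are attractive for incremental, online settings.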
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Barreto, A., Moreira, I., Flexa, C., Cardoso, E., Sales, C. (2020). An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization. In: Cerri, R., Prati, R.C. (eds) Intelligent Systems. BRACIS 2020. Lecture Notes in Computer Science, vol 12320. Springer, Cham. https://doi.org/10.1007/978-3-030-61380-8_20
DOI: https://doi.org/10.1007/978-3-030-61380-8_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61379-2
Online ISBN: 978-3-030-61380-8
eBook Packages: Computer Science, Computer Science (R0)