Abstract
An estimated 2.5 quintillion bytes of data are created every day. This data explosion, together with new data types and objects and the wide use of social media networks (an estimated 3.8 billion users worldwide), makes exploiting and manipulating data with relational databases cumbersome and problematic. NoSQL databases introduce new capabilities that aim to improve on the functionality offered by traditional SQL DBMSs. This paper elaborates on ongoing research regarding NoSQL databases, focusing on the background behind their development, their basic characteristics, their categorization, and their noticeable increase in popularity. The functional advantages and data mining capabilities that come with the use of graph databases are also presented. Common data mining tasks on graphs are presented, with a view to facilitating implementation and efficiency. The aim is to highlight the concepts necessary for combining data mining techniques with graph database functionality, eventually proposing an analytical framework that offers a plethora of domain-specific analytics. One example is a virus outbreak analytics framework that allows health and government officials to make appropriate decisions.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Rousidis, D., Koukaras, P., Tjortjis, C. (2021). Examination of NoSQL Transition and Data Mining Capabilities. In: Garoufallou, E., Ovalle-Perandones, MA. (eds) Metadata and Semantic Research. MTSR 2020. Communications in Computer and Information Science, vol 1355. Springer, Cham. https://doi.org/10.1007/978-3-030-71903-6_11
DOI: https://doi.org/10.1007/978-3-030-71903-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71902-9
Online ISBN: 978-3-030-71903-6
eBook Packages: Computer Science (R0)