Data Modeling and NoSQL Databases - A Systematic Mapping Review

Published: 13 July 2021

Abstract

Modeling is one of the most important steps in developing a database. In traditional databases, the Entity Relationship (ER) and Unified Modeling Language (UML) models are widely used. But how are NoSQL databases being modeled? We performed a systematic mapping review to answer three research questions that identify and analyze the levels of representation, the models used, and the contexts in which the modeling process occurred in the main categories of NoSQL databases. We found 54 primary studies, in which the conceptual and logical levels received more attention than the physical level of representation. UML, ER, and new notations based on ER and UML were adapted to model NoSQL databases; similarly, formats such as JSON, XML, and XMI were used to generate schemas across the three levels of representation. New contexts such as benchmarks, evaluations, migration, and schema generation were identified, as well as new features to be considered when modeling NoSQL databases, such as the number of records per entity, CRUD operations, and system requirements (availability, consistency, or scalability). Additionally, a coupling and co-citation analysis was carried out to identify relevant works and researchers.

    Published In

    ACM Computing Surveys  Volume 54, Issue 6
    Invited Tutorial
    July 2022
    799 pages
    ISSN:0360-0300
    EISSN:1557-7341
    DOI:10.1145/3475936
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 13 July 2021
    Accepted: 01 March 2021
    Revised: 01 January 2021
    Received: 01 December 2019
    Published in CSUR Volume 54, Issue 6

    Author Tags

    1. Data modeling
    2. NoSQL databases
    3. Systematic mapping

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001
