A Workload-Driven Document Database Schema Recommender (DBSR)

  • Conference paper
Conceptual Modeling (ER 2020)

Abstract

Database schema design requires careful consideration of the application’s data model, workload, and target database technology to optimize for performance and data size. Traditional normalization schemes used in relational databases minimize data redundancy, whereas NoSQL document-oriented databases favor redundancy and optimize for horizontal scalability and performance.

Systematic NoSQL schema design involves multiple dimensions, and a database designer is in practice required to carefully consider (i) which data elements to copy and co-locate, (ii) which data elements to normalize, and (iii) how to encode data, while taking into account factors such as the workload and data model.

In this paper, we present a workload-driven document database schema recommender (DBSR), which takes a systematic, search-based approach to exploring the complex schema design space. The recommender takes as its main inputs the application’s data model and its read workload, and outputs (i) the suggested document schema (featuring secondary indexing), (ii) query plan recommendations, and (iii) a document utility matrix that encodes insights on their respective costs and relative utility. We evaluate the recommended schemas in MongoDB using YCSB and show significant benefits to read query performance.
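
As a rough illustration of what "workload-driven" schema selection can mean, the sketch below compares two candidate designs for a single relationship (embedding versus referencing) against a small read workload. It is not the DBSR algorithm: the entity names, the query list, the frequencies, and the cost function (documents fetched per query) are all hypothetical, and storage size and indexing are ignored.

```python
# Toy sketch only; not the DBSR implementation. All names and numbers
# below are hypothetical and chosen purely for illustration.

# Read workload: (query name, entities the query touches, relative frequency).
WORKLOAD = [
    ("view_item_with_seller", ("Item", "User"), 70),
    ("view_user_profile", ("User",), 30),
]

# Candidate design decisions for the Item -> User relationship.
CHOICES = ("embed", "reference")


def read_cost(query_entities, choice):
    """Toy cost: number of document fetches needed to answer one query."""
    if len(query_entities) == 1:
        return 1
    # Two entities: embedding co-locates them in one document,
    # referencing requires one additional lookup.
    return 1 if choice == "embed" else 2


def workload_cost(choice):
    """Frequency-weighted read cost of the whole workload for one design."""
    return sum(freq * read_cost(entities, choice) for _, entities, freq in WORKLOAD)


def recommend():
    """Exhaustively score every candidate and return the cheapest one."""
    costs = {choice: workload_cost(choice) for choice in CHOICES}
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    best, costs = recommend()
    print(best, costs)
```

With only one relationship the search is a trivial enumeration of two options. A real data model has many relationships whose decisions interact, which is why the design space grows quickly and a systematic search becomes useful.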

Notes

  1. DBSR framework repository: https://github.com/vreniers/DBSR.

  2. Benchmark repository: https://github.com/vreniers/YCSB-MongoDB-RUBiS.

Acknowledgments

This work has been funded by the KU Leuven Research Fund.

Author information

Corresponding author

Correspondence to Vincent Reniers.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Reniers, V., Van Landuyt, D., Rafique, A., Joosen, W. (2020). A Workload-Driven Document Database Schema Recommender (DBSR). In: Dobbie, G., Frank, U., Kappel, G., Liddle, S.W., Mayr, H.C. (eds) Conceptual Modeling. ER 2020. Lecture Notes in Computer Science, vol 12400. Springer, Cham. https://doi.org/10.1007/978-3-030-62522-1_35

  • DOI: https://doi.org/10.1007/978-3-030-62522-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62521-4

  • Online ISBN: 978-3-030-62522-1

  • eBook Packages: Computer Science, Computer Science (R0)
