Disk Space Consumption by Triple Storage Systems

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 556))

Included in the following conference series:

Novel & Intelligent Digital Systems Conferences

Abstract

Resource Description Framework (RDF) is a widespread standard and a flexible method of data representation. RDF storage systems are actively used for storing, sharing, and publishing RDF data on the Internet. RDF models are used in business applications and by research teams who wish to share their data with the community. Most RDF stores are optimized for queries, but this is usually achieved at the cost of increased disk space consumption. Renting a dedicated server with a large volume of local storage is quite expensive, especially for small research teams and business startups, which makes disk space consumption an important factor in choosing a data store. In this study, we compared the disk space usage of four popular triple stores that can serve as SPARQL (an SQL-like query language) endpoints, depending on the amount and structure of the loaded RDF data. To the best of our knowledge, no previous work has compared the disk space occupied by triple stores. We found that the compared open-source solutions, Apache Jena Fuseki and Parliament, consume large amounts of hard disk space and should be used with caution in resource-limited environments. The data structure (one large graph or many smaller named graphs) strongly affected Parliament's disk space usage, so it should also be taken into account when selecting an RDF store. Free versions of commercial systems show adequate disk consumption and appear to be weakly dependent on data structure, but Ontotext GraphDB is deliberately limited in performance, and Stardog is limited in license term and may need additional manual maintenance.

The reported study was funded by RFBR, project number 20-07-00764.

Notes

1. https://jena.apache.org/documentation/fuseki2.
2. https://docs.stardog.com.
3. https://graphdb.ontotext.com.
4. https://github.com/SemWebCentral/parliament/releases.
5. https://graphdb.ontotext.com/documentation/9.11/pdf/GraphDB-Free.pdf.
6. MiB: 1 mebibyte equals \(1024^2\) bytes.
7. RDF serialization format: https://www.w3.org/TR/turtle.
8. https://www.w3.org/wiki/RdfStoreBenchmarking.
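
Since disk usage is reported in MiB, the following minimal Python sketch shows one way the on-disk size of a triple store's data directory could be measured and converted to mebibytes (1 MiB = 1024^2 = 1,048,576 bytes). It is an illustration only, not the measurement procedure used in the study: the function names `directory_size_bytes` and `bytes_to_mib`, the file name `index.dat`, and the idea of walking a single data directory are all assumptions.

```python
import os
import tempfile

BYTES_PER_MIB = 1024 ** 2  # 1 MiB = 1,048,576 bytes


def directory_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all regular files under root.

    Symbolic links are skipped so that linked files are not counted.
    Apparent size can differ from the space actually allocated on disk
    (for example, with sparse files or filesystem compression).
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / BYTES_PER_MIB


if __name__ == "__main__":
    # Hypothetical stand-in for a store's data directory: a temporary
    # folder holding one 3 MiB file.
    with tempfile.TemporaryDirectory() as store_dir:
        with open(os.path.join(store_dir, "index.dat"), "wb") as fh:
            fh.write(b"\0" * (3 * BYTES_PER_MIB))
        size_mib = bytes_to_mib(directory_size_bytes(store_dir))
        print(f"{size_mib:.2f} MiB")
```

In practice, `store_dir` would point at the directory where a given store keeps its files. Measuring it before and after loading a dataset gives the space attributable to that dataset.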

Author information

Authors and Affiliations

Volgograd State Technical University, Lenin Avenue, 28, Volgograd, 400005, Russia
Artem Prokudin, Mikhail Denisov & Oleg Sychev

Corresponding author

Correspondence to Oleg Sychev.

Editor information

Editors and Affiliations

University of West Attica, Aigaleo, Greece
Akrivi Krouska
University of West Attica, Aigaleo, Greece
Christos Troussas
College of Engineering, University of the Philippines, Diliman, Philippines
Jaime Caro

About this paper

Cite this paper

Prokudin, A., Denisov, M., Sychev, O. (2023). Disk Space Consumption by Triple Storage Systems. In: Krouska, A., Troussas, C., Caro, J. (eds) Novel & Intelligent Digital Systems: Proceedings of the 2nd International Conference (NiDS 2022). NiDS 2022. Lecture Notes in Networks and Systems, vol 556. Springer, Cham. https://doi.org/10.1007/978-3-031-17601-2_26

DOI: https://doi.org/10.1007/978-3-031-17601-2_26
Published: 23 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17600-5
Online ISBN: 978-3-031-17601-2
eBook Packages: Intelligent Technologies and Robotics (R0)
