Abstract
Billions of RDF triples are currently available on the Web through the Linked Open Data cloud (e.g., DBpedia, LinkedGeoData, and New York Times). Governments, universities, and companies (e.g., BBC, CNN) are also producing huge collections of RDF triples and exchanging them through different serialization formats (e.g., RDF/XML, Turtle, and N-Triples). However, RDF descriptions (i.e., graphs and serializations) are verbose in syntax, often contain redundancies, and can be generated differently even when describing the same resources, which negatively affects their processing. Hence, we propose an approach to clean RDF descriptions and eliminate their redundancies, transforming different descriptions of the same information into one representation that can then be tuned to the target application (information retrieval, compression, etc.). Experimental tests show significant improvements, namely reductions in RDF description loading time and file size.
Notes
- 1.
- 2. We use "disparities" to designate different serializations of the same information.
- 3.
- 4. Following the W3C Recommendation, we consider that each namespace must have a unique prefix.
- 5. DT is a set of datatypes: string, number, date, etc.
- 6. Lang is a set of language tags: @fr, @en, etc.
- 7. \(st_{i}^{+}\), \(u_{i}\), \(p_{i}\), \(bn_{i}\), and \(l_{i}\) represent corresponding extended statements, IRIs, predicates, blank nodes, and literals.
- 8. An unused namespace is a namespace that is mentioned in the serialization file but is not used in any of the statements, i.e., it will not appear in the graph.
- 9. This is comparable to the notion of map function in [4], except that the authors do not consider namespaces.
- 10.
- 11. Available at http://rdfn.sigappfr.org/.
References
Belleau, F., et al.: Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J. Biomed. Inform. 41(5), 706–716 (2008)
Fernández, J.D., et al.: Binary RDF representation for publication and exchange (HDT). J. Web Semant. 19, 22–41 (2013)
Gutierrez, C., et al.: Foundations of semantic web databases. In: PODS 2004, pp. 95–106. ACM (2004)
Gutierrez, C., et al.: Foundations of semantic web databases. J. Comput. Syst. Sci. 77(3), 520–541 (2011)
Hayes, J., Gutierrez, C.: Bipartite graphs as intermediate model for RDF. In: McIlraith, S.A., Plexousakis, D., van Harmelen, F. (eds.) ISWC 2004. LNCS, vol. 3298, pp. 47–61. Springer, Heidelberg (2004)
Jiang, G., et al.: Using semantic web technology to support ICD-11 textual definitions authoring. J. Biomed. Semant. 4, 11 (2013)
Kerzazi, A., et al.: A model-based mediator system for biological data integration. In: Journées Scientifiques en Bio-Informatique, pp. 70–77 (2007)
Kerzazi, A., et al.: A semantic mediation architecture for RDF data integration. In: SWAP, p. 3 (2008)
Longley, D.: RDF dataset normalization (2015). http://json-ld.org/spec/latest/rdf-dataset-normalization/
Nolin, M.-A., et al.: Building an HIV data mashup using Bio2RDF. Briefings Bioinform. 13(1), 98–106 (2012)
Pathak, J., et al.: LexGrid: a framework for representing, storing, and querying biomedical terminologies from simple to sublime. J. Am. Med. Inform. Assoc. 16(3), 305–315 (2009)
Salameh, K., Tekli, J., Chbeir, R.: SVG-to-RDF image Semantization. In: Traina, A.J.M., Traina Jr., C., Cordeiro, R.L.F. (eds.) SISAP 2014. LNCS, vol. 8821, pp. 214–228. Springer, Heidelberg (2014)
Sporny, M., Longley, D.: RDF graph normalization (2013). http://json-ld.org/spec/ED/rdf-graph-normalization/20111016/
Tao, C., et al.: A RDF-base normalized model for biomedical lexical grid. In: The 8th International Semantic Web Conference, p. 2 (2009)
Ticona-Herrera, R., et al.: RDF similarity. Technical report (2015). http://rdfn.sigappfr.org/RDFN-TR-15.pdf
Vrandecic, D., et al.: RDF syntax normalization using XML validation. In: Proceedings of the SemRUs, p. 11 (2009)
Weiss, C., Karras, P., Bernstein, A.: Hexastore: sextuple indexing for semantic web data management. Proc. VLDB Endow. 1(1), 1008–1019 (2008)
Acknowledgments
This work has been partly supported by FINCyT (Fund for Innovation, Science and Technology) of Peru.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ticona-Herrera, R., Tekli, J., Chbeir, R., Laborie, S., Dongo, I., Guzman, R. (2015). Toward RDF Normalization. In: Johannesson, P., Lee, M., Liddle, S., Opdahl, A., Pastor López, Ó. (eds) Conceptual Modeling. ER 2015. Lecture Notes in Computer Science, vol 9381. Springer, Cham. https://doi.org/10.1007/978-3-319-25264-3_19
DOI: https://doi.org/10.1007/978-3-319-25264-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25263-6
Online ISBN: 978-3-319-25264-3
eBook Packages: Computer Science (R0)