Abstract
A query over RDF data is usually expressed in terms of matching between a graph representing the target and a huge graph representing the source. Unfortunately, graph matching is typically performed in terms of subgraph isomorphism, which makes semantic data querying a hard problem. In this paper we illustrate a novel technique for querying RDF data in which the answers are built by combining paths of the underlying data graph that align with paths specified by the query. The approach is approximate and generates the combinations of the paths that best align with the query. We show that, in this way, the complexity of the overall process is significantly reduced and verify experimentally that our framework exhibits an excellent behavior with respect to other approaches in terms of both efficiency and effectiveness.
Similar content being viewed by others
Notes
A prototype application is available at https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa.
A prototype application is available at https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa.
At https://www.dropbox.com/sh/d5u1u24qnyqg18f/7Oefq8-qVa you can find the complete set of queries.
References
De Virgilio, R., Giunchiglia, F., Tanca, L. (eds.): Semantic Web Information Management—A Model-Based Perspective. Springer, Berlin (2010)
De Virgilio, R., Guerra, F., Velegrakis, Y. (eds.): Semantic Search Over the Web. Springer, Berlin, Heidelberg (2012)
De Virgilio, R., Orsi, G., Tanca, L., Torlone, R.: Nyaya: A system supporting the uniform management of large sets of semantic data. In: ICD., pp. 1309–1312. (2012)
Bröcheler, M., Pugliese, A., Subrahmanian, V.S.: Dogma: A disk-oriented graph matching algorithm for rdf databases. In: ISWC, pp. 97–113. (2009)
Fan, W., Li, J., Ma, S., Tang, N., Wu, Y., Wu, Y.: Graph pattern matching: from intractable to polynomial time. Proc. VLDB Endow. 3(1), 264–275 (2010)
Zhang, S., Yang, J., Jin, W.: Sapper: subgraph indexing and approximate matching in large graphs. Proc. VLDB Endow. 3(1), 1185–1194 (2010)
Wood, P.T.: Query languages for graph databases. SIGMOD Rec. 41(1), 50–60 (2012)
Gallagher, B.: Matching structure and semantics : A survey on graph-based pattern matching. In: Artificial Intelligence, pp. 45–53. (2006)
Zhang, S., Li, S., Yang, J.: Gaddi: distance index based subgraph matching in biological networks. In: EDBT, pp. 192–203. (2009)
Khan, A., Li, N., Yan, X., Guan, Z., Chakraborty, S., Tao, S.: Neighborhood based fast graph search in large networks. In: SIGMOD, pp. 901–912. (2011)
Iordanov, B.: Hypergraphdb: A generalized graph database. In: WAIM Workshops, pp. 25–36. (2010)
Fellbaum, C. (ed.): WordNet An Electronic Lexical Database. The MIT Press, Cambridge (1998)
Hassanzadeh, O., Consens, M.P.: Linked movie data base (triplification challenge report). In: I-SEMANTICS, pp. 194–196 (2008)
Bizer, C., Schultz, A.: The Berlin sparql benchmark. Int. J. Semant. Web. Inf. Syst. 5(2), 1–24 (2009)
Guo, Y., Pan, Z., Heflin, J.: Lubm: a benchmark for owl knowledge base systems. J. Web. Semant. 3(2–3), 158–182 (2005)
Ma, L., Yang, Y., Qiu, Z., Xie, G.T., Pan, Y., Liu, S.: Towards a complete owl ontology benchmark. In: ESWC, pp. 125–139. (2006)
Cappellari, P., De Virgilio, R., Maccioni, A., Roantree, M.: A path-oriented rdf index for keyword search query processing. In: DEXA, pp. 366–380. (2011)
Zou, L., Chen, L., Özsu, M.T.: Distance-join: pattern match query in a large graph database. Proc. VLDB Endow. 2(1), 886–897 (2009)
Fan, W., Bohannon, P.: Information preserving xml schema embedding. ACM Trans. Database Syst. 33(1) (2008)
Tran, T., Wang, H., Rudolph, S., Cimiano, P.: Top-k exploration of query candidates for efficient keyword search on graph-shaped (rdf) data. In: ICDE Conference, pp. 405–416 (2009)
Neumann, T., Weikum, G.: x-rdf-3x: fast querying, high update rates, and consistency for rdf databases. Proc. VLDB Endow. 3(1), 256–263 (2010)
Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: SIGMOD, pp. 335–346. (2004)
Zhang, S., Hu, M., Yang, J.: Treepi: A novel graph indexing method. In: ICDE, pp. 966–975. (2007)
Cheng, J., Ke, Y., Ng, W., Lu, A.: Fg-index: towards verification-free query processing on graph databases. In: SIGMOD, pp. 857–872. (2007)
Tian, Y., Patel, J.M.: Tale: A tool for approximate large graph matching. In: ICDE, pp. 963–972. (2008)
Zeng, Z., Tung, A.K.H., Wang, J., Feng, J., Zhou, L.: Comparing stars: on approximating graph edit distance. Proc. VLDB Endow. 2(1), 25–36 (2009)
Jin, R., Xiang, Y., Ruan, N., Fuhry, D.: 3-hop: a high-compression indexing scheme for reachability query. In: SIGMOD, pp. 813–826. (2009)
Poulovassilis, A., Wood, P.T.: Combining approximation and relaxation in semantic web path queries. In: ISWC, pp. 631–646. (2010)
Chan, E.P.F., Lim, H.: Optimization and evaluation of shortest path queries. VLDB J. 16(3), 343–369 (2007)
Hu, W., Jian, N., Qu, Y., Wang, Y.: Gmo: A graph matching for ontologies. In: Integrating Ontologies. (2005)
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by Haixun Wang and Jeffrey Xu Yu.
Rights and permissions
About this article
Cite this article
De Virgilio, R., Maccioni, A. & Torlone, R. Approximate querying of RDF graphs via path alignment. Distrib Parallel Databases 33, 555–581 (2015). https://doi.org/10.1007/s10619-014-7142-1
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10619-014-7142-1