Abstract
We consider peer-to-peer data management systems (PDMS), where each peer maintains mappings between its schema and those of some acquaintances, along with social links with peer friends. In this context, we deal with reformulating conjunctive queries from a peer's schema into other peers' schemas. Precisely, queries against a peer node are rewritten into queries against other nodes using schema mappings, thus obtaining query rewritings. Unfortunately, not all the obtained rewritings are relevant to a given query, as the information gain may be negligible or the peer may not be worth exploring. On the other hand, the existence of social links with peer friends might be useful to get relevant rewritings. Therefore, we propose a new notion of 'relevance' of a query with respect to a mapping that encompasses both a local relevance (the relevance of the query w.r.t. the mapping) and a global relevance (the relevance of the query w.r.t. the entire network). Based on this notion, we have conceived a new query reformulation approach for social PDMS which achieves high accuracy and flexibility. To this purpose, we combine several techniques: (i) social links are expressed as FOAF (Friend of a Friend) links to characterize peers' friendship; (ii) concise mapping summaries are used to obtain mapping descriptions; (iii) local semantic views (LSV) are special views that contain information about mappings captured from the network by using gossiping techniques. Our experimental evaluation, based on a prototype built on top of PeerSim and on a simulated network, demonstrates that our solution yields greater recall than traditional query translation approaches proposed in the literature.
Notes
1. This example has been inspired by a web-based online trusted physician network, https://www.ozmosis.com/home, 'where good doctors go to become great doctors'.
2. Notice that the relevance of a mapping is useful to discriminate the importance of that mapping w.r.t. other mappings. Indeed, the relevance is a criterion to rank mappings in order to choose the best ones towards which the query has to be translated.
3. Notice that a friendship link between two peers is a symmetric relationship and does not imply that the two peers are acquaintances of each other.
4. Notice that we do not tackle the problem of merging results after applying the query rewriting to the acquaintance's database; we simply take the union of these results.
5. Notice that we also do not assume the existence of the mapping \(\mathcal{M}_{ji}\), from peer \(p_j\) to peer \(p_i\).
6. The mapping statements have been omitted from Fig. 4 to avoid clutter.
7. In DHTs, or structured P2P networks, on which PDMS are based, a unique key identifier is assigned to each peer and to each object. The IDs associated with objects are mapped through the DHT protocol to the peer responsible for that object. In our setting, each object is a mapping.
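The key-to-peer assignment described in note 7 can be illustrated with a minimal sketch. It assumes a Chord-style successor rule on a hash ring, which is only one common way a DHT decides which peer is responsible for an object; the chapter does not state which DHT protocol is used. The names `key_id` and `responsible_peer`, the ring size, and the peer and mapping labels are all hypothetical.

```python
from __future__ import annotations

import hashlib
from bisect import bisect_left

RING_BITS = 32  # assumed size of the identifier space (illustrative only)


def key_id(name: str) -> int:
    """Hash a peer name or a mapping label onto the identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def responsible_peer(object_name: str, peers: list[str]) -> str:
    """Return the peer whose ID is the first at or after the object's ID.

    If the object's ID is larger than every peer ID, the lookup wraps
    around to the peer with the smallest ID (successor rule on a ring).
    """
    if not peers:
        raise ValueError("the network has no peers")
    ring = sorted((key_id(p), p) for p in peers)
    ids = [pid for pid, _ in ring]
    idx = bisect_left(ids, key_id(object_name)) % len(ring)
    return ring[idx][1]


# Hypothetical peers; the stored object is a mapping, as in note 7.
peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
owner = responsible_peer("mapping:p1->p2", peers)
```

Because the owner depends only on the hashed identifiers, any peer that knows the current membership can compute where a given mapping is stored without contacting a central index.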
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bonifati, A., Summa, G., Pacitti, E., Draidi, F. (2014). Query Reformulation in PDMS Based on Social Relevance. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XIII. Lecture Notes in Computer Science, vol 8420. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54426-2_3
DOI: https://doi.org/10.1007/978-3-642-54426-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54425-5
Online ISBN: 978-3-642-54426-2
eBook Packages: Computer Science (R0)