Abstract
Rank join operators perform a relational join among two or more relations, assign a numeric score to each join result according to a given scoring function, and return the K join results with the highest scores. The top-K join results are obtained by accessing only a subset of the data in the input relations. This paper addresses the problem of obtaining the top-K join results from two or more search services that can be accessed in parallel and are characterized by non-negligible response times. The objectives are: i) to minimize the time needed to obtain the top-K join results; and ii) to avoid accessing data that does not contribute to the top-K join results.
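To make the idea of answering a top-K join from only a subset of the inputs concrete, the following is a minimal sketch of a generic two-way rank join with an upper-bound stopping test. It is not the operator proposed in this paper. It assumes two inputs already sorted by descending score, a scoring function that simply sums the two scores, and round-robin access; the function name `rank_join_top_k` and the sample data are hypothetical.

```python
import heapq


def rank_join_top_k(left, right, k):
    """Return (top-k join results, number of tuples read).

    Assumptions for illustration: `left` and `right` are lists of
    (join_key, score) sorted by descending score, and a join result
    is scored as the sum of the two input scores.
    """
    inputs = (left, right)
    seen = ([], [])      # tuples read so far from each input
    pos = [0, 0]         # read position in each input
    heap = []            # min-heap holding the best k results found so far

    while pos[0] < len(left) or pos[1] < len(right):
        # Round-robin access: read from the input that is behind,
        # switching sides if that input is exhausted.
        side = 0 if pos[0] <= pos[1] else 1
        if pos[side] >= len(inputs[side]):
            side = 1 - side

        key, score = inputs[side][pos[side]]
        pos[side] += 1
        seen[side].append((key, score))

        # Join the new tuple with everything already read from the other input.
        for other_key, other_score in seen[1 - side]:
            if other_key == key:
                heapq.heappush(heap, (score + other_score, key))
                if len(heap) > k:
                    heapq.heappop(heap)

        # Upper bound on the score of any join result not yet formed:
        # an unseen tuple scores at most the last score read from its input,
        # and its partner scores at most the top score of the other input.
        if seen[0] and seen[1]:
            threshold = max(left[0][1] + seen[1][-1][1],
                            seen[0][-1][1] + right[0][1])
            if len(heap) == k and heap[0][0] >= threshold:
                break

    return sorted(heap, reverse=True), pos[0] + pos[1]


left = [("a", 10), ("b", 9), ("c", 5), ("d", 1)]
right = [("b", 10), ("a", 8), ("d", 3), ("c", 2)]
top, accessed = rank_join_top_k(left, right, k=1)
print(top, accessed)
```

In this toy run the loop stops once the best result found is at least as high as the bound on any unseen combination, so only part of each input is read.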
This paper proposes a multi-way rank join operator that achieves these objectives by using a score-guided data pulling strategy. The strategy minimizes the time to obtain the top-K join results by extracting data in parallel from all Web services, and it avoids accessing data that is not useful for computing the top-K join results by adaptively pausing and resuming data access to the individual Web services, based on the observed score values of the retrieved tuples. An extensive experimental study evaluates the performance of the proposed approach and shows that it minimizes the time to obtain the top-K join results while incurring few extra data accesses compared to state-of-the-art rank join operators.
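The abstract does not spell out the rules used to pause and resume services, so the following is only a hedged illustration of how observed scores could guide such a decision, not the authors' strategy. It assumes a sum scoring function and an upper bound with one term per service: the last score read from that service plus the top scores of all the others. Because each term can only decrease as more tuples are read, only reads from the services whose term currently equals the maximum can lower the bound; in this sketch the remaining services are treated as paused for the round. The function name `services_to_pull` is hypothetical.

```python
def services_to_pull(top_scores, last_scores):
    """Pick the services worth querying in the next parallel round.

    Assumptions for illustration: results are scored by summing one score
    per service; top_scores[i] is the first (highest) score returned by
    service i and last_scores[i] is the most recent (lowest) one.
    """
    total_top = sum(top_scores)
    # Bound term for service i: an unseen tuple from service i (at most
    # last_scores[i]) joined with the best tuple of every other service.
    terms = [total_top - top + last
             for top, last in zip(top_scores, last_scores)]
    threshold = max(terms)
    # Services whose term equals the threshold stay active; the others are
    # paused, since reading more from them cannot lower the threshold.
    active = [i for i, term in enumerate(terms) if term == threshold]
    return threshold, active


threshold, active = services_to_pull(top_scores=[10, 10, 10],
                                     last_scores=[9, 5, 9])
print(threshold, active)
```

Re-evaluating this choice after every round of responses means a paused service is resumed as soon as its term becomes the maximum again, which mirrors the adaptive pausing and resuming described above.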
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abid, A., Tagliasacchi, M. (2011). Parallel Data Access for Multiway Rank Joins. In: Auer, S., Díaz, O., Papadopoulos, G.A. (eds) Web Engineering. ICWE 2011. Lecture Notes in Computer Science, vol 6757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22233-7_4
DOI: https://doi.org/10.1007/978-3-642-22233-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22232-0
Online ISBN: 978-3-642-22233-7
eBook Packages: Computer Science, Computer Science (R0)