research-article

Querying the deep web

Published: 22 March 2010

Abstract

Data stored outside Web pages and accessible from the Web, typically through HTML forms, constitute the so-called Deep Web. Such data are of great value, but difficult to query and search. We survey techniques to optimize query processing on the Deep Web, in a setting where data are represented in the relational model. We illustrate optimizations both at query plan generation time and at runtime, highlighting the role of integrity constraints. We discuss several prototype systems that address the query processing problem.
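
As an illustration of the setting the abstract describes, the following is a minimal sketch of relational data that can only be read through form-like accesses, where certain attributes must be filled in before any rows are returned. It is not the survey's algorithm: the class `FormSource`, the function `extract_reachable`, the relations `concert` and `played_in`, and all data values are hypothetical, and the sketch assumes for simplicity that every value may be used in any input field. It shows the naive strategy of feeding values obtained from one source back into the others until nothing new is found, which is the kind of plan that optimization aims to improve by issuing fewer accesses.

```python
import itertools
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[str, ...]


class FormSource:
    """A relation readable only by binding its input positions (hypothetical)."""

    def __init__(self, name: str, rows: Iterable[Row], input_positions: Tuple[int, ...]):
        self.name = name
        self.rows = list(rows)
        self.input_positions = input_positions
        self.calls = 0  # number of accesses issued, i.e. simulated form submissions

    def access(self, binding: Row) -> List[Row]:
        """Return the rows whose input positions match the given binding."""
        self.calls += 1
        return [
            row for row in self.rows
            if all(row[p] == v for p, v in zip(self.input_positions, binding))
        ]


def extract_reachable(sources: List[FormSource], seeds: Set[str]) -> Dict[str, Set[Row]]:
    """Naively access every source with every known value until a fixpoint."""
    known: Set[str] = set(seeds)
    tried: Set[Tuple[str, Row]] = set()
    facts: Dict[str, Set[Row]] = {s.name: set() for s in sources}
    changed = True
    while changed:
        changed = False
        for source in sources:
            arity = len(source.input_positions)
            # sorted() takes a snapshot, so `known` may grow during the loop.
            for binding in itertools.product(sorted(known), repeat=arity):
                key = (source.name, binding)
                if key in tried:
                    continue
                tried.add(key)
                for row in source.access(binding):
                    if row not in facts[source.name]:
                        facts[source.name].add(row)
                        known.update(row)  # new values can feed later accesses
                        changed = True
    return facts


# Hypothetical data: concert(artist, city) needs an artist as input,
# played_in(city, artist) needs a city as input.
concert = FormSource(
    "concert",
    [("muse", "milan"), ("blur", "rome"), ("oasis", "paris")],
    input_positions=(0,),
)
played_in = FormSource("played_in", [("milan", "blur")], input_positions=(0,))

facts = extract_reachable([concert, played_in], seeds={"muse"})
```

Starting from the single seed value `muse`, the sketch reaches `blur` only indirectly, through the city `milan`. The row about `oasis` is never retrieved, because no access ever yields that value: under access limitations some data can remain unobtainable even though it is stored in the source.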

Published In

EDBT '10: Proceedings of the 13th International Conference on Extending Database Technology
March 2010
741 pages
ISBN: 9781605589459
DOI: 10.1145/1739041

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

EDBT/ICDT '10: EDBT/ICDT '10 joint conference
March 22 - 26, 2010
Lausanne, Switzerland

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%

