Research article. DOI: 10.1145/1378773.1378796

Transcendence: enabling a personal view of the deep web

Published: 13 January 2008

Abstract

A wealth of structured, publicly available information exists in the deep web but is accessible only by querying web forms. As a result, users are restricted by the interfaces provided and lack a convenient mechanism for expressing novel, independent extractions and queries on the underlying data. Transcendence provides personalized access to the deep web by letting users partially reconstruct web databases in order to perform new types of queries. From just a few examples, Transcendence helps users produce a large number of values for form input fields by using unsupervised information extraction and collaborative filtering of user suggestions. Structural and semantic analysis of the returned pages finds individual results and identifies relevant fields. Users may revise automated decisions, balancing the power of automation against the errors it can introduce. In a user evaluation, both programmers and non-programmers found Transcendence to be a powerful way to explore deep web resources and wanted to use it in the future.
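
The abstract does not say how the structural analysis works, so the following is only a rough illustration of the general idea, not the method used in Transcendence. It assumes that individual results on a returned page tend to share the same root-to-element tag path, and it reports paths that repeat. The names `PathCounter`, `candidate_record_paths`, and `min_repeats` are hypothetical and chosen for this sketch.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag (a partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCounter(HTMLParser):
    """Count how often each root-to-element tag path occurs in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []          # tags that are currently open
        self.counts = Counter()  # tag path -> number of occurrences

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        self.counts[path] += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def candidate_record_paths(html, min_repeats=3):
    """Return tag paths seen at least min_repeats times, most frequent first."""
    parser = PathCounter()
    parser.feed(html)
    parser.close()
    return sorted(
        (path for path, n in parser.counts.items() if n >= min_repeats),
        key=lambda p: (-parser.counts[p], p),
    )


if __name__ == "__main__":
    page = (
        "<html><body><h1>Results</h1><ul>"
        "<li><b>A</b></li><li><b>B</b></li><li><b>C</b></li>"
        "</ul></body></html>"
    )
    print(candidate_record_paths(page))
```

In this toy page the repeated `li` path marks the individual results and the nested `b` path marks a field inside each result. A real system would also need the semantic analysis and user corrections that the abstract mentions.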

    Published In

    IUI '08: Proceedings of the 13th international conference on Intelligent user interfaces
    January 2008
    458 pages
    ISBN:9781595939876
    DOI:10.1145/1378773
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep web
    2. information extraction
    3. web forms

    Acceptance Rates

    Overall Acceptance Rate 746 of 2,811 submissions, 27%
