DOI: 10.5555/645927.672370
Article

RoadRunner: Towards Automatic Data Extraction from Large Web Sites

Published: 11 September 2001

Abstract

No abstract available.

References

[1] B. Adelberg. NoDoSE - a tool for semiautomatically extracting structured and semistructured data from text documents. In SIGMOD'98.
[2] P. Atzeni and G. Mecca. Cut and Paste. In PODS'97.
[3] P. Atzeni, G. Mecca, and P. Merialdo. To Weave the Web. In VLDB'97.
[4] V. Crescenzi and G. Mecca. Grammars have exceptions. Information Systems, 23(8), 1998.
[5] D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, Y. Ng, D. Quass, and R. D. Smith. A conceptual-modeling approach to extracting data from the web. In ER'98.
[6] E. M. Gold. Language identification in the limit. Information and Control, 10(5), 1967.
[7] E. M. Gold. Complexity of automaton identification from given data. Information and Control, 37(3), 1978.
[8] S. Grumbach and G. Mecca. In search of the lost schema. In ICDT'99.
[9] J. Hammer, H. Garcia-Molina, J. Cho, R. Aranha, and A. Crespo. Extracting semistructured information from the Web. In Proc. of the Workshop on the Management of Semistructured Data, 1997.
[10] C. Hsu and M. Dung. Generating finite-state transducers for semistructured data extraction from the web. Information Systems, 23(8), 1998.
[11] G. Huck, P. Fankhauser, K. Aberer, and E. J. Neuhold. Jedi: Extracting and synthesizing information from the web. In CoopIS'98.
[12] N. Kushmerick. Wrapper induction: Efficiency and expressiveness. Artificial Intelligence, 118, 2000.
[13] N. Kushmerick, D. S. Weld, and R. Doorenbos. Wrapper induction for information extraction. In IJCAI'97.
[14] I. Muslea, S. Minton, and C. A. Knoblock. A hierarchical approach to wrapper induction. In Proc. of Autonomous Agents, 1999.
[15] L. Pitt. Inductive inference, DFAs and computational complexity. In K. P. Jantke, editor, Analogical and Inductive Inference, Lecture Notes in AI 397. Springer-Verlag, Berlin, 1989.
[16] B. A. Ribeiro-Neto, A. Laender, and A. Soares da Silva. Extracting semistructured data through examples. In CIKM'99.
[17] A. Sahuguet and F. Azavant. Web ecology: Recycling HTML pages as XML documents using W4F. In WebDB'99.
[18] S. Soderland. Learning information extraction rules for semistructured and free text. Machine Learning, 34(1-3), 1999.
[19] P. H. Winston. Artificial Intelligence. Addison-Wesley, 1979.

Published In

VLDB '01: Proceedings of the 27th International Conference on Very Large Data Bases
September 2001
709 pages

Publisher

Morgan Kaufmann Publishers Inc.

San Francisco, CA, United States

Qualifiers

  • Article

Conference

VLDB01

Cited By

  • (2021) The smallest extraction problem. Proceedings of the VLDB Endowment 14:11 (2445-2458). DOI: 10.14778/3476249.3476293. Online publication date: 27-Oct-2021
  • (2021) Web question answering with neurosymbolic program synthesis. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (328-343). DOI: 10.1145/3453483.3454047. Online publication date: 19-Jun-2021
  • (2020) FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (1092-1102). DOI: 10.1145/3394486.3403153. Online publication date: 23-Aug-2020
  • (2020) A browserless architecture for extracting web prices. Proceedings of the 35th Annual ACM Symposium on Applied Computing (2193-2200). DOI: 10.1145/3341105.3373850. Online publication date: 30-Mar-2020
  • (2020) T-REX. Proceedings of the 29th ACM International Conference on Information & Knowledge Management (2073-2076). DOI: 10.1145/3340531.3412133. Online publication date: 19-Oct-2020
  • (2020) Online Programming Education Modeling and Knowledge Tracing. Knowledge Science, Engineering and Management (259-270). DOI: 10.1007/978-3-030-55130-8_23. Online publication date: 28-Aug-2020
  • (2019) Visual Segmentation for Information Extraction from Heterogeneous Visually Rich Documents. Proceedings of the 2019 International Conference on Management of Data (247-262). DOI: 10.1145/3299869.3319867. Online publication date: 25-Jun-2019
  • (2018) CERES. Proceedings of the VLDB Endowment 11:10 (1084-1096). DOI: 10.14778/3231751.3231758. Online publication date: 1-Jun-2018
  • (2018) Towards extracting web API specifications from documentation. Proceedings of the 15th International Conference on Mining Software Repositories (454-464). DOI: 10.1145/3196398.3196411. Online publication date: 28-May-2018
  • (2018) Navigating the Data Lake with DATAMARAN. Proceedings of the 2018 International Conference on Management of Data (943-958). DOI: 10.1145/3183713.3183746. Online publication date: 27-May-2018
