Article

A formal comparison of visual web wrapper generators

Published: 21 January 2006

Abstract

We study the core fragment of the Elog wrapping language used in the Lixto system (a visual wrapper generator) and formally compare Elog to other wrapping languages proposed in the literature.

Cited By

  • (2006) The Lixto project. Proceedings of the 23rd British National Conference on Databases, conference on Flexible and Efficient Information Handling, pp. 1-15. DOI: 10.1007/11788911_1. Online publication date: 18-Jul-2006

Published In

SOFSEM'06: Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
January 2006
574 pages
ISBN: 354031198X
Editors: Jiří Wiedermann, Gerard Tel, Jaroslav Pokorný, Mária Bieliková, Július Štuller

Sponsors

  • ERCIM: European Research Consortium for Informatics & Mathematics

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article
