DOI: 10.1145/1242572.1242677
Article

Hierarchical, perceptron-like learning for ontology-based information extraction

Published: 08 May 2007

Abstract

Recent work on ontology-based Information Extraction (IE) has tried to use knowledge from the target ontology to improve semantic annotation results. However, very few approaches exploit the ontology structure itself, and those that do have some limitations. This paper introduces a hierarchical learning approach for IE that uses the target ontology as an essential part of the extraction process by taking into account the relations between concepts. The approach is evaluated on the largest available semantically annotated corpus. The results clearly demonstrate the benefits of using knowledge from the ontology as input to the information extraction process. We also demonstrate the advantages of our approach over other state-of-the-art learning systems on a commonly used benchmark dataset.

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN:9781595936547
DOI:10.1145/1242572
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hierarchical learning
  2. ontology-based information extraction
  3. semantic annotation

Qualifiers

  • Article

Conference

WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 3
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 13 Dec 2024

Cited By

  • (2020) Semantic analysis on social networks: A survey. International Journal of Communication Systems 33:11. DOI: 10.1002/dac.4424. Online publication date: 16-Apr-2020
  • (2018) Mitigating Risks by Weighting Intangibles when Investing in Renewables. 2018 7th International Conference on Renewable Energy Research and Applications (ICRERA), pages 582-593. DOI: 10.1109/ICRERA.2018.8566916. Online publication date: Oct-2018
  • (2017) Semantic annotation in historical documents. 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), pages 1-7. DOI: 10.23919/CISTI.2017.7975968. Online publication date: Jun-2017
  • (2017) Gamification to support programming learning. 2017 12th Iberian Conference on Information Systems and Technologies (CISTI), pages 1-6. DOI: 10.23919/CISTI.2017.7975788. Online publication date: Jun-2017
  • (2016) Semantic Annotation for Supporting Context-Aware Information Retrieval in the Transportation Project Environmental Review Domain. Journal of Computing in Civil Engineering 30:6. DOI: 10.1061/(ASCE)CP.1943-5487.0000565. Online publication date: Nov-2016
  • (2016) Semantic social media analysis of Chinese tourists in Switzerland. Information Technology & Tourism 17:2, pages 183-202. DOI: 10.1007/s40558-016-0066-z. Online publication date: 25-Oct-2016
  • (2015) Context-based weighting for vector space model to evaluate the relation between concept and context in information storage and retrieval system. 2015 International Conference on Computer, Communication and Control (IC4), pages 1-5. DOI: 10.1109/IC4.2015.7375682. Online publication date: Sep-2015
  • (2015) Global machine learning for spatial ontology population. Web Semantics: Science, Services and Agents on the World Wide Web 30:C, pages 3-21. DOI: 10.1016/j.websem.2014.06.001. Online publication date: 1-Jan-2015
  • (2015) A Discovery Method of Anteroposterior Correlation for Big Data Era. Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, pages 161-177. DOI: 10.1007/978-3-319-10389-1_12. Online publication date: 2015
  • (2014) Semantic Context-Dependent Weighting for Vector Space Model. Proceedings of the 2014 IEEE International Conference on Semantic Computing, pages 262-266. DOI: 10.1109/ICSC.2014.49. Online publication date: 16-Jun-2014
