Article

Yago: a core of semantic knowledge

Authors:

Fabian M. Suchanek,

Gjergji Kasneci,

Gerhard Weikum

WWW '07: Proceedings of the 16th international conference on World Wide Web

Pages 697 - 706

https://doi.org/10.1145/1242572.1242667

Published: 08 May 2007

Abstract

We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as hasWonPrize). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality, by adding knowledge about individuals such as persons, organizations, and products together with their semantic relationships, and in quantity, by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.
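To make the idea of "entities, relations, and facts" concrete, the following is a minimal sketch of a fact store that mixes an Is-A hierarchy with a non-taxonomic relation. It is not YAGO's implementation: the sample facts, the relation names (`type`, `subClassOf`, `hasWonPrize`), and the helper functions are assumptions chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical facts in (subject, relation, object) form. These are
# illustrative examples, not data taken from the YAGO knowledge base.
FACTS = [
    ("AlbertEinstein", "type", "physicist"),
    ("physicist", "subClassOf", "scientist"),
    ("scientist", "subClassOf", "person"),
    ("AlbertEinstein", "hasWonPrize", "NobelPrize"),
]


def build_index(facts):
    """Index facts by (subject, relation) for direct lookup."""
    index = defaultdict(set)
    for subject, relation, obj in facts:
        index[(subject, relation)].add(obj)
    return index


def superclasses(index, cls):
    """Return every class reachable from cls via subClassOf edges."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in index.get((current, "subClassOf"), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classes_of(index, entity):
    """Return the direct classes of an entity plus all their superclasses."""
    direct = index.get((entity, "type"), set())
    result = set(direct)
    for cls in direct:
        result |= superclasses(index, cls)
    return result


INDEX = build_index(FACTS)
```

In this sketch, `classes_of(INDEX, "AlbertEinstein")` walks the Is-A hierarchy upward from the entity's direct class, while a non-taxonomic fact is a plain lookup such as `INDEX[("AlbertEinstein", "hasWonPrize")]`.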

Index Terms

Yago: a core of semantic knowledge
1. Information systems

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web

May 2007

1382 pages

ISBN:9781595936547

DOI:10.1145/1242572

General Chairs: Carey Williamson (University of Calgary, Canada), Mary Ellen Zurko (IBM, USA)

Program Chairs: Peter Patel-Schneider (Bell Labs Research, USA), Prashant Shenoy (University of Massachusetts at Amherst, USA)

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

ACM: Association for Computing Machinery

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

WWW'07

Sponsor:

ACM

WWW'07: 16th International World Wide Web Conference

May 8 - 12, 2007

Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

Total Citations: 2,430

Total Downloads: 5,140

Downloads (last 12 months): 237

Downloads (last 6 weeks): 20

Reflects downloads up to 03 Mar 2025
