short-paper

Finding additional semantic entity information for search engines

Authors:

Jinglan Zhang

ADCS '12: Proceedings of the Seventeenth Australasian Document Computing Symposium

Pages 115 - 122

https://doi.org/10.1145/2407085.2407101

Published: 05 December 2012

Abstract

Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities, or information about specific entities, instead of documents. In this paper, we study the problem of finding entity-related information, in the form of attribute-value pairs, which plays a significant role in searching for target entities. We propose a novel decomposition framework that combines reduced relations with a discriminative model, the Conditional Random Field (CRF), to automatically find entity-related attribute-value pairs in free-text documents. The framework allows us to locate potential text fragments and identify their hidden semantics, in the form of attribute-value pairs, for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches because it effectively integrates syntactic and semantic features.
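
To make the notion of attribute-value pairs concrete, the following is a minimal sketch of how a token-level label sequence, such as one a CRF tagger might emit, could be decoded into pairs. It is an illustration only and not the paper's implementation: the BIO label set (B-ATTR, I-ATTR, B-VAL, I-VAL, O), the function name decode_pairs, and the rule of pairing each attribute with the next value span are all assumptions made here.

```python
from typing import List, Optional, Tuple


def decode_pairs(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-labelled tokens into (attribute, value) pairs.

    Hypothetical label scheme (an assumption, not taken from the paper):
    B-ATTR/I-ATTR mark attribute tokens, B-VAL/I-VAL mark value tokens,
    and O marks every other token.
    """
    spans: List[Tuple[str, str]] = []  # (kind, text) in order of appearance
    kind: Optional[str] = None
    buffer: List[str] = []

    def flush() -> None:
        # Close the span currently being built, if there is one.
        nonlocal kind, buffer
        if kind is not None:
            spans.append((kind, " ".join(buffer)))
        kind, buffer = None, []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            flush()
            kind, buffer = label[2:], [token]
        elif label.startswith("I-") and kind == label[2:]:
            buffer.append(token)
        else:
            # "O" or an inconsistent I- label ends the current span.
            flush()
    flush()

    # Assumed pairing rule: each attribute is matched with the next value span.
    pairs: List[Tuple[str, str]] = []
    pending_attr: Optional[str] = None
    for span_kind, text in spans:
        if span_kind == "ATTR":
            pending_attr = text
        elif span_kind == "VAL" and pending_attr is not None:
            pairs.append((pending_attr, text))
            pending_attr = None
    return pairs


tokens = ["The", "capital", "of", "Australia", "is", "Canberra"]
labels = ["O", "B-ATTR", "O", "O", "O", "B-VAL"]
print(decode_pairs(tokens, labels))
```

In this sketch the labels are supplied by hand; in a real pipeline they would come from a trained sequence model, and the pairing rule would likely need to be more careful than nearest-following-value.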

Cited By

Trotman A, Cunningham S, Sitbon L (2012). The seventeenth australasian document computing symposium. ACM SIGIR Forum 47:1 (17-21). DOI: 10.1145/2492189.2492193. Online publication date: 7-Jun-2012.
https://dl.acm.org/doi/10.1145/2492189.2492193

Index Terms

Finding additional semantic entity information for search engines
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources

Information & Contributors

Information

Published In

ADCS '12: Proceedings of the Seventeenth Australasian Document Computing Symposium

December 2012

142 pages

ISBN:9781450314114

DOI:10.1145/2407085

Conference Chair: Andrew Trotman, University of Otago

Program Chairs: Sally Jo Cunningham, University of Waikato; Laurianne Sitbon, Queensland University of Technology

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Dept. of Information Science, Univ. of Otago: Department of Information Science, University of Otago, Dunedin, New Zealand

In-Cooperation

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

ADCS '12

Sponsor:

Dept. of Information Science, Univ. of Otago

ADCS '12: The Seventeenth Australasian Document Computing Symposium

December 5 - 6, 2012

Dunedin, New Zealand

Acceptance Rates

Overall Acceptance Rate 30 of 57 submissions, 53%

Bibliometrics

Total Citations: 1
Total Downloads: 118
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Jan 2025
