DOI: 10.1145/1277741.1277836
Article

Broad expertise retrieval in sparse data environments

Published: 23 July 2007

Abstract

Expertise retrieval has been largely unexplored on data other than the W3C collection. At the same time, many intranets of universities and other knowledge-intensive organisations offer examples of relatively small but clean multilingual expertise data, covering broad ranges of expertise areas. We first present two main expertise retrieval tasks, along with a set of baseline approaches based on generative language modeling, aimed at finding expertise relations between topics and people. For our experimental evaluation, we introduce (and release) a new test set based on a crawl of a university site. Using this test set, we conduct two series of experiments. The first is aimed at determining the effectiveness of baseline expertise retrieval methods applied to the new test set. The second is aimed at assessing refined models that exploit characteristic features of the new test set, such as the organizational structure of the university, and the hierarchical structure of the topics in the test set. Expertise retrieval models are shown to be robust with respect to environments smaller than the W3C collection, and current techniques appear to be generalizable to other settings.


Published In

SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
July 2007
946 pages
ISBN:9781595935977
DOI:10.1145/1277741

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. expert finding
  2. expertise search
  3. intranet search
  4. language models

Qualifiers

  • Article

Conference

SIGIR07
SIGIR07: The 30th Annual International SIGIR Conference
July 23 - 27, 2007
Amsterdam, The Netherlands

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
