- Article, March 2024
Translate-Distill: Learning Cross-Language Dense Retrieval by Translation and Distillation
Abstract: Prior work on English monolingual retrieval has shown that a cross-encoder trained using a large number of relevance judgments for query-document pairs can be used as a teacher to train more efficient, but similarly effective, dual-encoder student ...
- tutorial, July 2023
Neural Methods for Cross-Language Information Retrieval
SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 3430–3431, https://doi.org/10.1145/3539618.3594244
This half day tutorial introduces the participant to the basic concepts underlying neural Cross-Language Information Retrieval (CLIR). It discusses the most common algorithmic approaches to CLIR, focusing on modern neural methods; the history of CLIR; ...
- Article, April 2022
Patapsco: A Python Framework for Cross-Language Information Retrieval Experiments
Abstract: While there are high-quality software frameworks for information retrieval experimentation, they do not explicitly support cross-language information retrieval (CLIR). To fill this gap, we have created Patapsco, a Python CLIR framework. This ...
- Article, April 2022
Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models
- Suraj Nair,
- Eugene Yang,
- Dawn Lawrie,
- Kevin Duh,
- Paul McNamee,
- Kenton Murray,
- James Mayfield,
- Douglas W. Oard
Abstract: The advent of transformer-based models such as BERT has led to the rise of neural ranking models. These models have improved the effectiveness of retrieval systems well beyond that of lexical term matching models such as BM25. While monolingual ...
- Article, April 2022
HC4: A New Suite of Test Collections for Ad Hoc CLIR
Abstract: HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval (CLIR), with Common Crawl News documents in Chinese, Persian, and Russian, topics in English and in the document languages, and graded relevance judgments. New ...
- short-paper, July 2020
Combining Contextualized and Non-contextualized Query Translations to Improve CLIR
SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1581–1584, https://doi.org/10.1145/3397271.3401270
In cross-language information retrieval using probabilistic structured queries (PSQ), translation probabilities from statistical machine translation act as a bridge between the query and document vocabulary. These translation probabilities are typically ...
- research-article, August 2016
Exploring Bilingual Word Vectors for Hindi-English Cross-Language Information Retrieval
ICIA-16: Proceedings of the International Conference on Informatics and Analytics, Article No.: 28, Pages 1–4, https://doi.org/10.1145/2980258.2980310
Today, the internet has become a source of multilingual content. Users are not aware of multiple languages, so language diversity becomes a great barrier for world communication. Cross-Language Information Retrieval (CLIR) provides a solution for ...
- article, January 2016
Dealing with Relevance Ranking in Cross-Lingual Cross-Script Text Reuse
International Journal of Information Retrieval Research (IJIRR-IGI), Volume 6, Issue 1, Pages 16–35, https://doi.org/10.4018/IJIRR.2016010102
Proliferation of multilingual content on the web has paved the way for text reuse to become cross-lingual and also cross-script. Identifying cross-language text reuse becomes tougher if one considers cross-script, less-resourced languages. This paper focuses on ...
- research-article, December 2013
Experiments with query translation and re-ranking methods in Vietnamese-English bilingual information retrieval
SoICT '13: Proceedings of the 4th Symposium on Information and Communication Technology, Pages 118–122, https://doi.org/10.1145/2542050.2542073
Using bilingual dictionaries is a common way for query translation in Cross Language Information Retrieval. In this article, we focus on Vietnamese-English Bilingual Information Retrieval and present algorithms for query segmentation, word ...
- research-article, December 2013
Monolingual and Crosslingual SMS-based FAQ Retrieval
FIRE '12 & '13: Proceedings of the 4th and 5th Annual Meetings of the Forum for Information Retrieval Evaluation, Article No.: 3, Pages 1–6, https://doi.org/10.1145/2701336.2701634
This paper presents results for DCU's second participation in the SMS-based FAQ Retrieval task at FIRE. For FIRE 2012, we submitted runs for the monolingual English and Hindi and the crosslingual English to Hindi subtasks. Compared to our experiments ...
- research-article, February 2012
auroraDL and responding to end-user digital library needs
iConference '12: Proceedings of the 2012 iConference, Pages 444–446, https://doi.org/10.1145/2132176.2132242
This paper reports on functions and extensions to a digital library record creation, search, and collection-building tool, auroraDL, as a response to academic and professional focus groups' interest in digital library/content exploration tools.
- article, February 2011
Query translation-based cross-language print defect diagnosis based on the fuzzy Bayesian model
Journal of Intelligent Manufacturing (SPJIM), Volume 22, Issue 1, Pages 43–55, https://doi.org/10.1007/s10845-009-0274-x
This paper discusses a query-translation based cross-language diagnosis (Q-CLD) for print defects conducted by nonnative English users. The first step involved developing three fuzzy Bayesian models: one based on English descriptions provided by native ...
- Article, August 2010
A New Cross-Language Commodity Information Retrieval Approach in Book Searching
ISME '10: Proceedings of the 2010 International Conference of Information Science and Management Engineering - Volume 01, Pages 411–415, https://doi.org/10.1109/ISME.2010.245
This paper analyzes the basic modes of Cross-Language Information Retrieval (CLIR) and the critical technologies of translation disambiguation. It optimizes translation outcome by eliminating translation ambiguity. The methods are based on co-...
- poster, July 2010
Cross-language retrieval using link-based language models
SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, Pages 773–774, https://doi.org/10.1145/1835449.1835609
We propose a cross-language retrieval model that is solely based on Wikipedia as a training corpus. The main contributions of our work are: 1. A translation model based on linked text in Wikipedia and a term weighting method associated with it. 2. A ...
- Article, September 2009
Ontology-based terminology management for transitive translations focusing on NEs
FDIA'09: Proceedings of the Third BCS-IRSG conference on Future Directions in Information Access, Pages 125–127
I demonstrate that there are two types of transitive translations of Named Entities (NEs), both of which should be handled in the process of Cross Lingual Information Retrieval (CLIR). An official transitive translation is defined as a translation made ...
- article, August 2009
A query-based cross-language diagnosis tool for distributed decision making support
Computers and Industrial Engineering (CINE), Volume 57, Issue 1, Pages 37–45, https://doi.org/10.1016/j.cie.2008.11.020
A query translation-based Korean-English cross-language diagnosis (Q-KE-CLD) tool for assisting Korean users diagnosing print defects was developed and then evaluated as a case study of distributed decision making support for nonnative English users. ...
- poster, July 2009
A graph-based approach to mining multilingual word associations from wikipedia
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, Pages 690–691, https://doi.org/10.1145/1571941.1572080
In this paper, we propose a graph-based approach to constructing a multilingual association dictionary from Wikipedia, in which we exploit two kinds of links in Wikipedia articles to associate multilingual words and concepts together in a graph. The ...
- research-article, July 2009
Addressing morphological variation in alphabetic languages
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, Pages 75–82, https://doi.org/10.1145/1571941.1571957
The selection of indexing terms for representing documents is a key decision that limits how effective subsequent retrieval can be. Often stemming algorithms are used to normalize surface forms, and thereby address the problem of not finding documents ...
- Article, May 2009
Research of Enterprise Competitive Intelligence Collection System Based on Cross-Language Information Retrieval
ISECS '09: Proceedings of the 2009 Second International Symposium on Electronic Commerce and Security - Volume 01, Pages 601–604, https://doi.org/10.1109/ISECS.2009.199
As enterprise competition becomes increasingly global, an enterprise that wants to compete must know not only its own circumstances but also those of its rivals. This need ...
- Article, March 2009
Mining Cross-Lingual/Cross-Cultural Differences in Concerns and Opinions in Blogs
ICCPOL '09: Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy, Pages 213–224, https://doi.org/10.1007/978-3-642-00831-3_20
The goal of this paper is to cross-lingually analyze multilingual blogs collected with a topic keyword. The framework of collecting multilingual blogs with a topic keyword is designed as the blog feed retrieval procedure. Multilingual queries for ...