Computer Science > Machine Learning

arXiv:2407.12982 (cs)

[Submitted on 17 Jul 2024 (v1), last revised 18 Oct 2024 (this version, v2)]

Title: Retrieval-Enhanced Machine Learning: Synthesis and Opportunities

Authors: To Eun Kim, Alireza Salemi, Andrew Drozdov, Fernando Diaz, Hamed Zamani

Abstract: In language modeling, models augmented with retrieval components have emerged as a promising way to address several challenges in natural language processing (NLP), including knowledge grounding, interpretability, and scalability. Although the primary focus has been on NLP, we posit that the retrieval-enhancement paradigm can be extended to a broader spectrum of machine learning (ML), such as computer vision, time series prediction, and computational biology. This work therefore introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across ML domains with consistent notation, which is missing from the current literature. We also find that, while a number of studies employ retrieval components to augment their models, they lack integration with foundational Information Retrieval (IR) research. We bridge this gap between seminal IR research and contemporary REML studies by investigating each component of the REML framework. Ultimately, the goal of this work is to equip researchers across disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.

Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as:	arXiv:2407.12982 [cs.LG]
	(or arXiv:2407.12982v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.12982

Submission history

From: To Eun Kim
[v1] Wed, 17 Jul 2024 20:01:21 UTC (305 KB)
[v2] Fri, 18 Oct 2024 18:42:25 UTC (346 KB)
