Computer Science > Computation and Language

arXiv:2401.11458 (cs)

[Submitted on 21 Jan 2024 (v1), last revised 2 Jul 2024 (this version, v3)]

Title: Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

Authors: Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin

Abstract: The success of AI assistants based on Large Language Models (LLMs) hinges on Reinforcement Learning from Human Feedback (RLHF) to comprehend and align with user intentions. However, traditional alignment algorithms, such as PPO, are hampered by complex annotation and training requirements. This reliance limits the applicability of RLHF and hinders the development of professional assistants tailored to diverse human preferences. In this work, we introduce Linear Alignment, a novel algorithm that aligns language models with human preferences in a single inference step, eliminating the reliance on data annotation and model training. Linear Alignment incorporates a new parameterization for policy optimization under divergence constraints, which enables the extraction of the optimal policy in closed form and facilitates the direct estimation of the aligned response. Extensive experiments on both general and personalized preference datasets demonstrate that Linear Alignment significantly enhances the performance and efficiency of LLM alignment across diverse scenarios. Our code and dataset are published at this https URL.

Comments:	Accepted by ICML 2024; I'm still preparing a better version
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2401.11458 [cs.CL]
	(or arXiv:2401.11458v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.11458

Submission history

From: Songyang Gao
[v1] Sun, 21 Jan 2024 10:46:23 UTC (173 KB)
[v2] Mon, 6 May 2024 09:30:24 UTC (479 KB)
[v3] Tue, 2 Jul 2024 03:24:29 UTC (1,248 KB)
