Computer Science > Machine Learning

arXiv:2403.13027 (cs)

[Submitted on 19 Mar 2024]

Title: Towards Better Statistical Understanding of Watermarking LLMs

Authors: Zhongze Cai, Shang Liu, Hanzhao Wang, Huaiyang Zhong, Xiaocheng Li

Abstract: In this paper, we study the problem of watermarking large language models (LLMs). We consider the trade-off between model distortion and detection ability and formulate it as a constrained optimization problem based on the green-red algorithm of Kirchenbauer et al. (2023a). We show that the optimal solution to this optimization problem has a simple analytical form, which gives a better understanding of the watermarking process and guides the algorithm design. In light of this formulation, we develop an online dual gradient ascent watermarking algorithm and prove its asymptotic Pareto optimality between model distortion and detection ability. This result explicitly guarantees an increased average green-list probability, and hence detection ability, in contrast to previous results. Moreover, we provide a systematic discussion of the choice of model distortion metric for the watermarking problem. We justify our choice of KL divergence and identify issues with the existing criteria of "distortion-free" and perplexity. Finally, we empirically evaluate our algorithms on extensive datasets against benchmark algorithms.
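
The trade-off named in the abstract can be illustrated with a toy example. The sketch below is not the paper's online dual gradient ascent algorithm; it only applies a green-red style logit shift to a single made-up next-token distribution and reports how the green-list probability and the KL divergence grow together as the shift increases. The logits, the green list, the values of delta, and the direction of the KL divergence (watermarked relative to original) are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def watermark(logits, green, delta):
    """Add delta to the logits of green-list tokens, then renormalize."""
    shifted = [x + delta if i in green else x for i, x in enumerate(logits)]
    return softmax(shifted)

def kl_divergence(q, p):
    """KL(q || p) for two distributions over the same vocabulary."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def green_mass(probs, green):
    """Total probability assigned to green-list tokens."""
    return sum(probs[i] for i in green)

# Hypothetical 5-token vocabulary and green list (illustrative values only).
logits = [2.0, 1.0, 0.5, 0.0, -1.0]
green = {1, 3}
p = softmax(logits)

results = []
for delta in (0.0, 1.0, 2.0, 4.0):
    q = watermark(logits, green, delta)
    results.append((delta, green_mass(q, green), kl_divergence(q, p)))

for delta, mass, kl in results:
    print(f"delta={delta:.1f}  green probability={mass:.3f}  KL={kl:.4f}")
```

A larger shift makes green tokens more likely, which helps detection, but moves the watermarked distribution further from the original model; a constrained formulation of the kind the abstract describes chooses the shift subject to a limit on that distortion.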

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Information Theory (cs.IT); Machine Learning (stat.ML)
Cite as:	arXiv:2403.13027 [cs.LG]
	(or arXiv:2403.13027v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.13027

Submission history

From: Shang Liu
[v1] Tue, 19 Mar 2024 01:57:09 UTC (1,511 KB)
