Temperature-Based Watermarking and Detection for Large Language Models

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15251)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

As Large Language Models (LLMs) are widely deployed, protecting the copyright of generated content and preventing its misuse have become important. This paper proposes a temperature-based watermark embedding algorithm that embeds watermarks in generated text using the Softmax function and polynomial sampling. It also presents a watermark detection technique based on statistical testing, which identifies and verifies watermarks embedded in text. Applying these techniques to several LLM families and computing environments, including the OPT series, the Llama series, the BLOOM series, and GPT-2, the paper analyses the application scenarios, evaluates the key algorithm parameters, and proposes solutions that integrate watermarks without compromising model performance or the naturalness of the generated text.
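
The abstract names two building blocks, a temperature-scaled Softmax with sampling for embedding and a statistical test for detection, but does not say how the paper combines them. The following is only a minimal sketch of those two generic pieces, not the authors' algorithm. It assumes that "polynomial sampling" refers to drawing a token from the categorical (multinomial) distribution produced by the Softmax, and that detection can be illustrated with a one-proportion z-test. The function names, the baseline rate `p0`, and the notion of counting "hits" are all hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(probs, rng):
    """Draw one token index from the categorical distribution given by probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def z_score(hits, total, p0):
    """One-proportion z-statistic (hypothetical detector).

    hits  : tokens that match the assumed watermark pattern
    total : tokens examined
    p0    : assumed match rate in unwatermarked text
    """
    return (hits - p0 * total) / math.sqrt(total * p0 * (1 - p0))


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]
    rng = random.Random(0)
    for t in (1.0, 0.5):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs], sample_token(probs, rng))
    # 75 matches in 100 tokens against an assumed 50% baseline
    print(z_score(75, 100, 0.5))  # → 5.0
```

In this sketch a large positive z-score would be read as evidence that the text carries the assumed pattern. The decision threshold, and the way temperature actually encodes the watermark, are details the abstract leaves open.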

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. U23A20307, No. 62272118). We extend our gratitude to the Foundation for their financial support, our collaborators and team members for their contributions, and the editors and reviewers for their valuable feedback.

Author information

Authors and Affiliations

Institute of Artificial Intelligence, Guangzhou University, Guangzhou, China
Weitong Chen, Zhengdao Li & Shanshan Huang
School of Cyber Engineering, Xidian University, Xi’an, China
Zhenxin Zhang
School of Cyberspace Security, Guangzhou University, Guangzhou, China
Huali Ren
School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Pei-Gen Ye

Corresponding author

Correspondence to Pei-Gen Ye.

Editor information

Editors and Affiliations

City University of Macau, Macau, China
Tianqing Zhu
Guangzhou University, Guangzhou, China
Jin Li
University of Salerno, Fisciano, Italy
Aniello Castiglione

About this paper

Cite this paper

Chen, W., Zhang, Z., Ren, H., Ye, P.-G., Li, Z., Huang, S. (2025). Temperature-Based Watermarking and Detection for Large Language Models. In: Zhu, T., Li, J., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2024. Lecture Notes in Computer Science, vol 15251. Springer, Singapore. https://doi.org/10.1007/978-981-96-1525-4_18

DOI: https://doi.org/10.1007/978-981-96-1525-4_18
Published: 17 February 2025
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-1524-7
Online ISBN: 978-981-96-1525-4
eBook Packages: Computer Science, Computer Science (R0)
