Robust and Efficient Watermarking of Large Language Models Using Error Correction Codes
Authors: Xiaokun Luan (Peking University), Zeming Wei (Peking University), Yihao Zhang (Peking University), Meng Sun (Peking University)
Volume: 2025
Issue: 4
Pages: 183–200
DOI: https://doi.org/10.56553/popets-2025-0126
Abstract: Large language models (LLMs) have demonstrated remarkable performance across various tasks, but they also face challenges in intellectual property (IP) protection. Traditional training-based watermarking techniques are computationally expensive, while function-invariant transformations (FITs) offer a lightweight alternative. Nevertheless, FIT-based watermarking methods are vulnerable to adaptive attacks, in which adversaries exploit the same transformations to remove or forge watermarks. We propose a novel white-box watermarking scheme that combines error correction codes (ECCs) with weight permutations. By encoding model identifiers using ECCs, our approach guarantees reliable watermark extraction under various attacks. Additionally, we develop a linear assignment-based extraction algorithm to improve extraction efficiency. Evaluations on six LLMs show that our method offers robust watermarking capabilities, with minimal impact on model performance while effectively defending against removal and forgery attacks. Overall, our approach provides a scalable and secure solution for safeguarding the copyrights of LLMs.
Keywords: watermarking, error correction code, large language models
Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.
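
The sketch below is a minimal, hedged illustration of the idea summarized in the abstract: a model identifier is encoded with an error correction code, the codeword is embedded as a permutation of weight-matrix rows, and extraction recovers that permutation by solving a linear assignment problem before decoding. The repetition code, the pairwise-swap bit-to-permutation mapping, the matrix sizes, and the noise attack are assumptions made for brevity; they do not reproduce the paper's actual construction.

```python
# Illustrative sketch only (not the authors' implementation): ECC-encode an
# identifier, embed it as a row permutation, recover it via linear assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

def ecc_encode(bits, r=3):
    """Toy repetition code: repeat each identifier bit r times."""
    return np.repeat(bits, r)

def ecc_decode(codeword, r=3):
    """Majority-vote decoding of the repetition code."""
    return (codeword.reshape(-1, r).sum(axis=1) > r // 2).astype(int)

def bits_to_permutation(codeword, n):
    """Map codeword bits to a permutation of n rows: bit i swaps rows 2i and 2i+1."""
    perm = np.arange(n)
    for i, b in enumerate(codeword):
        if b:
            perm[2 * i], perm[2 * i + 1] = perm[2 * i + 1], perm[2 * i]
    return perm

def permutation_to_bits(perm, k):
    """Read each bit back from the relative order of rows 2i and 2i+1."""
    return np.array([1 if perm[2 * i] > perm[2 * i + 1] else 0 for i in range(k)])

# Embedding: permute the rows of a stand-in weight matrix by the ECC codeword.
identifier = np.array([1, 0, 1, 1])            # model identifier bits (assumed)
codeword = ecc_encode(identifier)               # 12 coded bits
W = rng.normal(size=(2 * len(codeword), 64))    # stand-in weight matrix
perm = bits_to_permutation(codeword, W.shape[0])
W_marked = W[perm]

# A simple removal attempt: small additive noise on the watermarked weights.
W_suspect = W_marked + 0.01 * rng.normal(size=W_marked.shape)

# Extraction: match original rows to suspect rows with a linear assignment
# solver, invert the matching to recover the embedded permutation, then decode.
cost = np.linalg.norm(W[:, None, :] - W_suspect[None, :, :], axis=-1)
_, col_ind = linear_sum_assignment(cost)        # col_ind[i]: suspect row matched to original row i
recovered_perm = np.argsort(col_ind)            # invert the matching
recovered_id = ecc_decode(permutation_to_bits(recovered_perm, len(codeword)))
assert np.array_equal(recovered_id, identifier)
```

With a clean or mildly perturbed model, the assignment step recovers the permutation exactly and the ECC decoder corrects any residual bit flips; this is only meant to make the abstract's pipeline concrete.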
