STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model

May 1, 2025
Yuang Qi, Na Zhao, Qiyi Yao, Benlong Wu, Weiming Zhang, Nenghai Yu, Kejiang Chen
Abstract
Recent provably secure linguistic steganography (PSLS) methods rely on mainstream autoregressive language models (ARMs) to address a historically challenging task: disguising covert communication as "innocuous" natural language communication. However, because ARMs generate text sequentially, stegotext produced by ARM-based PSLS methods suffers severe error propagation once any part of it is altered, rendering existing methods unusable under active tampering attacks. To address this, we propose a robust provably secure linguistic steganography method built on diffusion language models (DMs). Unlike ARMs, DMs generate text in a partially parallel manner, which allows us to identify robust positions for steganographic embedding that can be combined with error-correcting codes. Furthermore, we introduce error correction strategies, including pseudo-random error correction and neighborhood search correction, during steganographic extraction. Theoretical proofs and experimental results demonstrate that our method is both secure and robust: it can resist token ambiguity in stegotext segmentation and, to some extent, withstand token-level insertion, deletion, and substitution attacks.
Type
Publication
Submitted to the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS)
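To make the robustness claim in the abstract concrete, below is a minimal, self-contained sketch of the general idea of protecting embedded bits with an error-correcting code before they are placed into the stegotext. This is not the paper's implementation: the Hamming(7,4) code and the helper names `encode_block` and `decode_block` are illustrative assumptions, standing in for whatever code and robust embedding positions STEAD actually uses. The sketch shows why a receiver can still recover the hidden message after a token-level attack flips one embedded bit.

```python
# Illustrative sketch (not the paper's method): a Hamming(7,4) code
# corrects any single-bit error per 7-bit block, so a few tampered
# tokens need not destroy the extracted message.

def encode_block(data4):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = data4
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode_block(code7):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(code7)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based error position, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]

# Sender: protect the hidden bits before embedding them in the stegotext.
message = [1, 0, 1, 1]
codeword = encode_block(message)

# Attacker: a token-level substitution corrupts one embedded bit.
tampered = list(codeword)
tampered[3] ^= 1

# Receiver: extraction still recovers the original message.
assert decode_block(tampered) == message
```

In a full system, the code rate and block length would be chosen to match the expected tamper rate, and, per the abstract, the pseudo-random and neighborhood search corrections would operate on top of such a code during extraction.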