STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model

May 1, 2025
Yuang Qi, Na Zhao, Qiyi Yao, Benlong Wu, Weiming Zhang, Nenghai Yu, Kejiang Chen
Abstract
Recent provably secure linguistic steganography (PSLS) methods rely on mainstream autoregressive language models (ARMs) to address a historically challenging task: disguising covert communication as "innocuous" natural language communication. However, because ARMs generate text sequentially, stegotext produced by ARM-based PSLS methods suffers severe error propagation once any part of it is altered, rendering existing methods unusable under active tampering attacks. To address this, we propose a robust provably secure linguistic steganography method built on diffusion language models (DMs). Unlike ARMs, DMs generate text in a partially parallel manner, which allows us to identify robust positions for steganographic embedding that can be combined with error-correcting codes. Furthermore, we introduce error correction strategies, including pseudo-random error correction and neighborhood search correction, during steganographic extraction. Theoretical proofs and experimental results demonstrate that our method is both secure and robust: it can resist token ambiguity in stegotext segmentation and, to some extent, withstand token-level insertion, deletion, and substitution attacks.
Type
Publication
Submitted to the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS)
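To make the robustness claim in the abstract concrete, below is a minimal, self-contained sketch of the general idea of protecting embedded bits with an error-correcting code before they are placed into the stegotext. This is not the paper's implementation: the Hamming(7,4) code and the helper names `encode_block` and `decode_block` are illustrative assumptions, standing in for whatever code and robust embedding positions STEAD actually uses. The sketch shows why a receiver can still recover the hidden message after a token-level attack flips one embedded bit.

```python
# Illustrative sketch (not the paper's method): a Hamming(7,4) code
# corrects any single-bit error per 7-bit block, so a few tampered
# tokens need not destroy the extracted message.

def encode_block(data4):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = data4
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode_block(code7):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(code7)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based error position, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]

# Sender: protect the hidden bits before embedding them in the stegotext.
message = [1, 0, 1, 1]
codeword = encode_block(message)

# Attacker: a token-level substitution corrupts one embedded bit.
tampered = list(codeword)
tampered[3] ^= 1

# Receiver: extraction still recovers the original message.
assert decode_block(tampered) == message
```

In a full system, the code rate and block length would be chosen to match the expected tamper rate, and, per the abstract, the pseudo-random and neighborhood search corrections would operate on top of such a code during extraction.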