Publications

You can also find my articles on my Google Scholar profile.

Conference & Journal

Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models

Charles Westphal, Keivan Navaie, Fernando E. Rosas

ICML 2026

A geometry-based steganography scheme that hides secrets in fine-tuned LLM outputs via embedding-space hyperplanes, together with a linear-probe detector that exposes it more reliably than traditional steganalysis.

arXiv

\[\mathcal{X}_* \in \Bigl\{ \mathcal{P} \in \mathscr{P}(\mathcal{X}) \;:\; |\mathcal{P}| = \min_{H(A \mid \mathcal{P}) = H(A \mid \mathcal{X})} |\mathcal{P}| \;\;\&\;\; H(A \mid \mathcal{P}) = H(A \mid \mathcal{X}) \Bigr\}\]

Information-theoretic State Variable Selection for Reinforcement Learning

Charles Westphal, Stephen Hailes, Mirco Musolesi

TMLR 2026

The Transfer Entropy Redundancy Criterion (TERC): an information-theoretic test that provably excludes state variables that have no effect on agent performance, improving sample efficiency across Q-learning, Actor-Critic, and PPO.

arXiv

Feature Selection for Network Intrusion Detection

Charles Westphal, Stephen Hailes, Mirco Musolesi

SIGKDD 2025

An information-theoretic feature selection method (FSNID) that drops uninformative inputs for network intrusion detection while preserving classifier performance.

arXiv GitHub

Partial Information Decomposition for Data Interpretability and Feature Selection

Charles Westphal, Stephen Hailes, Mirco Musolesi

AISTATS 2025

PIDF replaces a single feature-importance score with three: how much information a feature shares with the target, how much arises only in combination with others (synergy), and how much is redundant with what other features already carry.

arXiv GitHub

Preprints

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs

Charles Westphal, Timothy Douglas, Keivan Navaie, Tiago Pimentel, Fernando E. Rosas

arXiv preprint, 2026

Adversarial fine-tuning can evade linear-probe detectors for LLM steganography while still preserving 58–79% covert recovery. A theory-guided recontextualization intervention restores detection where activation-only methods fail.

arXiv

A Generalized Information Bottleneck Theory of Deep Learning

Charles Westphal, Stephen Hailes, Mirco Musolesi

arXiv preprint, 2025

Recasts the Information Bottleneck through synergy (information that only appears when features are processed jointly), yielding interpretable compression phases in ReLU networks, CNNs, and Transformers where standard IB struggles.

arXiv

Mutual Information Preserving Neural Network Pruning

Charles Westphal, Stephen Hailes, Mirco Musolesi

arXiv preprint, 2024

A structured pruning method that keeps the nodes carrying mutual information between adjacent layers, with a guarantee that the pruned upstream activations can still be mapped to the downstream layer, so the network remains retrainable.

arXiv