Charles Westphal

PhD Student in ML · UCL · London

I am a Ph.D. candidate in Computer Science at University College London, where my work is grounded in multivariate information theory and its applications to modern machine learning.

Research

My research is motivated by multivariate information theory, a framework for reasoning about how information is shared, decomposed, and transformed across complex systems. The methods I develop are theoretically grounded but broadly applicable, including to:

Feature selection, engineering, and interpretation
Reinforcement learning
Neural network pruning and compression
Variational inference and representation learning

Recent work has been published at ICML, SIGKDD, AISTATS, and TMLR.

Experience

Ph.D. Candidate · University College London · 2021–Present
Applications of partial information decomposition to AI. Supervised by Mirco Musolesi and Stephen Hailes.
Research Assistant · UCL Computer Science · May–Jun 2026
Applying partial information decomposition to understanding agents. Supervised by Mirco Musolesi and Stephen Hailes.
MATS Scholar · Machine Learning Alignment & Theory Scholars · Jun 2025–Mar 2026
Improving steganography, its detection, and understanding its theoretical limits. Supervised by Fernando Rosas and Keivan Navaie.

Publications

Conference & Journal

Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models

Charles Westphal, Keivan Navaie, Fernando E. Rosas

ICML 2026

A geometry-based steganography scheme that hides secrets in fine-tuned LLM outputs via embedding-space hyperplanes, together with a linear-probe detector that exposes it more reliably than traditional steganalysis.

arXiv

\[\mathcal{X}_* \in \Bigl\{ \mathcal{P} \in \mathscr{P}(\mathcal{X}) \;:\; |\mathcal{P}| = \min_{H(A \mid \mathcal{P}) = H(A \mid \mathcal{X})} |\mathcal{P}| \;\;\&\;\; H(A \mid \mathcal{P}) = H(A \mid \mathcal{X}) \Bigr\}\]

Information-theoretic State Variable Selection for Reinforcement Learning

Charles Westphal, Stephen Hailes, Mirco Musolesi

TMLR 2026

The Transfer Entropy Redundancy Criterion (TERC): an information-theoretic test that provably drops state variables with no effect on agent performance, improving sample efficiency across Q-learning, Actor-Critic, and PPO.

arXiv
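
As a rough illustration of the selection objective in the formula shown above this entry (the smallest subset P of the state variables X with H(A | P) = H(A | X)), the sketch below brute-forces that search on a toy discrete dataset. It is not TERC itself, which the summary describes as a transfer-entropy-based test. The function names, the toy data, and the tolerance are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def conditional_entropy(rows, target_idx, cond_idx):
    """Empirical H(target | cond) in bits, from a list of discrete tuples."""
    n = len(rows)
    joint = Counter((tuple(r[i] for i in cond_idx), r[target_idx]) for r in rows)
    marginal = Counter(tuple(r[i] for i in cond_idx) for r in rows)
    return -sum(c / n * log2(c / marginal[ctx]) for (ctx, _), c in joint.items())


def smallest_sufficient_subset(rows, target_idx, var_idx, tol=1e-9):
    """Exhaustive search (hypothetical helper): smallest subset of var_idx
    whose conditional entropy matches that of the full variable set."""
    full = conditional_entropy(rows, target_idx, var_idx)
    for k in range(len(var_idx) + 1):          # try smaller subsets first
        for subset in combinations(var_idx, k):
            if abs(conditional_entropy(rows, target_idx, subset) - full) <= tol:
                return subset
    return tuple(var_idx)


# Toy data (assumed): columns are (x0, x1, x2, a) with a = x0 XOR x1,
# so x2 carries no information about a.
rows = [(x0, x1, x2, x0 ^ x1) for x0 in (0, 1) for x1 in (0, 1) for x2 in (0, 1)]

chosen = smallest_sufficient_subset(rows, target_idx=3, var_idx=(0, 1, 2))
print(chosen)  # (0, 1)
```

The exhaustive search is exponential in the number of variables, so it only serves to make the objective concrete on small examples.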

Feature Selection for Network Intrusion Detection

Charles Westphal, Stephen Hailes, Mirco Musolesi

SIGKDD 2025

An information-theoretic feature selection method (FSNID) that drops uninformative inputs for network intrusion detection while preserving classifier performance.

arXiv GitHub

Partial Information Decomposition for Data Interpretability and Feature Selection

Charles Westphal, Stephen Hailes, Mirco Musolesi

AISTATS 2025

PIDF replaces a single feature-importance score with three: how much information a feature shares with the target, how much arises only in combination with others (synergy), and how much is redundant with what other features already carry.

arXiv GitHub
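
To make the notion of synergy concrete, the following minimal sketch computes mutual information on the textbook XOR example, where neither feature alone is informative about the target but the pair determines it completely. It illustrates only the concept, not the PIDF estimator; the function name and the toy data are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Empirical I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


# Toy data (assumed): target y = x1 XOR x2, all four inputs equally likely.
samples = [(x1, x2, x1 ^ x2) for x1 in (0, 1) for x2 in (0, 1)]

mi_x1 = mutual_information([(x1, y) for x1, _, y in samples])
mi_x2 = mutual_information([(x2, y) for _, x2, y in samples])
mi_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])

print(mi_x1, mi_x2, mi_joint)  # 0.0 0.0 1.0
```

A single importance score would rate both features as useless here, even though together they carry one full bit about the target; that gap is the synergistic contribution.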

Preprints

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in LLMs

Charles Westphal, Timothy Douglas, Keivan Navaie, Tiago Pimentel, Fernando E. Rosas

arXiv preprint, 2026

Linear-probe detectors for LLM steganography can be evaded by adversarial fine-tuning (58–79% covert recovery preserved). A theory-guided recontextualization intervention restores detection where activation-only methods fail.

arXiv

A Generalized Information Bottleneck Theory of Deep Learning

Charles Westphal, Stephen Hailes, Mirco Musolesi

arXiv preprint, 2025

Recasts the Information Bottleneck through synergy — information that only appears when features are processed jointly — yielding interpretable compression phases in ReLU networks, CNNs, and Transformers where standard IB struggles.

arXiv

Mutual Information Preserving Neural Network Pruning

Charles Westphal, Stephen Hailes, Mirco Musolesi

arXiv preprint, 2024

A structured pruning method that keeps the nodes carrying mutual information between adjacent layers, with a guarantee that the pruned upstream activations can still be mapped to the downstream layer — so the network remains retrainable.

arXiv

Talks

Feature Selection for Network Intrusion Detection: An Information-Theoretic Approach · KDD 2025 · Toronto, Canada · May 2025
Partial Information Decomposition for Data Interpretability and Feature Selection · AISTATS 2025 · Splash Beach Resort, Mai Khao, Phuket, Thailand · Apr 2025

CV

Download my CV (PDF)

Contact

charles.westphal.21@ucl.ac.uk