A Generalized Information Bottleneck Theory of Deep Learning
Published as an arXiv preprint, 2025
Abstract
The authors introduce a Generalized Information Bottleneck (GIB) framework that reconceptualizes the original Information Bottleneck (IB) principle through synergy, the information accessible only via joint feature processing. Their findings suggest that synergistic functions generalize better. They reformulate the IB using interaction information and establish that, under perfect estimation, the original IB objective is upper bounded by their GIB objective. The approach reveals compression phases across diverse architectures, including ReLU networks where the standard IB struggles, produces interpretable dynamics in CNNs and Transformers, and connects to the understanding of adversarial robustness.
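For context, here is a minimal sketch of the quantities the abstract refers to, using the textbook IB Lagrangian and a standard definition of interaction information; the paper's explicit GIB objective is not reproduced here, and the final inequality is only the relation the abstract states, not its derivation.

```latex
% Textbook IB Lagrangian for a representation T of input X with label Y;
% beta trades off compression of X against prediction of Y:
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  \;=\; I(X;T) \;-\; \beta\, I(T;Y)

% Interaction information among X, Y, T (sign conventions vary in the
% literature; under this one, negative values indicate synergy):
I(X;Y;T) \;=\; I(X;Y) \;-\; I(X;Y \mid T)

% Relation stated in the abstract, assuming perfect estimation
% (the explicit form of \mathcal{L}_{\mathrm{GIB}} is defined in the paper):
\mathcal{L}_{\mathrm{IB}} \;\le\; \mathcal{L}_{\mathrm{GIB}}
```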
Recommended citation: Westphal, C., Hailes, S., & Musolesi, M. (2025). A Generalized Information Bottleneck Theory of Deep Learning. arXiv preprint arXiv:2509.26327.
