A Generalized Information Bottleneck Theory of Deep Learning
Published as an arXiv preprint, 2025
Abstract
The authors introduce a Generalized Information Bottleneck (GIB) framework that reconceptualizes the original Information Bottleneck (IB) principle through synergy, the information accessible only via joint feature processing. Their findings suggest that synergistic functions generalize better. They reformulate the IB using interaction information and establish that, under perfect estimation, the original IB objective is upper bounded by their GIB objective. The approach reveals compression phases across diverse architectures, including ReLU networks where the standard IB struggles, produces interpretable dynamics in CNNs and Transformers, and connects to the understanding of adversarial robustness.
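For context, here is a minimal sketch of the quantities the abstract refers to, using the textbook IB Lagrangian and a standard definition of interaction information; the paper's explicit GIB objective is not reproduced here, and the final inequality is only the relation the abstract states, not its derivation.

```latex
% Textbook IB Lagrangian for a representation T of input X with label Y;
% beta trades off compression of X against prediction of Y:
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  \;=\; I(X;T) \;-\; \beta\, I(T;Y)

% Interaction information among X, Y, T (sign conventions vary in the
% literature; under this one, negative values indicate synergy):
I(X;Y;T) \;=\; I(X;Y) \;-\; I(X;Y \mid T)

% Relation stated in the abstract, assuming perfect estimation
% (the explicit form of \mathcal{L}_{\mathrm{GIB}} is defined in the paper):
\mathcal{L}_{\mathrm{IB}} \;\le\; \mathcal{L}_{\mathrm{GIB}}
```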
Recommended citation: Westphal, C., Hailes, S., & Musolesi, M. (2025). A Generalized Information Bottleneck Theory of Deep Learning. arXiv preprint arXiv:2509.26327.
