Embo: a Python package for empirical data analysis using the Information Bottleneck

Open Access | May 2021

Abstract

We present embo, a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables X and Y, the IB finds the stochastic mapping M of X that encodes the most information about Y, subject to a constraint on the information that M is allowed to retain about X. Despite the popularity of the IB, an accessible implementation of the reference algorithm, oriented towards ease of use on empirical data, has been missing. Embo is optimized for the common case of discrete, low-dimensional data. It is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable across problem domains, as it can be used with any dataset consisting of joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.
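
To make the optimization described in the abstract concrete, the following is a minimal NumPy sketch of the standard self-consistent IB iterations on a small discrete dataset. It is not embo's implementation and does not use embo's API: the function names (`empirical_joint`, `mutual_information`, `ib_sketch`), the toy data, and the parameter values (`n_m`, `beta`, `n_iter`) are all assumptions made for illustration. The trade-off parameter `beta` weights the information M keeps about Y against the information it retains about X.

```python
import numpy as np

def empirical_joint(x, y):
    """Estimate p(x, y) from paired observations of two discrete variables."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    counts = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(counts, (xi, yi), 1)            # accumulate co-occurrence counts
    return counts / counts.sum()

def mutual_information(pab):
    """Mutual information (in bits) of a joint pmf given as a 2-D array."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))

def ib_sketch(pxy, n_m, beta, n_iter=200, seed=0):
    """Illustrative IB iterations; returns p(m|x), I(X;M) and I(M;Y)."""
    rng = np.random.default_rng(seed)
    eps = 1e-12                               # guards against log(0)
    px = pxy.sum(axis=1)                      # p(x), assumed strictly positive
    py_x = pxy / px[:, None]                  # p(y|x)
    pm_x = rng.random((pxy.shape[0], n_m))    # random initial p(m|x)
    pm_x /= pm_x.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        pm = px @ pm_x                                      # p(m)
        py_m = ((pm_x * px[:, None]).T @ py_x) / (pm[:, None] + eps)  # p(y|m)
        # KL[p(y|x) || p(y|m)] for every pair (x, m)
        log_ratio = np.log(py_x[:, None, :] + eps) - np.log(py_m[None, :, :] + eps)
        kl = np.sum(py_x[:, None, :] * log_ratio, axis=2)
        # p(m|x) proportional to p(m) * exp(-beta * KL), computed in log space
        logits = np.log(pm + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        pm_x = np.exp(logits)
        pm_x /= pm_x.sum(axis=1, keepdims=True)
    pxm = pm_x * px[:, None]                  # p(x, m)
    pmy = pm_x.T @ pxy                        # p(m, y), using the chain M - X - Y
    return pm_x, mutual_information(pxm), mutual_information(pmy)

# Toy paired observations (made up for illustration)
x = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
y = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]
pxy = empirical_joint(x, y)
pm_x, i_xm, i_my = ib_sketch(pxy, n_m=2, beta=5.0)
print(f"I(X;M) = {i_xm:.3f} bits, I(M;Y) = {i_my:.3f} bits")
```

Sweeping `beta` over a range of values and collecting the resulting (I(X;M), I(M;Y)) pairs traces out an approximation to the IB trade-off curve. A production tool would also need multiple random restarts and a convergence check, both omitted here for brevity.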

DOI: https://doi.org/10.5334/jors.322 | Journal eISSN: 2049-9647
Language: English
Submitted on: Feb 4, 2020
Accepted on: May 13, 2021
Published on: May 31, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Eugenio Piasini, Alexandre L. S. Filipowicz, Jonathan Levine, Joshua I. Gold, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.