
Embo: a Python package for empirical data analysis using the Information Bottleneck

Open Access | May 2021

Figures & Tables

Figure 1

From embo’s documentation (examples/Basic-example.ipynb). Top, red: IB curves for two simple synthetic datasets, one where both X and Y are binary (left column, “Two symbols”) and one where each can take on 4 possible states (right column, “Four symbols”). Each dot represents the solution of Equation (1) for a particular value of β (solid lines connecting the dots are added for legibility). Gray: identity line. Bottom: values of I(M : Y) and I(M : X) as a function of β. See the software documentation for further detail on how these figures were generated. Note that the IB curve never lies above the identity line, and that I(M : Y) and I(M : X) never exceed the base-2 logarithm of the number of states (1 bit for 2 states and 2 bits for 4 states). The IB curve should always satisfy these conditions [1], so they can be taken as sanity checks for embo’s correct operation.
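The two sanity checks named in this caption can be reproduced without embo. The sketch below is a standard-library illustration, not embo's implementation. It assumes a hypothetical 4-state joint distribution `pxy` and draws random encoders p(m|x) instead of solving Equation (1). The helper names (`mutual_information`, `random_encoder`, `ib_plane_point`) are invented for this example. Both bounds hold for any encoder, so they must hold for the optimal ones in particular.

```python
import math
import random


def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi


def random_encoder(n_x, n_m, rng):
    """Random stochastic map p(m|x), stored as encoder[x][m]."""
    encoder = []
    for _ in range(n_x):
        weights = [rng.random() for _ in range(n_m)]
        total = sum(weights)
        encoder.append([w / total for w in weights])
    return encoder


def ib_plane_point(pxy, encoder):
    """Return (I(M:X), I(M:Y)) for the Markov chain M <- X -> Y."""
    n_x, n_y, n_m = len(pxy), len(pxy[0]), len(encoder[0])
    px = [sum(row) for row in pxy]
    # p(m, x) = p(m|x) p(x)
    pmx = [[encoder[x][m] * px[x] for x in range(n_x)] for m in range(n_m)]
    # p(m, y) = sum_x p(m|x) p(x, y), since M depends on Y only through X
    pmy = [[sum(encoder[x][m] * pxy[x][y] for x in range(n_x))
            for y in range(n_y)] for m in range(n_m)]
    return mutual_information(pmx), mutual_information(pmy)


# Hypothetical "four symbols" joint: Y is a noisy copy of a uniform X.
pxy = [[0.19 if x == y else 0.02 for y in range(4)] for x in range(4)]
max_bits = math.log2(len(pxy))  # 2 bits for 4 states

rng = random.Random(0)
points = [ib_plane_point(pxy, random_encoder(4, 4, rng)) for _ in range(20)]

# Check 1: no point lies above the identity line.
below_identity = all(i_my <= i_mx + 1e-9 for i_mx, i_my in points)
# Check 2: neither quantity exceeds log2(number of states).
within_bound = all(max(i_mx, i_my) <= max_bits + 1e-9 for i_mx, i_my in points)
print(below_identity, within_bound)
```

The first check is the data processing inequality, and the second follows from I(M : X) ≤ H(X) ≤ log2 of the number of states of X.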

Figure 2

From embo’s documentation (examples/Deterministic-Bottleneck.ipynb): comparison of IB and DIB, similar to Figure 2 in [12]. In this example, X can take on one of 128 possible states, Y can take on one of 32 states, and p(x) is close to uniform (see the notebook for details about the joint p(x, y)). Left: IB and DIB solutions for a range of β values, visualized in the “IB plane”, where I(M : Y) is plotted against I(M : X). Right: the same solutions as in the left panel, visualized in the “DIB plane”, where I(M : Y) is plotted against H(M). As expected from [12], the two methods behave similarly in the IB plane. In the DIB plane, however, the DIB performs better than the IB, in the sense that H(M) is much lower for the DIB than for the IB at any given value of I(M : Y).
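The difference between the two planes comes from the identity I(M : X) = H(M) − H(M | X). The two horizontal axes coincide when the encoder is deterministic, and H(M) exceeds I(M : X) when it is stochastic. The sketch below illustrates this with two hand-picked encoders on a hypothetical 4-state uniform p(x). It does not reproduce the notebook's 128-state problem or call embo, and `entropy` and `compression_costs` are names made up for this example.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(v * math.log2(v) for v in p if v > 0)


def compression_costs(px, encoder):
    """Return (I(M:X), H(M)) in bits for p(x) and p(m|x) = encoder[x][m]."""
    n_x, n_m = len(px), len(encoder[0])
    pm = [sum(px[x] * encoder[x][m] for x in range(n_x)) for m in range(n_m)]
    h_m = entropy(pm)
    # H(M|X) is the average entropy of the encoder rows.
    h_m_given_x = sum(px[x] * entropy(encoder[x]) for x in range(n_x))
    return h_m - h_m_given_x, h_m


px = [0.25, 0.25, 0.25, 0.25]  # hypothetical uniform p(x)

# Deterministic encoder: each x is mapped to exactly one m, so H(M|X) = 0.
deterministic = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Stochastic encoder: same marginal p(m), but each x is spread over both m.
stochastic = [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.1, 0.9]]

i_det, h_det = compression_costs(px, deterministic)
i_sto, h_sto = compression_costs(px, stochastic)
print("deterministic: I(M:X) =", i_det, " H(M) =", h_det)
print("stochastic:    I(M:X) =", i_sto, " H(M) =", h_sto)
```

Both encoders give the same H(M), but only the deterministic one has I(M : X) equal to it. This is why a solution can occupy similar positions in the IB plane while differing in the DIB plane.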

Figure 3

From embo’s documentation (examples/Compare-embo-dit.ipynb): comparison of embo and dit [14] on sample IB problems of different dimensionality, defined as the number of possible states for the joint random variable (X, Y). The problem with dimensionality 9 (where both X and Y have three possible states) is taken from the documentation of the current version of dit. Left: runtime vs dimensionality. Dit/sp and dit/ba indicate the algorithm used by dit: sp for scipy.optimize and ba for the Blahut-Arimoto algorithm. It was not possible to run dit on the smallest problem due to a software bug. Center: IB bound for the problem with dimensionality 9, computed with embo and dit. Embo and dit/sp (blue and orange) find the same solution, while dit/ba (green) finds a suboptimal one. Right: I(M : X) and I(M : Y) as a function of β. Note how dit/ba (green) becomes unstable at large β. See the notebook for more details.
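For readers unfamiliar with Blahut-Arimoto-style solvers, the sketch below shows the textbook self-consistent IB iteration on a problem of dimensionality 9. It is an illustration under stated assumptions, not the code used by embo or dit. The joint `pxy` is hypothetical (it is not the example from dit's documentation), and the fixed initialization, iteration count, and function names are arbitrary choices made for this example.

```python
import math


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for a, row in enumerate(joint)
               for b, p in enumerate(row) if p > 0)


def ib_iterate(pxy, n_m, beta, n_iter=200):
    """Run the self-consistent IB updates; return (I(M:X), I(M:Y))."""
    n_x, n_y = len(pxy), len(pxy[0])
    px = [sum(row) for row in pxy]
    py = [sum(col) for col in zip(*pxy)]
    py_x = [[p / px[x] for p in pxy[x]] for x in range(n_x)]
    # Fixed, asymmetric starting encoder p(m|x) (an arbitrary choice).
    enc = [[1.0 + ((x + m) % n_m) for m in range(n_m)] for x in range(n_x)]
    enc = [[v / sum(row) for v in row] for row in enc]
    for _ in range(n_iter):
        # p(m) = sum_x p(x) p(m|x)
        pm = [sum(px[x] * enc[x][m] for x in range(n_x)) for m in range(n_m)]
        # p(y|m) = sum_x p(m|x) p(x, y) / p(m); fall back to p(y) if p(m) = 0
        py_m = [[sum(enc[x][m] * pxy[x][y] for x in range(n_x)) / pm[m]
                 if pm[m] > 0 else py[y] for y in range(n_y)]
                for m in range(n_m)]
        # p(m|x) proportional to p(m) * 2^(-beta * D(p(y|x) || p(y|m)))
        new_enc = []
        for x in range(n_x):
            w = [pm[m] * 2.0 ** (-beta * kl_bits(py_x[x], py_m[m]))
                 for m in range(n_m)]
            total = sum(w)
            new_enc.append([v / total for v in w])
        enc = new_enc
    pmx = [[enc[x][m] * px[x] for x in range(n_x)] for m in range(n_m)]
    pmy = [[sum(enc[x][m] * pxy[x][y] for x in range(n_x))
            for y in range(n_y)] for m in range(n_m)]
    return mutual_information(pmx), mutual_information(pmy)


# Hypothetical strictly positive 3 x 3 joint p(x, y): dimensionality 9.
pxy = [[0.20, 0.07, 0.06],
       [0.06, 0.21, 0.07],
       [0.05, 0.08, 0.20]]
i_xy = mutual_information(pxy)

results = []
for beta in (0.5, 2.0, 10.0):
    i_mx, i_my = ib_iterate(pxy, n_m=3, beta=beta)
    results.append((beta, i_mx, i_my))
    print("beta =", beta, " I(M:X) =", i_mx, " I(M:Y) =", i_my)
```

An iteration of this kind converges to a fixed point that depends on the starting encoder. That dependence is one general reason iterative solvers can return suboptimal points on the IB bound, although the sketch does not establish the cause of the dit/ba behavior described in the caption.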

DOI: https://doi.org/10.5334/jors.322 | Journal eISSN: 2049-9647
Language: English
Submitted on: Feb 4, 2020
Accepted on: May 13, 2021
Published on: May 31, 2021
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2021 Eugenio Piasini, Alexandre L. S. Filipowicz, Jonathan Levine, Joshua I. Gold, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.