Partial Information Decomposition and the Information Delta: A Geometric Unification Disentangling Non-Pairwise Information
Figure 1. (A) Visualization of the information decomposition (adapted from [3]) and its governing equations; the system is underdetermined. (B) Sample binary datasets which contain only one type of information. For (i), where Z = X, X contains all information about Z and Y is irrelevant, so U_X equals the total information and all other terms are zero. For (ii), where Z = X = Y, X and Y are always identical and thus the information is fully redundant. For (iii), where Z is the XOR function of X and Y, each of X and Y alone is independent of Z, but together they fully determine its value.
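These three cases can be checked directly from mutual information: purely unique information gives I(X;Z) = I(X,Y;Z) with I(Y;Z) = 0, full redundancy gives I(X;Z) = I(Y;Z) = I(X,Y;Z), and XOR gives zero pairwise information but one full bit jointly. A minimal sketch in plain Python (the `mutual_info` helper is our own illustration, assuming equiprobable samples, not code from the paper):

```python
from collections import Counter
from itertools import product
from math import log2

def mutual_info(pairs):
    """I(A;B) in bits, estimated from a list of equiprobable (a, b) samples."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum((c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
               for (a, b), c in p_ab.items())

# (i)   Z = X, with X and Y independent uniform bits
# (ii)  Z = X = Y, so Y is an exact copy of X
# (iii) Z = XOR(X, Y), with X and Y independent uniform bits
datasets = {
    "(i)   Z = X    ": [(x, y, x) for x, y in product((0, 1), repeat=2)],
    "(ii)  Z = X = Y": [(x, x, x) for x in (0, 1)],
    "(iii) Z = XOR  ": [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)],
}

for name, samples in datasets.items():
    i_xz = mutual_info([(x, z) for x, y, z in samples])
    i_yz = mutual_info([(y, z) for x, y, z in samples])
    i_xyz = mutual_info([((x, y), z) for x, y, z in samples])
    print(f"{name}:  I(X;Z)={i_xz:.1f}  I(Y;Z)={i_yz:.1f}  I(X,Y;Z)={i_xyz:.1f}")
```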
Figure 2. A geometric interpretation of the information deltas, as developed in [18]. (A) Consider functions where each variable has an alphabet of three possible values; there are 19,683 possible functions f(X, Y). If the variables X and Y are independent, these functions map onto 105 unique points (function families) within a plane in δ-space. (B) Sample functions and their mappings onto δ-space. Functions with a full pairwise dependence on X or Y map to opposite lower corners, whereas the fully synergistic XOR (i.e., the ternary extension XOR(X, Y) ≡ (X + Y) mod 3) maps to the uppermost corner.
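The XOR corner is easy to check numerically: under independent uniform ternary inputs, Z = (X + Y) mod 3 carries zero pairwise information about each input but log2(3) ≈ 1.58 bits jointly (all 3^9 = 19,683 functions can be enumerated the same way). A small sketch, using the same kind of equiprobable-sample estimator as above:

```python
from collections import Counter
from itertools import product
from math import log2

def mi(pairs):
    """I(A;B) in bits from equiprobable (a, b) samples (same helper as above)."""
    n = len(pairs)
    p_ab = Counter(pairs)
    p_a = Counter(a for a, _ in pairs)
    p_b = Counter(b for _, b in pairs)
    return sum((c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
               for (a, b), c in p_ab.items())

# ternary XOR: fully synergistic under independent, uniform inputs
samples = [(x, y, (x + y) % 3) for x, y in product(range(3), repeat=2)]
print("I(X;Z)   =", round(mi([(x, z) for x, y, z in samples]), 3))       # 0.0
print("I(Y;Z)   =", round(mi([(y, z) for x, y, z in samples]), 3))       # 0.0
print("I(X,Y;Z) =", round(mi([((x, y), z) for x, y, z in samples]), 3))  # log2(3) ~ 1.585
```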
Figure 3. As shown in Equations (23) and (24), δ-space encodes the balance of synergy and redundancy along one diagonal, and the balance of unique information in each source along the other.
Figure 4. An example mapping of the Bertschinger set Q to δ-space for a randomly chosen function f. A set Q consists of all probability distributions p(X = x, Y = y, Z = z) that share the same marginal distributions p(X = x, Z = z) and p(Y = y, Z = z). Each Q maps onto a set of points with a complex distribution, but one which is constrained to a simple plane in δ-space.
Figure 5. The same function's Q mapped onto δ-space as in Figure 4, viewed from a different angle. Q is constrained to a plane in δ-space. This plane, highlighted in red, contains the δ-coordinates of the function f (indicated by the red dot) as well as the line (δ_X = δ_Y, δ_Z = 1) (indicated by the solid red line).
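One convenient way to explore a set Q numerically follows from its definition: within any fixed z-slice, a 2 x 2 "checkerboard" update (+eps, -eps, -eps, +eps) on the (x, y) table changes the joint distribution while leaving both p(x, z) and p(y, z) untouched. The sketch below (our own illustration in NumPy, not the optimization procedure of [9]) walks randomly through Q this way:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_q_step(p, eps=0.01):
    """One marginal-preserving move within Q: pick a z-slice and a 2 x 2
    checkerboard of (x, y) cells, then shift probability mass so that every
    affected row and column of the slice has zero net change. This leaves
    both pairwise marginals p(x, z) and p(y, z) exactly invariant."""
    p = p.copy()
    nx, ny, nz = p.shape
    z = rng.integers(nz)
    x1, x2 = rng.choice(nx, size=2, replace=False)
    y1, y2 = rng.choice(ny, size=2, replace=False)
    step = min(eps, p[x1, y2, z], p[x2, y1, z])  # keep all entries nonnegative
    p[x1, y1, z] += step
    p[x2, y2, z] += step
    p[x1, y2, z] -= step
    p[x2, y1, z] -= step
    return p

# Start from the joint distribution of a random ternary function Z = f(X, Y)
# with independent, uniform inputs.
f = rng.integers(3, size=(3, 3))
p = np.zeros((3, 3, 3))
for x in range(3):
    for y in range(3):
        p[x, y, f[x, y]] = 1 / 9

walk = [p]
for _ in range(1000):
    walk.append(random_q_step(walk[-1]))

# Sanity check: both pairwise marginals are unchanged along the whole walk.
assert np.allclose(walk[0].sum(axis=1), walk[-1].sum(axis=1))  # p(x, z)
assert np.allclose(walk[0].sum(axis=0), walk[-1].sum(axis=0))  # p(y, z)
```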
Figure 6. All functions Z = f(X, Y) (with alphabet sizes of 3) mapped onto a plane in δ-space, as in Figure 2. Each function is colored by the fraction of the total information in each PID component, as computed using the solution of [9]. There is a clear geometric structure to the decomposition which matches the previously discussed intuition about δ-space.
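For readers who want to reproduce individual decompositions rather than the full function plane, the third-party `dit` package implements the Bertschinger et al. measure as `PID_BROJA`. A minimal sketch (we assume dit's convention that, by default, all variables but the last are sources and the last is the target; check the docs for your version):

```python
# pip install dit
from dit import Distribution
from dit.pid import PID_BROJA

# XOR: Z = X ^ Y under independent uniform bits -> purely synergistic
xor = Distribution(['000', '011', '101', '110'], [1 / 4] * 4)
print(PID_BROJA(xor))     # expect S = 1 bit; U_X = U_Y = R = 0

# Copy: Z = X -> purely unique information in X
copy_x = Distribution(['000', '010', '101', '111'], [1 / 4] * 4)
print(PID_BROJA(copy_x))  # expect U_X = 1 bit; all other components 0
```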
Figure 7. The same set of all 3-letter functions Z = f(X, Y) mapped onto a plane in δ-space, as in Figure 6. The color scale shows the amount of information in each component, now computed using the pointwise solution of Finn and Lizier [19]. In this formulation, each PID component is the average difference between two subcomponents, the specificity and the ambiguity, and can be negative when the latter exceeds the former. Visualizing this solution immediately highlights the differences in how it decomposes the information of functions and leads to an alternate interpretation of δ-space.
Abstract
1. Introduction
2. Background
2.1. Interaction Information and Multi-Information
2.2. Information Decomposition
2.3. Solution from Bertschinger et al.
2.4. Information Deltas and Their Geometry
3. PID Mapped into Information Deltas
3.1. Information Decomposition in Terms of Deltas
3.2. Relationship between Diagonal and Interaction Information
3.3. The Function Plane
4. Solving the PID on the Function Plane
4.1. Transforming Probability Tensors within Q
4.2. δ-Coordinates in Q Are Always Restricted to a Plane
4.3. PID Calculation for All Functions
4.4. Alternate Solutions: Pointwise PID
5. Conclusions
- Construct a library of distribution sets Q for all functions f. Specifically, record the δ-coordinates spanned by each distribution (e.g., as plotted in Figure 4), along with the corresponding function and its PID component values.
- For a set of variables in the data for which we wish to find the decomposition, compute its δ-coordinates and then match them to the closest library entry Q (see the sketch below). This immediately yields the corresponding function and its PID components.
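A minimal sketch of the lookup step (our own illustration; the file names, array layout, and helper below are hypothetical, and the released code [20] is the authoritative implementation):

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical precomputed library: one row of (delta_X, delta_Y, delta_Z)
# per sampled distribution, with parallel arrays for the generating function
# and its PID components (U_X, U_Y, R, S).
library_deltas = np.load("library_deltas.npy")  # shape (N, 3)
library_funcs = np.load("library_funcs.npy")    # shape (N,)
library_pids = np.load("library_pids.npy")      # shape (N, 4)

tree = cKDTree(library_deltas)

def decompose(observed_deltas):
    """Match observed delta-coordinates against the library, returning the
    nearest entry's function, its PID components, and the match distance."""
    dist, idx = tree.query(observed_deltas)
    return library_funcs[idx], library_pids[idx], dist

func, pid, residual = decompose(np.array([0.3, 0.3, 0.9]))
print(func, pid, residual)
```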
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
| PID | Partial Information Decomposition |
| II | Interaction Information |
| CI | Co-Information |
| U_X | Unique Information in X |
| U_Y | Unique Information in Y |
| R | Redundant Information |
| S | Synergistic Information |
| PPID | Pointwise Partial Information Decomposition |
References
1. Williams, P.L.; Beer, R.D. Nonnegative decomposition of multivariate information. arXiv 2010, arXiv:1004.2515.
2. James, R.G.; Crutchfield, J.P. Multivariate dependence beyond Shannon information. Entropy 2017, 19, 531.
3. Lizier, J.T.; Bertschinger, N.; Jost, J.; Wibral, M. Information Decomposition of Target Effects from Multi-Source Interactions: Perspectives on Previous, Current and Future Work. Entropy 2018, 20, 307.
4. Harder, M.; Salge, C.; Polani, D. Bivariate measure of redundant information. Phys. Rev. E 2013, 87, 012130.
5. Griffith, V.; Chong, E.K.; James, R.G.; Ellison, C.J.; Crutchfield, J.P. Intersection information based on common randomness. Entropy 2014, 16, 1985–2000.
6. Barrett, A.B. Exploration of synergistic and redundant information sharing in static and dynamical Gaussian systems. Phys. Rev. E 2015, 91, 052802.
7. Ince, R.A. Measuring multivariate redundant information with pointwise common change in surprisal. Entropy 2017, 19, 318.
8. Rauh, J.; Banerjee, P.K.; Olbrich, E.; Jost, J.; Bertschinger, N. On extractable shared information. Entropy 2017, 19, 328.
9. Bertschinger, N.; Rauh, J.; Olbrich, E.; Jost, J.; Ay, N. Quantifying unique information. Entropy 2014, 16, 2161–2183.
10. Timme, N.; Alford, W.; Flecker, B.; Beggs, J.M. Synergy, redundancy, and multivariate information measures: An experimentalist's perspective. J. Comput. Neurosci. 2014, 36, 119–140.
11. Stramaglia, S.; Cortes, J.M.; Marinazzo, D. Synergy and redundancy in the Granger causal analysis of dynamical networks. New J. Phys. 2014, 16, 105003.
12. Timme, N.M.; Ito, S.; Myroshnychenko, M.; Nigam, S.; Shimono, M.; Yeh, F.C.; Hottowy, P.; Litke, A.M.; Beggs, J.M. High-degree neurons feed cortical computations. PLoS Comput. Biol. 2016, 12, e1004858.
13. Wibral, M.; Priesemann, V.; Kay, J.W.; Lizier, J.T.; Phillips, W.A. Partial information decomposition as a unified approach to the specification of neural goal functions. Brain Cogn. 2017, 112, 25–38.
14. Wibral, M.; Finn, C.; Wollstadt, P.; Lizier, J.T.; Priesemann, V. Quantifying information modification in developing neural networks via partial information decomposition. Entropy 2017, 19, 494.
15. Kay, J.W.; Ince, R.A.; Dering, B.; Phillips, W.A. Partial and entropic information decompositions of a neuronal modulatory interaction. Entropy 2017, 19, 560.
16. Galas, D.J.; Sakhanenko, N.A.; Skupin, A.; Ignac, T. Describing the complexity of systems: Multivariable "set complexity" and the information basis of systems biology. J. Comput. Biol. 2014, 21, 118–140.
17. Sakhanenko, N.A.; Galas, D.J. Biological data analysis as an information theory problem: Multivariable dependence measures and the Shadows algorithm. J. Comput. Biol. 2015, 22, 1005–1024.
18. Sakhanenko, N.; Kunert-Graf, J.; Galas, D. The Information Content of Discrete Functions and Their Application in Genetic Data Analysis. J. Comput. Biol. 2017, 24, 1153–1178.
19. Finn, C.; Lizier, J.T. Pointwise partial information decomposition using the specificity and ambiguity lattices. Entropy 2018, 20, 297.
20. Kunert-Graf, J. kunert/deltaPID: Initial Release (Version v1.0.0). Zenodo 2020.
21. McGill, W. Multivariate information transmission. Trans. IRE Prof. Group Inf. Theory 1954, 4, 93–111.
22. Jakulin, A.; Bratko, I. Quantifying and Visualizing Attribute Interactions: An Approach Based on Entropy. arXiv 2003, arXiv:cs/0308002.
23. Bell, A.J. The co-information lattice. In Proceedings of the Fifth International Workshop on Independent Component Analysis and Blind Signal Separation (ICA 2003), Granada, Spain, 22–24 September 2003.
24. Watanabe, S. Information theoretical analysis of multivariate correlation. IBM J. Res. Dev. 1960, 4, 66–82.