Article: Mutual information and the encoding of contingency tables

Title: Mutual information and the encoding of contingency tables
Authors: Jerdee, Maximilian; Kirkley, Alec; Newman, M. E. J.
Issue Date: 5-Dec-2024
Publisher: American Physical Society
Citation: Physical Review E, 2024, v. 110, n. 6
Abstract: Mutual information is commonly used as a measure of similarity between competing labelings of a given set of objects, for example to quantify performance in classification and community detection tasks. As argued recently, however, the mutual information as conventionally defined can return biased results because it neglects the information cost of the so-called contingency table, a crucial component of the similarity calculation. In principle the bias can be rectified by subtracting the appropriate information cost, leading to the modified measure known as the reduced mutual information, but in practice one can only ever compute an upper bound on this information cost, and the value of the reduced mutual information depends crucially on how good a bound is established. In this paper we describe an improved method for encoding contingency tables that gives a substantially better bound in typical use cases and approaches the ideal value in the common case where the labelings are closely similar, as we demonstrate with extensive numerical results.
Persistent Identifier: http://hdl.handle.net/10722/366329
ISSN: 2470-0045
2023 Impact Factor: 2.2
2023 SCImago Journal Rankings: 0.805
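To make the quantities in the abstract concrete, the following is a minimal Python sketch of the conventional mutual information between two labelings, computed from their contingency table, together with a reduced mutual information formed by subtracting an upper bound on the table's information cost. The function names and the flat row-composition bound used here are ours, chosen for illustration; the sketch does not reproduce the improved encoding that the paper itself develops, which gives a substantially tighter bound.

from collections import Counter
from math import lgamma, log

def log_binom(n, k):
    # Natural log of the binomial coefficient C(n, k).
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def mutual_information(labels1, labels2):
    # Conventional mutual information between two labelings of the same
    # n objects, in nats, summed over all objects (n times the
    # per-object value).  n_rs is the contingency table counting objects
    # with label r in the first labeling and s in the second; a_r and
    # b_s are its row and column margins.
    assert len(labels1) == len(labels2)
    n = len(labels1)
    table = Counter(zip(labels1, labels2))
    a = Counter(labels1)
    b = Counter(labels2)
    return sum(n_rs * log(n * n_rs / (a[r] * b[s]))
               for (r, s), n_rs in table.items())

def reduced_mutual_information(labels1, labels2):
    # Mutual information minus a crude upper bound on the information
    # cost of transmitting the contingency table: row r is encoded as
    # one of C(a_r + S - 1, S - 1) compositions of a_r into S
    # non-negative parts, where S is the number of groups in the second
    # labeling.  This flat bound is illustrative only; the paper's
    # improved encoding is much tighter, especially when the two
    # labelings are closely similar.
    a = Counter(labels1)
    S = len(set(labels2))
    table_cost = sum(log_binom(a_r + S - 1, S - 1) for a_r in a.values())
    return mutual_information(labels1, labels2) - table_cost

# Example: two labelings of ten objects that agree on most of them.
g1 = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
g2 = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
print(mutual_information(g1, g2))          # total MI in nats
print(reduced_mutual_information(g1, g2))  # MI minus the table-cost bound

Note that both functions return totals over the n objects rather than per-object values; dividing both by n recovers the usual per-object normalization, and the comparison between the two measures is unaffected by this choice.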

 

DC Field / Value
dc.contributor.author: Jerdee, Maximilian
dc.contributor.author: Kirkley, Alec
dc.contributor.author: Newman, M. E. J.
dc.date.accessioned: 2025-11-25T04:18:47Z
dc.date.available: 2025-11-25T04:18:47Z
dc.date.issued: 2024-12-05
dc.identifier.citation: Physical Review E, 2024, v. 110, n. 6
dc.identifier.issn: 2470-0045
dc.identifier.uri: http://hdl.handle.net/10722/366329
dc.description.abstract: Mutual information is commonly used as a measure of similarity between competing labelings of a given set of objects, for example to quantify performance in classification and community detection tasks. As argued recently, however, the mutual information as conventionally defined can return biased results because it neglects the information cost of the so-called contingency table, a crucial component of the similarity calculation. In principle the bias can be rectified by subtracting the appropriate information cost, leading to the modified measure known as the reduced mutual information, but in practice one can only ever compute an upper bound on this information cost, and the value of the reduced mutual information depends crucially on how good a bound is established. In this paper we describe an improved method for encoding contingency tables that gives a substantially better bound in typical use cases and approaches the ideal value in the common case where the labelings are closely similar, as we demonstrate with extensive numerical results.
dc.language: eng
dc.publisher: American Physical Society
dc.relation.ispartof: Physical Review E
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.title: Mutual information and the encoding of contingency tables
dc.type: Article
dc.identifier.doi: 10.1103/PhysRevE.110.064306
dc.identifier.scopus: eid_2-s2.0-85211069808
dc.identifier.volume: 110
dc.identifier.issue: 6
dc.identifier.eissn: 2470-0053
dc.identifier.issnl: 2470-0045
