On the Salient Limitations of the Methods of Assembly Theory and their Classification of Molecular Biosignatures
A recently introduced approach termed “Assembly Theory”, featuring a computable index based on basic principles of statistical compression, has been claimed to be a novel and superior approach to distinguishing living from non-living systems and to classifying the complexity of molecular biosignatures.
Here, we demonstrate that the assembly pathway method underlying this index is a suboptimal, restricted version of Huffman’s encoding (Shannon-Fano type), widely adopted in computer science in the 1950s, that is comparable (or inferior) to other popular statistical and computable compression schemes. We show how simple modular instructions can mislead the assembly index, leading to failure to capture subtleties beyond trivial statistical properties that are not realistic in biological systems.
We present cases whose low complexities can arbitrarily diverge from the random-like appearance to which the assembly pathway method would assign arbitrarily high statistical significance, and show that it fails in simple cases (synthetic or natural). Our theoretical and empirical results imply that the assembly index, whose computable nature we show is not an advantage, does not offer any substantial advantage over existing concepts and methods, computable or uncomputable. Alternatives are discussed.
Abicumaran Uthamacumaran, Felipe S. Abrahão, Narsis A. Kiani, Hector Zenil
Comments: 32 pages with the appendix, 3 figures
Subjects: Information Theory (cs.IT)
Cite as: arXiv:2210.00901 [cs.IT] (or arXiv:2210.00901v2 [cs.IT] for this version)
https://doi.org/10.48550/arXiv.2210.00901
Submission history
From: Hector Zenil
[v1] Fri, 30 Sep 2022 11:19:53 UTC (1,113 KB)
[v2] Sun, 9 Oct 2022 00:33:31 UTC (557 KB)
https://arxiv.org/abs/2210.00901
Astrobiology