The Theory of Persistence
Identity

GFT — Gap Fundamental Theorem

$\log_2 m = D_\mathrm{KL} + H$ — fundamental principle of persistence.

Statement

For any probability distribution $P = (p_1, \ldots, p_m)$ on the $m$ classes of $\mathbb{Z}/m\mathbb{Z}$, the following identity is tautological:

$$\boxed{\log_2 m = D_{\mathrm{KL}}(P \,\|\, U_m) + H(P),}$$

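where $D_{\mathrm{KL}}(P \,\|\, U_m)$ is the Kullback–Leibler divergence of $P$ from the uniform distribution $U_m = (1/m, \ldots, 1/m)$, $H(P) = -\sum_i p_i \log_2 p_i$ is the Shannon entropy of $P$, and $\log_2 m$ is the total capacity in bits.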

Total capacity is conserved: whatever is not “persistent information” ($D_{\mathrm{KL}}$) is “noise” ($H$), and conversely. This conservation is the fundamental principle of persistence.


Plain reading. Picture a fixed total budget of $\log_2 m$ bits (the “informational capacity”). Every distribution divides this budget into two parts: what is structured (deviates from pure randomness) and what stays disordered. Their sum is exactly the total budget, no more and no less. It is the informational version of the first law of thermodynamics: capacity is neither created nor destroyed, only partitioned between persistence and entropy.
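The budget picture can be checked numerically. The following is a minimal Python sketch, not part of PT itself: the helper names `entropy_bits` and `kl_to_uniform_bits` and the sample distribution are illustrative assumptions, and zero-probability classes are skipped under the usual convention $0 \log_2 0 = 0$.

```python
import math

def entropy_bits(p):
    """Shannon entropy H(P) in bits, skipping zero-probability classes."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_to_uniform_bits(p):
    """D_KL(P || U_m) in bits, where U_m is uniform on m = len(p) classes."""
    m = len(p)
    # p_i / (1/m) is written as p_i * m
    return sum(x * math.log2(x * m) for x in p if x > 0)

# Illustrative distribution on m = 4 classes (an assumption, not from PT).
p = [0.5, 0.25, 0.125, 0.125]

capacity = math.log2(len(p))        # total budget, log2(m)
structured = kl_to_uniform_bits(p)  # "persistent" part
disordered = entropy_bits(p)        # "noise" part

print(capacity, structured, disordered)  # prints 2.0 0.25 1.75
```

For this choice the 2-bit budget splits into 0.25 bits of structure and 1.75 bits of entropy; any other distribution on four classes would split the same 2 bits differently.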

Why it matters

GFT is the master identity of PT, the fundamental principle of persistence. It provides the framework in which all of PT's conservation laws are formulated: at every sieve step, the information that “persists” (growth in $D_{\mathrm{KL}}$) is exactly offset by the entropy that is “released” (an equal decrease in $H$). No loss, no net gain.

Direct consequences: the Bekenstein bound, the arrow of time, and the Ruelle equivalence.

GFT also justifies the “persistence” semantics of the theory’s name: what “persists” is precisely $D_{\mathrm{KL}}$.

Proof — outline

  1. Write $D_{\mathrm{KL}}(P \,\|\, U_m) = \sum_i p_i \log_2(p_i / (1/m))$.
  2. Distribute the logarithm: $\log_2(p_i / (1/m)) = \log_2 p_i + \log_2 m$.
  3. Sum: $\sum_i p_i \log_2 p_i + \sum_i p_i \log_2 m$.
  4. Recognise: the first term is $-H(P)$, the second is $\log_2 m$.
  5. Rearrange: $\log_2 m - H(P) = D_{\mathrm{KL}}$, so $\log_2 m = D_{\mathrm{KL}} + H$.

Detailed proof

Step 1 — Definition of $D_{\mathrm{KL}}$

The Kullback–Leibler divergence between $P = (p_1, \ldots, p_m)$ and the uniform distribution $U_m = (1/m, \ldots, 1/m)$ is:

$$D_{\mathrm{KL}}(P \,\|\, U_m) = \sum_{i=1}^m p_i \log_2 \frac{p_i}{1/m}.$$

Step 2 — Distributing the logarithm

By log properties:

$$\log_2 \frac{p_i}{1/m} = \log_2 p_i + \log_2 m.$$

Substituting:

$$D_{\mathrm{KL}} = \sum_i p_i (\log_2 p_i + \log_2 m) = \sum_i p_i \log_2 p_i + \log_2 m \sum_i p_i.$$

Step 3 — Normalisation and entropy

Since $P$ is a distribution, $\sum_i p_i = 1$, and by definition of Shannon entropy, $H(P) = -\sum_i p_i \log_2 p_i$.

So:

$$D_{\mathrm{KL}} = -H(P) + \log_2 m,$$

which rearranges to:

$$\log_2 m = D_{\mathrm{KL}} + H(P).$$

QED

The identity is purely algebraic. It depends neither on the nature of $P$ (arbitrary), nor on the physical origin of the $m$ states, nor on any dynamical hypothesis. In this sense it is an identity rather than an empirical claim: it holds by algebra alone, so no observation can falsify it.
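As a hand-checkable instance of the three steps (an illustrative choice, not taken from the monograph), take $m = 4$ and $P = (1/2, 1/2, 0, 0)$, with the usual convention $0 \log_2 0 = 0$:

$$
\begin{aligned}
H(P) &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1, \\
D_{\mathrm{KL}}(P \,\|\, U_4) &= \tfrac{1}{2}\log_2\frac{1/2}{1/4} + \tfrac{1}{2}\log_2\frac{1/2}{1/4} = 1, \\
D_{\mathrm{KL}}(P \,\|\, U_4) + H(P) &= 2 = \log_2 4.
\end{aligned}
$$

Here half of the 2-bit budget is structure (the distribution has ruled out two of the four classes) and half is entropy (it is still undecided between the remaining two).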

Consequence — Bekenstein bound

Since $H(P) \geq 0$ for every distribution (entropy is non-negative), it follows immediately that:

$$D_{\mathrm{KL}}(P \,\|\, U_m) \leq \log_2 m,$$

with equality exactly when $H(P) = 0$, that is, when $P$ is concentrated on a single class.

PT calls this the universal Bekenstein bound: no distribution on $m$ states can have more than $\log_2 m$ bits of persistent structure. PT identifies this bound with the holographic information cap of a finite region.
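The two extremes of the bound can be probed with a short Python sketch. The three test distributions and the helper name `kl_to_uniform_bits` are assumptions chosen for illustration; nothing here models a physical region.

```python
import math

def kl_to_uniform_bits(p):
    """D_KL(P || U_m) in bits, skipping zero-probability classes."""
    m = len(p)
    return sum(x * math.log2(x * m) for x in p if x > 0)

m = 8
cap = math.log2(m)  # 3 bits for m = 8

# Hypothetical test cases: no structure, maximal structure, in between.
cases = {
    "uniform": [1 / m] * m,
    "point mass": [1.0] + [0.0] * (m - 1),
    "skewed": [0.65] + [0.05] * (m - 1),
}

results = {name: kl_to_uniform_bits(p) for name, p in cases.items()}
for name, d in results.items():
    print(f"{name:10s}  D_KL = {d:.4f} bits  (cap = {cap:.4f})")
```

The uniform distribution sits at 0 bits, the point mass saturates the cap at $\log_2 8 = 3$ bits, and the skewed case falls strictly between the two.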

Consequence — arrow of time

If a dynamical evolution preserves $\log_2 m$ (sieve case: $m$ fixed), then:

$$\frac{dH}{dD_{\mathrm{KL}}} = -1.$$

Any increase in $D_{\mathrm{KL}}$ is paid for by an equal decrease in $H$, and vice versa. This is the PT arrow of time: natural evolution increases $H$ (second law), and so decreases $D_{\mathrm{KL}}$ (“decrystallisation”). The sieve locally inverts this arrow: $D_{\mathrm{KL}}$ grows at every step, which defines “persistence”.
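The one-for-one exchange can be illustrated on a toy path of distributions with $m$ held fixed. This is a sketch under stated assumptions: the path `mix(t)`, which slides from the uniform distribution to a point mass, is hypothetical and is not the PT sieve; it only shows that each step's change in $H$ mirrors the change in $D_{\mathrm{KL}}$.

```python
import math

def entropy_bits(p):
    """Shannon entropy H(P) in bits, skipping zero-probability classes."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_to_uniform_bits(p):
    """D_KL(P || U_m) in bits, skipping zero-probability classes."""
    m = len(p)
    return sum(x * math.log2(x * m) for x in p if x > 0)

m = 6  # fixed number of classes, so log2(m) is constant along the path

def mix(t):
    """Hypothetical path: t = 0 is uniform, t = 1 is a point mass on class 0."""
    return [(1 - t) / m + (t if i == 0 else 0.0) for i in range(m)]

path = [mix(k / 10) for k in range(11)]

deltas = []
for prev, cur in zip(path, path[1:]):
    d_kl = kl_to_uniform_bits(cur) - kl_to_uniform_bits(prev)
    d_h = entropy_bits(cur) - entropy_bits(prev)
    deltas.append((d_kl, d_h))
    print(f"dD_KL = {d_kl:+.4f}   dH = {d_h:+.4f}   sum = {d_kl + d_h:+.1e}")
```

Along this path $D_{\mathrm{KL}}$ rises at every step and $H$ falls by the same amount, up to floating-point rounding. Running the path in reverse gives the opposite direction, with $H$ rising and $D_{\mathrm{KL}}$ falling.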

Consequence — Ruelle equivalence

For the transfer matrix $T_m$, the Ruelle partition function is:

$$Z_{\mathrm{Ruelle}} = \mathrm{Tr}(T_m^N) = \sum_\lambda \lambda^N,$$

where the sum is over the eigenvalues of $T_m$. As $N \to \infty$, $Z_{\mathrm{Ruelle}}$ grows like the exponential of $N$ times the topological entropy, with the growth rate set by the dominant eigenvalue. GFT gives the explicit identity with $\log_2 m$ as cap, and free energy $F_R = 0$ (in the pure Ruelle sense).
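The trace formula and its growth rate can be sketched in a few lines of Python. The matrix below is a hypothetical stand-in (the 2×2 golden-mean matrix, whose eigenvalues are known in closed form) and is not PT's $T_m$; the helpers `mat_mul`, `mat_pow` and `trace` are illustrative names. Logarithms are taken in base 2 to match the rest of this page.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, power):
    """a raised to a non-negative integer power by repeated multiplication."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(power):
        result = mat_mul(result, a)
    return result

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical transfer matrix (assumption): NOT the T_m of PT.
T = [[1, 1], [1, 0]]
phi = (1 + math.sqrt(5)) / 2
eigenvalues = [phi, 1 - phi]  # closed-form eigenvalues of T

for N in (1, 2, 5, 10, 40):
    z = trace(mat_pow(T, N))                    # Tr(T^N), exact integer
    spectral = sum(lam ** N for lam in eigenvalues)  # sum of lambda^N
    rate = math.log2(z) / N                     # growth rate per step, in bits
    print(f"N={N:2d}  Tr={z}  sum_lambda={spectral:.6f}  rate={rate:.6f}")

print("log2(dominant eigenvalue) =", math.log2(phi))
```

For this toy matrix the trace and the eigenvalue sum agree at every $N$, and the per-step growth rate approaches $\log_2$ of the dominant eigenvalue as $N$ increases.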

For the complete derivation and consequences (Bekenstein, arrow of time, Ruelle), see chapter 4 of the monograph.

See also