What are we encoding, when we speak of a finite set?

Given that there is a bijective map from a finite set with \(N\) elements to \([1,N]\), without loss of generality we may focus on encoding \([1,N]\). Now, encoding the structure of a set requires identifying the properties that its members have in common. So one correct definition of \([1,N]\) is given by:

\begin{equation} [1,N] = \{n: 1 \leq n \leq N\} \end{equation}

and crucially, the definition of a finite set is permutation invariant so we have:

\begin{equation} [1,N] = \bigcap_{\sigma \in S_N} \{\sigma \circ \{n\}_{n=1}^N\} \end{equation}

where \(\lvert S_N \rvert = N!\)

It follows that the minimum description length of \([1,N]\) is the program \(f\) which given the empty string \(\epsilon\) returns a sequence sampled uniformly from \(\{\sigma \circ \{n\}_{n=1}^N\}_{\sigma \in S_N}\). This implies that \(f\) comes equipped with a pseudo-RNG and that \(f\) has a representation of all \(N!\) permutations in \(S_N\).

A proof via the product formula:

If we define the computable function \(f\),

\begin{equation} \forall A \subset \mathbb{N}, f \circ A = \prod_i a_i \end{equation}

for a fixed Universal Turing Machine \(U\),

\begin{equation} K_U(f \circ A) \leq K_U(A) + \text{Cst} \end{equation}

where \(K_U(f) = \text{Cst}\).

Now, an integer \(N\) is incompressible if \(K_U(N) \sim \ln N\) so we may deduce the following upper and lower bounds on \(K_U([1,N])\):

\begin{equation} K_U(N!) + \text{Cst} \leq K_U([1,N]) \leq N \cdot \ln N - N \end{equation}

so it follows that for large \(N\):

\begin{equation} K_U([1,N]) \sim N \cdot \ln N - N \end{equation}

A proof using the Symmetric Group \(S_N\):

The reader may observe that the product of an integer-indexed set \(A\) is invariant to permutations of \(A\) so if we define the set:

\begin{equation} \Sigma = \{\sigma \circ [1,N]\}_{\sigma \in S_N} \end{equation}

where \(S_N\) denotes the symmetric group, we must have:

\begin{equation} \forall x,y \in \Sigma, K_U(x) = K_U(y) \end{equation}

and so if we define the random variable \(X\) sampled uniformly from \(\Sigma\), the fact that the expected Kolmogorov Complexity equals its Shannon entropy implies:

\begin{equation} K_U([1,N]) = \frac{1}{|\Sigma|} \sum_{i=1}^{|\Sigma|} K_U(\sigma_i \circ [1,N]) \sim H(X) \end{equation}

where for large \(N\),

\begin{equation} H(X) = \ln |\Sigma| \sim N \cdot \ln N - N \end{equation}

due to Stirling’s log-factorial approximation.


  1. A. N. Kolmogorov. Three approaches to the quantitative definition of information. Problems of Information and Transmission, 1(1):1–7, 1965

  2. Peter Grünwald and Paul Vitányi. Shannon Information and Kolmogorov Complexity. 2010.