Files
NoteNextra-origin/content/CSE5313/CSE5313_L8.md
2025-11-04 12:43:23 -06:00

7.8 KiB
Raw Permalink Blame History

CSE5313 Coding and information theory for data science (Lecture 8)

Review on Linear codes

The code Dimension k (effective message length) Minimum distance d Dual code Minimum distance of dual code
\mathbb{F}^n n 1 \{0\} 0
Parity code n-1 2 Repetition code n
Hamming code 2^m-m-1 3 Punctured Hadamard code 2^{m-1}

More on linear codes

Extended Hamming code

Consider the Hamming code [2^m-1,2^m-m-1,3]_{\mathbb{F}_2}.

Extend it to a cod eof length 2^m by adding a parity bit.

Recall the hamming code [7,4,3]_{2}.


H_{HAM}=
\begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 0 & 1\\
0 & 1 & 0 & 1 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 1 & 1 & 1\\
\end{pmatrix}

The extended Hamming code is:


H_{EXT}=
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 1 & 1 & 0 & 1 & 0\\
0 & 1 & 0 & 1 & 0 & 1 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 0\\
\end{pmatrix}

The minimum distance of the extended Hamming code is 4.

Proof for minimum distance

It is sufficient to show that every 3 columns are linearly independent.

Using the lemma for minimum distance, we have that the minimum distance is 4.

Notice that in \mathbb{F}_2, multiplication is equivalent to AND operation.

By simple observations, we know that every 2 columns are linearly independent.

Consider the following linear combination:


a_1\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}+a_2\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}+a_3\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}=0

Therefore, every 3 columns are linearly independent since the top row will always be 1. if a_1,a_2,a_3 are 1.

Therefore, the minimum distance is 4.

For the dimension of this code, we have k=2^m-m-1, with total code length m.

Augmented Hadamard code

Consider what is generated by the parity check matrix of the extended Hamming code.

Let xH_{EXT}

  • If xH_{EXT}=0, then (0,x_2,\ldots, x_m)H_{HAM}=(y,0), where y is the punctured hadamard codeword.
  • If xH_{EXT}=1, then (1,x_2,\ldots, x_m)H_{HAM}=(1,1,\ldots,1)+(y,0), where y is the punctured hadamard codeword.

Therefore, the code generated by H_{EXT} is given by:

Let \mathcal{C} be the hadamard code.

Let \mathcal{C}+\mathbb{I} be its shift by the all 1's vector (flip all bits of all words).

Augmented Hadamard code = \mathcal{C} \cup \mathcal{C} + \mathbb{I}.

The length of the code is 2^m.

The dimension of the code is m+1.

Since the code is still linear, the minimum distance is the minimum weight of the codewords.

So minimum distance of \mathcal{C}+\mathbb{I} is 2^{m-1}.

Summary for simple linear codes

The code Dimension k (effective message length) Minimum distance d Dual code Minimum distance of dual code
\mathbb{F}^n n 1 \{0\} 0
Parity code n-1 2 Repetition code n
Hamming code 2^m-m-1 (length 2^m-1) 3 Punctured Hadamard code 2^{m-1}
Extended Hamming code 2^m-m-1 (length 2^m) 4 Augmented Hadamard code 2^{m-1}

Boundary of linear codes

Natural questions:

  • Can we extend the table of linear codes infinitely?
  • What set of configuration (n,k,d)_q are impossible?
  • What set of configuration (n,k,d)_q are possible, even if we don't know how to construct them?

Boundary I: Singleton bound

Singleton is a name for the person who discovered this bound.

Theorem: For any linear code \mathcal{C}\subseteq \mathbb{F}^n_q, we have d\leq n-k+1.

Proof

Idea: Using the Pigeonhole principle.

Assume an code [n,k,d]_q exists.

Pigeons: All q^k possible code word of \mathcal{C}.

Holes: All q^\ell values of the first \ell entries of a codeword (for some \ell<n).

If q^\ell<q^k, then by Pigeonhole principle, there exists two codewords in \mathcal{C} that agrees on the first \ell entries.

So d\leq n-\ell.

So the largest \ell value for which this arguments works is n-k+1.

Definition of Maximum Distance Separable (MDS) code

A code \mathcal{C} = [n,k,d]_q with d = n - k + 1 is called a Maximum Distance Separable (MDS) code.

Examples for singleton bound

\mathbb{F}^n: any n, k = n, d = 1.

  • Attains with equality!

Parity: any n, k = n - 1, d = 2.

  • Attains with equality!

Hamming: n = 2^m - 1, k = 2^m - m - 1, d = 3.

n - k + 1 = m + 1 > 3.

This creates some trade-off between k and d.

Boundary II: The Sphere Packing Bound

Let r=\lfloor \frac{d-1}{2}\rfloor, then \sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.

Proof

Let c=(c_1,c_2,\ldots,c_n)\in \mathbb{F}^n_q, and let B(c,r)=\{y\in \mathbb{F}^n_q: d_H(c,y)\leq r\} for some r\leq n.

Computer |B(c,r)|.

|B(c,0)|=1

|\{y\in \mathbb{F}^n_q: d_H(c,y)=1\}|=n(q-1).

|\{y\in \mathbb{F}^n_q: d_H(c,y)=2\}|=\binom{n}{2}(q-1)^2=\frac{n(n-1)}{2}(q-1)^2.

So, |B(c,r)|=\sum_{i=0}^{r}\binom{n}{i}(q-1)^i.

Recall that \mathcal{C} of minimum distance d if and only if \forall c_1,c_2\in \mathcal{C}, B(c_1,\lfloor \frac{d-1}{2}\rfloor)\cap B(c_2,\lfloor \frac{d-1}{2}\rfloor)=\emptyset.

Therefore, let r=\lfloor \frac{d-1}{2}\rfloor, we have \sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.

Definition for perfect code

A code \mathcal{C} is called a perfect code if |C|=q^{n-k}.

Examples for sphere packing bound

Let q=2.

\mathbb{F}_2^n: any n, k = n, d = 1.

  • r = \frac{d-1}{2} = 0.
  • \Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-n} = 1.
  • Attained with equality!

Parity: any n, k = n - 1, d = 2.

  • r = \frac{d-1}{2} = 0.

  • \Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-k} = 2^{n-(n-1)} = 2.

  • q \geq 2 \Rightarrow NOT attained with equality.

Exercise: Equality is attained for the repetition code (dual of parity) for odd n.

Hamming: n = 2^m - 1, k = 2^m - m - 1, d = 3.

  • r = \frac{d-1}{2} = 1.
  • \Rightarrow \sum_{i=0}^{1}\binom{n}{i}(q-1)^i = 1 + (2^{m}-1) = 2^{m}.
  • \Rightarrow q^{n-k} = 2^{m}.
  • Attained with equality!

• Attained with equality!

Fortunately, there are only 4 types of binary linear perfect codes:

  • \mathbb{F}^n
  • Repetition code
  • Hamming code
  • [23,12,7]_2 Golay code

Boundary III: The Gilbert-Varshamov Bound

Let n,k,d,q such that V_q(n-k, d-2)\leq q^{n-k}, then there exists an [n,k,d]_q code.

Singleton, sphere-packing provide necessary conditions for existence of codes.

Are there sufficient conditions?

Recall:

  • Lemma: The minimum distance of \mathcal{C} is the maximum integer such that every d-1 columns of the parity-check matrix H are linearly independent.

Idea:

  • Construct H column by column, ensuring that no dependencies occur.

Idea:

Construct H column by column, ensuring that no dependencies occur.

Algorithm:

  • Begin with (n-k)\times (n-k) identity matrix.
  • Assume we choose columns h_1,h_2,\ldots,h_\ell (each h_i is in \mathbb{F}^n_q)
  • Then next column h_{\ell} must not be in the space of any previous d-2 columns.
    • h_{\ell} cannot be written as [h_1,h_2,\ldots,h_{\ell-1}]x^\top for x of Hamming weight at most d-2.
    • So the ineligible candidates for h_{\ell} is:
      • B_{\ell-1}(0,d-2)=\{x\in \mathbb{F}^{\ell-1}_q: d_H(0,x)\leq d-2\}.
      • |B_{\ell-1}(0,d-2)|=\sum_{i=0}^{d-2}\binom{\ell-1}{i}(q-1)^i, denoted by V_q(\ell-1, d-2).
    • So the candidates for h_{\ell} is:
      • Out of which V_q(\ell-1, d-2) are ineligible.
      • Need n columns overall, so we need V_q(n-k, d-2)\leq q^{n-k}.