7.8 KiB
CSE5313 Coding and information theory for data science (Lecture 8)
Review on Linear codes
| The code | Dimension k (effective message length) |
Minimum distance d |
Dual code | Minimum distance of dual code |
|---|---|---|---|---|
\mathbb{F}^n |
n |
1 |
\{0\} |
0 |
| Parity code | n-1 |
2 |
Repetition code | n |
| Hamming code | 2^m-m-1 |
3 |
Punctured Hadamard code | 2^{m-1} |
More on linear codes
Extended Hamming code
Consider the Hamming code [2^m-1,2^m-m-1,3]_{\mathbb{F}_2}.
Extend it to a cod eof length 2^m by adding a parity bit.
Recall the hamming code [7,4,3]_{2}.
H_{HAM}=
\begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 0 & 1\\
0 & 1 & 0 & 1 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 1 & 1 & 1\\
\end{pmatrix}
The extended Hamming code is:
H_{EXT}=
\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\
1 & 0 & 0 & 1 & 1 & 0 & 1 & 0\\
0 & 1 & 0 & 1 & 0 & 1 & 1 & 0\\
0 & 0 & 1 & 0 & 1 & 1 & 1 & 0\\
\end{pmatrix}
The minimum distance of the extended Hamming code is 4.
Proof for minimum distance
It is sufficient to show that every 3 columns are linearly independent.
Using the lemma for minimum distance, we have that the minimum distance is 4.
Notice that in \mathbb{F}_2, multiplication is equivalent to AND operation.
By simple observations, we know that every 2 columns are linearly independent.
Consider the following linear combination:
a_1\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}+a_2\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}+a_3\begin{pmatrix}
1\\
\vdots\\
\end{pmatrix}=0
Therefore, every 3 columns are linearly independent since the top row will always be 1. if a_1,a_2,a_3 are 1.
Therefore, the minimum distance is 4.
For the dimension of this code, we have k=2^m-m-1, with total code length m.
Augmented Hadamard code
Consider what is generated by the parity check matrix of the extended Hamming code.
Let xH_{EXT}
- If
xH_{EXT}=0, then(0,x_2,\ldots, x_m)H_{HAM}=(y,0), whereyis the punctured hadamard codeword. - If
xH_{EXT}=1, then(1,x_2,\ldots, x_m)H_{HAM}=(1,1,\ldots,1)+(y,0), whereyis the punctured hadamard codeword.
Therefore, the code generated by H_{EXT} is given by:
Let \mathcal{C} be the hadamard code.
Let \mathcal{C}+\mathbb{I} be its shift by the all 1's vector (flip all bits of all words).
– Augmented Hadamard code = \mathcal{C} \cup \mathcal{C} + \mathbb{I}.
The length of the code is 2^m.
The dimension of the code is m+1.
Since the code is still linear, the minimum distance is the minimum weight of the codewords.
So minimum distance of \mathcal{C}+\mathbb{I} is 2^{m-1}.
Summary for simple linear codes
| The code | Dimension k (effective message length) |
Minimum distance d |
Dual code | Minimum distance of dual code |
|---|---|---|---|---|
\mathbb{F}^n |
n |
1 |
\{0\} |
0 |
| Parity code | n-1 |
2 |
Repetition code | n |
| Hamming code | 2^m-m-1 (length 2^m-1) |
3 |
Punctured Hadamard code | 2^{m-1} |
| Extended Hamming code | 2^m-m-1 (length 2^m) |
4 |
Augmented Hadamard code | 2^{m-1} |
Boundary of linear codes
Natural questions:
- Can we extend the table of linear codes infinitely?
- What set of configuration
(n,k,d)_qare impossible? - What set of configuration
(n,k,d)_qare possible, even if we don't know how to construct them?
Boundary I: Singleton bound
Singleton is a name for the person who discovered this bound.
Theorem: For any linear code \mathcal{C}\subseteq \mathbb{F}^n_q, we have d\leq n-k+1.
Proof
Idea: Using the Pigeonhole principle.
Assume an code [n,k,d]_q exists.
Pigeons: All q^k possible code word of \mathcal{C}.
Holes: All q^\ell values of the first \ell entries of a codeword (for some \ell<n).
If q^\ell<q^k, then by Pigeonhole principle, there exists two codewords in \mathcal{C} that agrees on the first \ell entries.
So d\leq n-\ell.
So the largest \ell value for which this arguments works is n-k+1.
Definition of Maximum Distance Separable (MDS) code
A code \mathcal{C} = [n,k,d]_q with d = n - k + 1 is called a Maximum Distance Separable (MDS) code.
Examples for singleton bound
\mathbb{F}^n: any n, k = n, d = 1.
- Attains with equality!
Parity: any n, k = n - 1, d = 2.
- Attains with equality!
Hamming: n = 2^m - 1, k = 2^m - m - 1, d = 3.
n - k + 1 = m + 1 > 3.
This creates some trade-off between k and d.
Boundary II: The Sphere Packing Bound
Let r=\lfloor \frac{d-1}{2}\rfloor, then \sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.
Proof
Let c=(c_1,c_2,\ldots,c_n)\in \mathbb{F}^n_q, and let B(c,r)=\{y\in \mathbb{F}^n_q: d_H(c,y)\leq r\} for some r\leq n.
Computer |B(c,r)|.
|B(c,0)|=1
|\{y\in \mathbb{F}^n_q: d_H(c,y)=1\}|=n(q-1).
|\{y\in \mathbb{F}^n_q: d_H(c,y)=2\}|=\binom{n}{2}(q-1)^2=\frac{n(n-1)}{2}(q-1)^2.
So, |B(c,r)|=\sum_{i=0}^{r}\binom{n}{i}(q-1)^i.
Recall that \mathcal{C} of minimum distance d if and only if \forall c_1,c_2\in \mathcal{C}, B(c_1,\lfloor \frac{d-1}{2}\rfloor)\cap B(c_2,\lfloor \frac{d-1}{2}\rfloor)=\emptyset.
Therefore, let r=\lfloor \frac{d-1}{2}\rfloor, we have \sum_{i=0}^{r}\binom{n}{i}(q-1)^i\leq q^{n-k}.
Definition for perfect code
A code \mathcal{C} is called a perfect code if |C|=q^{n-k}.
Examples for sphere packing bound
Let q=2.
\mathbb{F}_2^n: any n, k = n, d = 1.
r = \frac{d-1}{2} = 0.\Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-n} = 1.- Attained with equality!
Parity: any n, k = n - 1, d = 2.
-
r = \frac{d-1}{2} = 0. -
\Rightarrow \sum_{i=0}^{0}\binom{n}{i}(q-1)^i = 1 \leq q^{n-k} = 2^{n-k} = 2^{n-(n-1)} = 2. -
q \geq 2 \RightarrowNOT attained with equality.
Exercise: Equality is attained for the repetition code (dual of parity) for odd n.
Hamming: n = 2^m - 1, k = 2^m - m - 1, d = 3.
r = \frac{d-1}{2} = 1.\Rightarrow \sum_{i=0}^{1}\binom{n}{i}(q-1)^i = 1 + (2^{m}-1) = 2^{m}.\Rightarrow q^{n-k} = 2^{m}.- Attained with equality!
• Attained with equality!
Fortunately, there are only 4 types of binary linear perfect codes:
\mathbb{F}^n- Repetition code
- Hamming code
[23,12,7]_2Golay code
Boundary III: The Gilbert-Varshamov Bound
Let n,k,d,q such that V_q(n-k, d-2)\leq q^{n-k}, then there exists an [n,k,d]_q code.
Singleton, sphere-packing provide necessary conditions for existence of codes.
Are there sufficient conditions?
Recall:
- Lemma: The minimum distance of
\mathcal{C}is the maximum integer such that everyd-1columns of the parity-check matrixHare linearly independent.Idea:
- Construct
Hcolumn by column, ensuring that no dependencies occur.
Idea:
Construct H column by column, ensuring that no dependencies occur.
Algorithm:
- Begin with
(n-k)\times (n-k)identity matrix. - Assume we choose columns
h_1,h_2,\ldots,h_\ell(eachh_iis in\mathbb{F}^n_q) - Then next column
h_{\ell}must not be in the space of any previousd-2columns.h_{\ell}cannot be written as[h_1,h_2,\ldots,h_{\ell-1}]x^\topforxof Hamming weight at mostd-2.- So the ineligible candidates for
h_{\ell}is:B_{\ell-1}(0,d-2)=\{x\in \mathbb{F}^{\ell-1}_q: d_H(0,x)\leq d-2\}.|B_{\ell-1}(0,d-2)|=\sum_{i=0}^{d-2}\binom{\ell-1}{i}(q-1)^i, denoted byV_q(\ell-1, d-2).
- So the candidates for
h_{\ell}is:- Out of which
V_q(\ell-1, d-2)are ineligible. - Need
ncolumns overall, so we needV_q(n-k, d-2)\leq q^{n-k}.
- Out of which