# CSE5313 Coding and information theory for data science (Lecture 7)

## Linear codes (continued)

Let $\mathcal{C}$ be an $[n,k,d]_{\mathbb{F}}$ linear code over a finite field $\mathbb{F}=\mathbb{F}_q$.

There are two equivalent ways to describe a linear code:

1. A generator matrix $G\in \mathbb{F}^{k\times n}_q$ with $k$ rows and $n$ columns and entries taken from $\mathbb{F}_q$, whose rows form a basis of the code: $\mathcal{C}=\{xG\mid x\in \mathbb{F}_q^k\}$.
2. A parity check matrix $H\in \mathbb{F}^{(n-k)\times n}_q$ with $(n-k)$ rows and $n$ columns and entries taken from $\mathbb{F}_q$: $\mathcal{C}=\{c\in \mathbb{F}_q^n:Hc^\top=0\}$.
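
As a minimal sketch of the two descriptions, the snippet below takes a small binary $[4,2]$ code and checks that the row space of $G$ equals the null space of $H$. The matrices and the helper names `encode` and `syndrome` are illustrative assumptions, not taken from the lecture.

```python
from itertools import product

# Illustrative [4, 2] binary code in systematic form (an assumption for this sketch).
G = [[1, 0, 1, 1],
     [0, 1, 0, 1]]
H = [[1, 0, 1, 0],
     [1, 1, 0, 1]]

def encode(x, G):
    """Return the codeword xG over F_2."""
    n = len(G[0])
    return tuple(sum(x[i] * G[i][j] for i in range(len(G))) % 2 for j in range(n))

def syndrome(c, H):
    """Return Hc^T over F_2."""
    return tuple(sum(h * ci for h, ci in zip(row, c)) % 2 for row in H)

# Description 1: the code is the set of all xG.
code_from_G = {encode(x, G) for x in product([0, 1], repeat=len(G))}

# Description 2: the code is the set of all c with Hc^T = 0.
code_from_H = {c for c in product([0, 1], repeat=len(H[0]))
               if not any(syndrome(c, H))}

print(code_from_G == code_from_H)  # True
```

Brute-force enumeration is only practical for very small $n$; it is used here because it mirrors the two set definitions directly.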

### Dual code

#### Definition of dual code

The dual code $C^{\perp}$ of $C$ is the set of all vectors in $\mathbb{F}^n$ that are orthogonal to every vector in $C$.

$$
C^{\perp}=\{x\in \mathbb{F}^n:x\cdot c=0\text{ for all }c\in C\}
$$

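Equivalently, in terms of a generator matrix $G$ and a parity check matrix $H$ of $C$:

1. $C^{\perp}=\{x\in \mathbb{F}^n:Gx^\top=0\}$ (it suffices to check orthogonality against a basis of $C$)
2. $C^{\perp}=\{xH\mid x\in \mathbb{F}^{n-k}\}$ (the rows of $H$ span $C^{\perp}$)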

By the rank-nullity theorem, $\dim(C^{\perp})=n-\dim(C)=n-k$.
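
The definition and the dimension count can be checked by brute force on small binary codes. The following is a sketch only: `dual_code` is a hypothetical helper, and the example codes are chosen for illustration. Over $\mathbb{F}_2$, the identity $\dim(C)+\dim(C^{\perp})=n$ is equivalent to $|C|\cdot|C^{\perp}|=2^n$.

```python
from itertools import product

def dual_code(C, n):
    """Brute-force dual over F_2: all x in F_2^n orthogonal to every codeword of C."""
    return {x for x in product([0, 1], repeat=n)
            if all(sum(a * b for a, b in zip(x, c)) % 2 == 0 for c in C)}

# Illustrative [3, 2] binary code: all even-weight words of length 3.
C = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}
C_dual = dual_code(C, 3)
print(sorted(C_dual))                  # [(0, 0, 0), (1, 1, 1)]
print(len(C) * len(C_dual) == 2 ** 3)  # True

# A code can coincide with its own dual.
S = {(0, 0), (1, 1)}
print(dual_code(S, 2) == S)            # True
```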

> [!WARNING]
>
> $C^{\perp}\cap C=\{0\}$ is not always true.
>
> Let $C=\{(0,0),(1,1)\}\subseteq \mathbb{F}_2^2$. Then $C^{\perp}=\{(0,0),(1,1)\}\subseteq \mathbb{F}_2^2$ since $(1,1)\begin{pmatrix} 1\\ 1\end{pmatrix}=0$.

### Example of binary codes

#### Trivial code

Let $\mathbb{F}=\mathbb{F}_2$.

Let $C=\mathbb{F}^n$.

A generator matrix is the $n\times n$ identity matrix.

Since $n-k=0$, the parity check matrix has no rows; a zero matrix imposes the same (vacuous) constraint.

The minimum distance is 1.

#### Parity code

Let $\mathbb{F}=\mathbb{F}_2$.

Let $C=\{(c_1,c_2,\cdots,c_{k},\sum_{i=1}^k c_i): c_1,\ldots,c_k\in \mathbb{F}\}$, a code of length $n=k+1$.

The generator matrix is:

$$
G=\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 1\\
0 & 1 & 0 & \cdots & 0 & 1\\
0 & 0 & 1 & \cdots & 0 & 1\\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \cdots & 1 & 1
\end{pmatrix}
$$

The parity check matrix is:

$$
H=\begin{pmatrix}
1 & 1 & 1 & \cdots & 1 & 1
\end{pmatrix}
$$

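The minimum distance is 2: every codeword has even weight, and each row of $G$ has weight 2.

$C^{\perp}$ is the repetition code $\{(0,\ldots,0),(1,\ldots,1)\}$, which is generated by $H$.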
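
The claims about the parity code can be verified numerically for a fixed $k$. This is a sketch with $k=4$ chosen arbitrarily; `parity_generator` is a hypothetical helper name.

```python
from itertools import product

def parity_generator(k):
    """k x (k+1) generator matrix [I_k | 1] of the binary parity code."""
    return [[1 if i == j else 0 for j in range(k)] + [1] for i in range(k)]

k = 4
G = parity_generator(k)
H = [[1] * (k + 1)]  # single all-ones parity check row

# Every row of G is orthogonal to the row of H over F_2, so G H^T = 0.
orthogonal = all(sum(g * h for g, h in zip(row, H[0])) % 2 == 0 for row in G)
print(orthogonal)  # True

# Enumerate all codewords xG and take the minimum non-zero weight.
codewords = {tuple(sum(x[i] * G[i][j] for i in range(k)) % 2 for j in range(k + 1))
             for x in product([0, 1], repeat=k)}
min_weight = min(sum(c) for c in codewords if any(c))
print(min_weight)  # 2

# The row space of H is {00...0, 11...1}, the repetition code of length k+1.
repetition = {tuple([0] * (k + 1)), tuple(H[0])}
```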

#### Lemma for minimum distance

The minimum distance of $\mathcal{C}$ is the maximum integer $d$ such that every $d-1$ columns of $H$ are linearly independent.

<details>
<summary>Proof</summary>

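Assume the minimum distance is $d$. We show that every $d-1$ columns of $H$ are linearly independent.

- Fact: in a linear code, the minimum distance equals the minimum weight of a non-zero codeword, since $d_H(x,y)=w_H(x-y)$.

Indeed, if some $d-1$ columns of $H$ were linearly dependent, the coefficients of a non-trivial dependency would give a non-zero $c$ with $Hc^\top=0$, that is, a non-zero codeword $c\in \mathcal{C}$ with $w_H(c)\le d-1<d$, a contradiction.

The converse is similar: a codeword of weight $d$ gives $d$ linearly dependent columns of $H$, so $d$ is the maximum such integer.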

</details>
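
Over $\mathbb{F}_2$, a set of columns is linearly dependent exactly when some non-empty subset of it sums to zero, so the lemma says the minimum distance is the size of the smallest set of columns of $H$ that sums to zero. A brute-force sketch of this reading follows; `min_distance_from_H` is a hypothetical helper, and the search is exponential in $n$, so it is only meant for small examples.

```python
from itertools import combinations

def min_distance_from_H(H):
    """Smallest number of columns of H (over F_2) that sum to the zero vector."""
    n = len(H[0])
    cols = [tuple(row[j] for row in H) for j in range(n)]
    for d in range(1, n + 1):
        for subset in combinations(cols, d):
            # zip(*subset) groups the chosen columns coordinate by coordinate.
            if all(sum(bits) % 2 == 0 for bits in zip(*subset)):
                return d
    return None  # all columns are independent, so the code is {0}

# Parity check matrix of the parity code of length 4.
print(min_distance_from_H([[1, 1, 1, 1]]))  # 2
```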

#### The Hamming code

Let $m\in \mathbb{N}$.

Take all $2^m-1$ non-zero vectors in $\mathbb{F}_2^m$.

Use them as the columns of a parity check matrix $H\in \mathbb{F}_2^{m\times (2^m-1)}$.

Example: for $m=3$,

$$
H=\begin{pmatrix}
1 & 0 & 0 & 1 & 1 & 0 & 1\\
0 & 1 & 0 & 1 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 1 & 1 & 1\\
\end{pmatrix}
$$

Minimum distance is 3.

<details>
<summary>Proof for minimum distance</summary>

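We use the lemma for minimum distance. The columns of $H$ are non-zero and pairwise distinct, so over $\mathbb{F}_2$ every two columns are linearly independent, which gives $d\geq 3$. On the other hand, some three columns are linearly dependent (in the example above, the first, second and fourth columns sum to zero), so $d=3$.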

</details>

So the code can correct up to $\lfloor (d-1)/2\rfloor=1$ error.

The length of the code is $n=2^m-1$.

The dimension is $k=n-m=2^m-m-1$.
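
Because the columns of $H$ are distinct and non-zero, a single bit error at position $j$ produces a syndrome equal to column $j$ of $H$, which identifies the error. The sketch below illustrates this for $m=3$. The helper names are hypothetical, and the columns are listed in a different order than in the example above, which gives an equivalent code.

```python
from itertools import product

def hamming_parity_check(m):
    """m x (2^m - 1) matrix whose columns are all non-zero vectors of F_2^m."""
    cols = [v for v in product([0, 1], repeat=m) if any(v)]
    return [[col[i] for col in cols] for i in range(m)]

def syndrome(H, word):
    """Return H * word^T over F_2."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)

def correct_single_error(H, word):
    """Flip the position whose column of H equals the syndrome, if it is non-zero."""
    s = syndrome(H, word)
    fixed = list(word)
    if any(s):
        cols = [tuple(row[j] for row in H) for j in range(len(H[0]))]
        fixed[cols.index(s)] ^= 1
    return fixed

H = hamming_parity_check(3)
codeword = [1, 1, 1, 0, 0, 0, 0]  # the first three columns of this H sum to zero
received = codeword.copy()
received[4] ^= 1                  # introduce one bit error
print(correct_single_error(H, received) == codeword)  # True
```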

#### Hadamard code

Define the code by its encoding function:

$E: \mathbb{F}_2^m\to \mathbb{F}_2^{2^m}$, $E(x)=(xy_1^\top,\cdots,xy_{2^m}^\top)$, where $y_1,\ldots,y_{2^m}$ are all the vectors of $\mathbb{F}_2^m$.

The set of codewords is the image of $E$.

This is a linear code since each coordinate $xy_i^\top$ is a linear function of $x$, so $E(x+x')=E(x)+E(x')$.

If $x_1,x_2,\ldots,x_m$ is a basis of $\mathbb{F}_2^m$, then $E(x_1),E(x_2),\ldots,E(x_m)$ is a basis of $\mathcal{C}$.

So the dimension of $\mathcal{C}$ is $m$.

The minimum distance is $2^{m-1}$.

<details>
<summary>Proof for minimum distance</summary>

Since the code is linear, the minimum distance is the minimum weight of a non-zero codeword.

For each non-zero $x\in \mathbb{F}_2^m$, exactly $2^{m-1}$ of the vectors $y\in \mathbb{F}_2^m$ satisfy $xy^\top=1$, so $w_H(E(x))=2^{m-1}$.

Hence for all non-zero $x$ we have $d_H(E(x),E(0))=2^{m-1}$.

</details>
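
The weight claim is easy to check for a small $m$. In this sketch, `hadamard_encode` is a hypothetical helper that lists the vectors $y$ in lexicographic order; $m=3$ is chosen for illustration.

```python
from itertools import product

def hadamard_encode(x):
    """E(x) = (x . y for every y in F_2^m), with y in lexicographic order."""
    m = len(x)
    return tuple(sum(a * b for a, b in zip(x, y)) % 2
                 for y in product([0, 1], repeat=m))

m = 3
# Collect the weights of all non-zero codewords.
weights = {sum(hadamard_encode(x)) for x in product([0, 1], repeat=m) if any(x)}
print(weights)  # {4}, i.e. 2^(m-1)
```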

One coordinate is redundant: for $y_i=0$ we have $xy_i^\top=0$ for every $x$. Removing it from $E(x)$ gives a punctured Hadamard codeword.

The length of the punctured code is $2^m-1$.

So $E: \mathbb{F}_2^m\to \mathbb{F}_2^{2^m-1}$ defines a linear code.

Its generator matrix is the parity check matrix of the Hamming code.

Hence the dual of the Hamming code is the (punctured) Hadamard code.
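
The duality can be confirmed by brute force for $m=3$. The sketch below builds the Hamming code as the null space of $H$, builds the codewords obtained by dropping the $y=0$ coordinate, and compares the latter with the dual of the former. All names are illustrative assumptions.

```python
from itertools import product

m = 3
ys = [y for y in product([0, 1], repeat=m) if any(y)]  # all non-zero y
n = len(ys)                                            # 2^m - 1 = 7

def dot(u, v):
    """Inner product over F_2."""
    return sum(a * b for a, b in zip(u, v)) % 2

# Codewords (x . y) over the non-zero y only, which is the row space of H.
punctured = {tuple(dot(x, y) for y in ys) for x in product([0, 1], repeat=m)}

# Hamming code: all c with Hc^T = 0, where column j of H is ys[j].
H = [[y[i] for y in ys] for i in range(m)]
hamming = {c for c in product([0, 1], repeat=n)
           if all(dot(row, c) == 0 for row in H)}

# Brute-force dual of the Hamming code.
hamming_dual = {x for x in product([0, 1], repeat=n)
                if all(dot(x, c) == 0 for c in hamming)}

print(punctured == hamming_dual)  # True
```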