# CSE5313 Coding and information theory for data science (Lecture 17)
## Shannon's Coding Theorem
**Shannon's coding theorem**: For a discrete memoryless channel with capacity $C$,
every rate $R < C = \max_{p_X} I(X; Y)$ is achievable, where the maximum is taken over all input distributions $p_X$.
### Computing Channel Capacity
$X$: channel input (per channel use); $Y$: channel output (per channel use).
Let the rate of the code be $\frac{\log_{|F|} |C|}{n}$ (or $\frac{k}{n}$ if the code is linear).
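As a quick sanity check of the rate formula, here is a minimal Python sketch; the binary repetition code used as the codebook is an illustrative assumption, not from the lecture.

```python
import math

# Binary repetition code of length n = 3 over F = {0, 1}:
# C = {000, 111}, so |C| = 2.
C = {"000", "111"}
n = 3
q = 2  # alphabet size |F|

rate = math.log(len(C), q) / n
print(rate)  # 0.333..., i.e. k/n = 1/3 (the code is linear with k = 1)
```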
The binary erasure channel (BEC) is the analog of the BSC in which bits are erased rather than corrupted:
each transmitted bit is replaced by the erasure symbol $*$ independently with probability $\alpha$.
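A minimal simulation sketch of the BEC, assuming `*` denotes the erasure symbol; the function name and interface are illustrative.

```python
import random

def bec(bits, alpha, rng=random):
    # Each input bit is independently erased (replaced by '*')
    # with probability alpha; otherwise it passes through unchanged.
    return ['*' if rng.random() < alpha else b for b in bits]

print(bec([0, 1, 1, 0, 1], alpha=0.3))  # e.g. [0, '*', 1, 0, '*']
```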
### Corollary: The capacity of the BEC is $C = 1 - \alpha$.
<details>
<summary>Proof</summary>
$$
\begin{aligned}
C&=\max_{p_X} I(X;Y)\\
&=\max_{p_X} (H(Y)-H(Y|X))\\
&=\max_{p_X} H(Y)-H(\alpha),
\end{aligned}
$$
since given $X$, the output is erased with probability $\alpha$ and equals $X$ otherwise, so $H(Y|X)=H(\alpha)$ (the binary entropy function).
Denote $\Pr(X=1)\coloneqq p$. Then:
$\Pr(Y=0)=\Pr(X=0)\Pr(\text{no erasure})=(1-p)(1-\alpha)$
$\Pr(Y=1)=\Pr(X=1)\Pr(\text{no erasure})=p(1-\alpha)$
$\Pr(Y=*)=\alpha$
So,
$$
\begin{aligned}
H(Y)&=H((1-p)(1-\alpha),\;p(1-\alpha),\;\alpha)\\
&=-(1-p)(1-\alpha)\log_2 ((1-p)(1-\alpha))-p(1-\alpha)\log_2 (p(1-\alpha))-\alpha\log_2 \alpha\\
&=H(\alpha)+(1-\alpha)H(p)
\end{aligned}
$$
So $I(X;Y)=H(Y)-H(Y|X)=H(\alpha)+(1-\alpha)H(p)-H(\alpha)=(1-\alpha)H(p)$.
So $C=\max_{p_X} I(X;Y)=\max_{p\in [0,1]} (1-\alpha)H(p)=1-\alpha$, with the maximum attained at $p=\frac{1}{2}$, where $H(p)=1$.
So the capacity of the BEC is $C = 1 - \alpha$.
</details>
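As a numerical check of the proof, the sketch below maximizes $(1-\alpha)H(p)$ over a grid of input distributions; the value $\alpha = 0.3$ and the grid resolution are arbitrary choices.

```python
import math

def H(p):
    # Binary entropy function in bits; H(0) = H(1) = 0 by convention.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

alpha = 0.3
# From the proof: I(X;Y) = (1 - alpha) * H(p), with p = Pr(X = 1).
best_p = max((i / 1000 for i in range(1001)),
             key=lambda p: (1 - alpha) * H(p))
print(best_p, (1 - alpha) * H(best_p))  # 0.5, 0.7 = 1 - alpha
```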
### General interpretation of capacity
Recall $I(X;Y)=H(Y)-H(Y|X)=H(X)-H(X|Y)$.
Edge cases (both are computed in the sketch below):
- If $H(X|Y)=0$, the output $Y$ reveals all information about the input $X$.
  - A rate of $R=I(X;Y)=H(X)$ is achievable (the same rate as in information compression).
- If $H(X|Y)=H(X)$, then $Y$ reveals no information about $X$.
  - $R=I(X;Y)=0$: no information is transferred.
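A small sketch computing $I(X;Y)$ for both edge cases from explicit transition matrices; the uniform input and the particular 2×2 channels are illustrative assumptions.

```python
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist if p > 0)

def mutual_information(p_x, W):
    # W[x][y] = Pr(Y = y | X = x); compute I(X;Y) = H(Y) - H(Y|X).
    p_y = [sum(p_x[x] * W[x][y] for x in range(len(p_x)))
           for y in range(len(W[0]))]
    h_y_given_x = sum(p_x[x] * entropy(W[x]) for x in range(len(p_x)))
    return entropy(p_y) - h_y_given_x

p_x = [0.5, 0.5]                    # uniform input, H(X) = 1 bit
noiseless = [[1, 0], [0, 1]]        # Y = X: H(X|Y) = 0, so I = H(X)
useless = [[0.5, 0.5], [0.5, 0.5]]  # Y independent of X: I = 0
print(mutual_information(p_x, noiseless))  # 1.0
print(mutual_information(p_x, useless))    # 0.0
```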
> [!NOTE]
>
> Compression is transmission without noise.
## Side notes for Cryptography
Goal: Quantify the amount of information that is leaked to the eavesdropper.
- Let:
  - $M$ be the message distribution.
  - $Z$ be the ciphertext distribution.
- How much information is leaked about $m$ to the eavesdropper (who sees $\operatorname{Enc}(m)$)?
- Idea: One-time pad.
### One-time pad
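A minimal sketch of the one-time pad over bit strings, checking that the ciphertext distribution is identical for every message, so an eavesdropper seeing $\operatorname{Enc}(m)$ learns nothing; the brute-force enumeration over keys is an illustrative verification, not the lecture's argument.

```python
import itertools

def enc(m, k):
    # One-time pad: Z = M XOR K, bitwise, with K uniform over {0,1}^n.
    return tuple(mi ^ ki for mi, ki in zip(m, k))

n = 3
keys = list(itertools.product([0, 1], repeat=n))
for m in [(0, 0, 0), (1, 0, 1)]:
    counts = {}
    for k in keys:
        z = enc(m, k)
        counts[z] = counts.get(z, 0) + 1
    # Each ciphertext arises from exactly one key, so Z is uniform
    # regardless of m, i.e. I(M; Z) = 0.
    print(m, sorted(counts.values()))  # [1, 1, 1, 1, 1, 1, 1, 1]
```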