Update CSE5313_L26.md

Trance-0
2025-12-02 13:54:12 -06:00
parent 726c6a4851
commit 0b736aa18d


@@ -114,7 +114,7 @@ A codeword is a set of $M$ binary strings, each of length $L$.
"Sliced channel":
- The message 𝑚 is encoded to $c\in \{0,1\}^{ML}, and then sliced to 𝑀 equal parts.
- The message $m$ is encoded to $c\in \{0,1\}^{ML}$, and then sliced into $M$ equal parts.
- Parts may be noisy (substitutions, deletions, etc.).
- Also useful in network packet transmission ($M$ packets of length $L$).
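
A minimal sketch of the slicing step, assuming the codeword is given as a Python string of `0`/`1` characters; the function name `slice_codeword` is hypothetical and not part of the notes:

```python
def slice_codeword(c: str, M: int) -> list[str]:
    """Cut a codeword of length M*L into M consecutive parts of length L."""
    if M <= 0 or len(c) % M != 0:
        raise ValueError("codeword length must be a positive multiple of M")
    L = len(c) // M
    # Part i covers bits i*L .. (i+1)*L - 1 of the codeword.
    return [c[i * L:(i + 1) * L] for i in range(M)]


parts = slice_codeword("00011011", 4)  # M = 4 parts, each of length L = 2
```

The list keeps the parts in their original order; this sketch does not model the noise (substitutions, deletions) mentioned above.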
@@ -127,7 +127,7 @@ How to quantify the **merit** of a given code $\mathcal{C}$?
Redundancy:
- Recall in linear codes,
- $redundancy=lengt-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
- $redundancy=length-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
- In sliced channel:
- $redundancy=\log (size\ of\ space)-\log (size\ of\ code)=\log \binom{2^L}{M}-\log |\mathcal{C}|$.
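
As a small numeric illustration of the sliced-channel formula, the sketch below evaluates $\log \binom{2^L}{M}-\log |\mathcal{C}|$ directly. It assumes logarithms are base 2 (redundancy in bits); the function name `sliced_redundancy` is hypothetical.

```python
import math


def sliced_redundancy(L: int, M: int, code_size: int) -> float:
    """Redundancy in bits: log2 C(2^L, M) - log2 |C| (base 2 is an assumption)."""
    # Size of the space: all sets of M distinct binary strings of length L.
    space_size = math.comb(2 ** L, M)
    return math.log2(space_size) - math.log2(code_size)


# With L = 3 and M = 2 the space has C(8, 2) = 28 elements.
r = sliced_redundancy(3, 2, 7)
```

A code that fills the whole space has zero redundancy, and each halving of $|\mathcal{C}|$ adds one bit.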
@@ -393,20 +393,53 @@ Tools:
- Want: Markers not to overlap.
- Solution: Take markers from a Mutually Uncorrelated Codes (existing notion).
- A code $\mathcal{M}$ is called mutually uncorrelated if no suffix of any 𝑚𝑖 ∈ if a prefix of another
𝑚𝑗 ∈ (including 𝑖 = 𝑗).
Many constructions exist.
• Theorem: For any integer there exists a mutually uncorrelated code 𝐶𝑀𝑈 of length
and size 𝐶𝑀𝑈
2
32
- A code $\mathcal{M}$ is called mutually uncorrelated if no proper suffix of any $m_i \in \mathcal{M}$ is a prefix of another $m_j \in \mathcal{M}$ (including $i=j$).
- Many constructions exist.
Tool: Random encoding.
• Want: Codewords with many markers from 𝐶𝑀𝑈, that are not too far apart.
• Problem: Hard to achieve explicitly.
• Workaround: Show that a uniformly random string has this property.
• Random encoding:
Choose the message at random.
Suitable for embedding, say, printer ID.
Not suitable for dynamic information.
Theorem: For any integer $\ell$ there exists a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\ell$ and size $|\mathcal{C}_{MU}|\geq \frac{2^\ell}{32\ell}$.
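
The mutually uncorrelated property can be checked by brute force. The sketch below is only an illustration: it reads "suffix" as a proper suffix (the full word is trivially both a suffix and a prefix of itself), and the function name `is_mutually_uncorrelated` is hypothetical.

```python
def is_mutually_uncorrelated(code: list[str]) -> bool:
    """True if no proper suffix of any codeword is a prefix of any codeword."""
    for a in code:
        for b in code:  # includes the case a == b
            # Compare suffixes of a with prefixes of b for every proper length k.
            for k in range(1, min(len(a), len(b))):
                if a[-k:] == b[:k]:
                    return False
    return True


good = is_mutually_uncorrelated(["00101", "00111"])
bad = is_mutually_uncorrelated(["1010"])  # suffix "10" equals prefix "10"
```

The first example follows a common pattern (a run of zeros at the start, a `1` at the end, and no equally long run of zeros later on), which prevents any suffix from lining up with a prefix.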
#### Tool: Random encoding.
- Want: Codewords with many markers from $\mathcal{C}_{MU}$, that are not too far apart.
- Problem: Hard to achieve explicitly.
- Workaround: Show that a uniformly random string has this property.
Random encoding:
- Choose the message at random.
- Suitable for embedding, say, printer ID.
- Not suitable for dynamic information.
Let $m>0$ be a parameter.
Fix a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\Theta(\log m)$.
Fix $m_1,\ldots, m_t$ from $\mathcal{C}_{MU}$ as "special" markers.
Claim: With probability $1-\frac{1}{\mathrm{poly}(m)}$, a uniformly random string $z\in \{0,1\}^m$ satisfies:
- Every window of $O(\log^2 m)$ consecutive bits contains a marker from $\mathcal{C}_{MU}$.
- Every two non-overlapping substrings of length $c\log m$ are distinct.
- $z$ does not contain any of the special markers $m_1,\ldots, m_t$.
Proof idea:
- Short substrings are abundant.
- Long substrings are rare.
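
For the distinctness property, the "long substrings are rare" idea can be made concrete with a union bound. This is a sketch under assumptions not stated in the notes: logarithms are base 2 and the constant satisfies $c>2$. Two fixed non-overlapping windows of length $c\log m$ consist of independent uniform bits, so they coincide with probability $2^{-c\log m}=m^{-c}$, and there are at most $\binom{m}{2}$ pairs of starting positions:

$$
\Pr[\text{some two non-overlapping windows of length } c\log m \text{ are equal}] \le \binom{m}{2}\cdot 2^{-c\log m} \le m^{2}\cdot m^{-c} = m^{2-c}.
$$

For $c>2$ this is $\frac{1}{\mathrm{poly}(m)}$, matching the form of the claim.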
#### Sketch of encoding for t-break codes.
Repeatedly sample $z\in \{0,1\}^m$ until it is "good".
Find all markers $m_{i_1},\ldots, m_{i_r}$ in it.
Build a $|\mathcal{C}_{MU}|\times |\mathcal{C}_{MU}|$ matrix $A$ which records order and distances:
- $A_{i,j}=0$ if $m_i,m_j$ are not adjacent.
- Otherwise, it is the distance between them (in bits).
Append $RS_{2t}(A)$ at the end, and use the special markers $m_1,\ldots, m_t$.
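
The marker-finding and matrix-building steps can be sketched as follows. This is an illustration under assumptions that the notes do not spell out: "distance" is taken to be the gap between the start positions of two consecutive markers, and each marker is assumed to occur at most once in $z$ (otherwise later occurrences overwrite earlier entries). The names `find_markers` and `adjacency_matrix` are hypothetical, and the Reed-Solomon step $RS_{2t}(A)$ is not shown.

```python
def find_markers(z: str, markers: list[str]) -> list[tuple[int, int]]:
    """Return (start position, marker index) for each occurrence, sorted by position."""
    hits = []
    for idx, marker in enumerate(markers):
        start = z.find(marker)
        while start != -1:
            hits.append((start, idx))
            start = z.find(marker, start + 1)
    return sorted(hits)


def adjacency_matrix(z: str, markers: list[str]) -> list[list[int]]:
    """A[i][j] = distance in bits from marker i to the next marker j, else 0."""
    n = len(markers)
    A = [[0] * n for _ in range(n)]
    hits = find_markers(z, markers)
    # Consecutive hits are adjacent markers; record the gap between their starts.
    for (p, i), (q, j) in zip(hits, hits[1:]):
        A[i][j] = q - p
    return A


# Marker 0 starts at bit 0 and marker 1 starts at bit 7.
A = adjacency_matrix("001011100111", ["00101", "00111"])
```

Because the markers come from a mutually uncorrelated code, two occurrences cannot overlap, so sorting by start position gives a well-defined left-to-right order.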
#### Sketch of decoding for t-break codes.
Construct a partial adjacency matrix $A'$ from fragments.