Update CSE5313_L26.md

2025-12-02 13:54:12 -06:00
parent 726c6a4851
commit 0b736aa18d
1 changed files with 51 additions and 18 deletions
--- a/content/CSE5313/CSE5313_L26.md
+++ b/content/CSE5313/CSE5313_L26.md
@@ -114,7 +114,7 @@ A codeword is a set of $M$ binary strings, each of length $L$.

 "Sliced channel":

- The message 𝑚 is encoded to $c\in \{0,1\}^{ML}, and then sliced to 𝑀 equal parts.
+- The message $m$ is encoded to $c\in \{0,1\}^{ML}$, and then sliced into $M$ equal parts.
 - Parts may be noisy (substitutions, deletions, etc.).
 - Also useful in network packet transmission ($M$ packets of length $L$).

@@ -127,7 +127,7 @@ How to quantify the **merit** of a given code $\mathcal{C}$?
 Redundancy:

 - Recall in linear codes,
-  - $redundancy=lengt-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
+  - $redundancy=length-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
 - In sliced channel:
  - $redundancy=\log (size\ of\ space)-\log (size\ of\ code)=\log \binom{2^L}{M}-\log |\mathcal{C}|$.

@@ -393,20 +393,53 @@ Tools:

 - Want: Markers not to overlap.
 - Solution: Take markers from a mutually uncorrelated code (existing notion).
-  - A code $\mathcal{M}$ is called mutually uncorrelated if no suffix of any 𝑚𝑖 ∈ ℳ if a prefix of another
-𝑚𝑗 ∈ ℳ (including 𝑖 = 𝑗).
-– Many constructions exist.
-• Theorem: For any integer ℓ there exists a mutually uncorrelated code 𝐶𝑀𝑈 of length
-ℓ and size 𝐶𝑀𝑈 ≥
-2
-ℓ
-32ℓ
+  - A code $\mathcal{M}$ is called mutually uncorrelated if no proper suffix of any $m_i \in \mathcal{M}$ is a prefix of another $m_j \in \mathcal{M}$ (including $i=j$).
+- Many constructions exist.

-Tool: Random encoding.
-• Want: Codewords with many markers from 𝐶𝑀𝑈, that are not too far apart.
-• Problem: Hard to achieve explicitly.
-• Workaround: Show that a uniformly random string has this property.
-• Random encoding:
-– Choose the message at random.
-– Suitable for embedding, say, printer ID.
-– Not suitable for dynamic information.
+Theorem: For any integer $\ell$ there exists a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\ell$ and size $|\mathcal{C}_{MU}|\geq \frac{2^\ell}{32\ell}$.
+
+#### Tool: Random encoding.
+
+- Want: Codewords with many markers from $\mathcal{C}_{MU}$, that are not too far apart.
+- Problem: Hard to achieve explicitly.
+- Workaround: Show that a uniformly random string has this property.
+
+Random encoding:
+
+- Choose the message at random.
+- Suitable for embedding, say, printer ID.
+- Not suitable for dynamic information.
+
+Let $m>0$ be a parameter.
+
+Fix a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\Theta(\log m)$.
+
+Fix $m_1,\ldots, m_t$ from $\mathcal{C}_{MU}$ as "special" markers.
+
+Claim: With probability $1-\frac{1}{\operatorname{poly}(m)}$, a uniformly random string $z\in \{0,1\}^m$ satisfies:
+
+- Every window of $O(\log^2 m)$ consecutive bits contains a marker from $\mathcal{C}_{MU}$.
+- Every two non-overlapping substrings of length $c\log m$ are distinct.
+- $z$ does not contain any of the special markers $m_1,\ldots, m_t$.
+
+Proof idea:
+
+- Short substrings are abundant.
+- Long substrings are rare.
+
+#### Sketch of encoding for t-break codes.
+
+Repeatedly sample $z\in \{0,1\}^m$ until it is "good".
+
+Find all markers $m_{i_1},\ldots, m_{i_r}$ in it.
+
+Build a $|\mathcal{C}_{MU}|\times |\mathcal{C}_{MU}|$ matrix $A$ which records order and distances:
+
+- $A_{i,j}=0$ if $m_i,m_j$ are not adjacent.
+- Otherwise, it is the distance between them (in bits).
+
+Append $RS_{2t}(A)$ at the end, and use the special markers $m_1,\ldots, m_t$.
+
+#### Sketch of decoding for t-break codes.
+
+Construct a partial adjacency matrix $A'$ from fragments.
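
Note on the diff above: the mutually uncorrelated definition and the "find all markers" step of the encoding sketch can be illustrated with a small Python sketch. This is not part of the commit or the lecture notes. The function names `is_mutually_uncorrelated` and `find_markers` and the toy code `["11000", "11010"]` are hypothetical, and the check assumes the definition refers to proper, nonempty overlaps (a full-length overlap of a word with itself is trivial).

```python
def is_mutually_uncorrelated(code):
    """Return True if no proper suffix of any word equals a prefix of any word.

    Assumption: only overlaps of length 1 .. min(len) - 1 are tested,
    and a word is also compared against itself (the i = j case).
    """
    for a in code:
        for b in code:
            for k in range(1, min(len(a), len(b))):
                if a[-k:] == b[:k]:
                    return False
    return True


def find_markers(z, code):
    """Return sorted (position, marker) pairs for every occurrence of a marker in z."""
    hits = []
    for marker in code:
        start = z.find(marker)
        while start != -1:
            hits.append((start, marker))
            start = z.find(marker, start + 1)
    return sorted(hits)


toy_code = ["11000", "11010"]   # hypothetical toy markers, length 5
z = "0110001101011000"          # hand-picked string, not a random sample
markers = find_markers(z, toy_code)
```

Because the toy code is mutually uncorrelated, occurrences found by `find_markers` cannot overlap one another, which is the property the notes want from markers. The single word `"1010"` fails the check, since its suffix `10` is also its prefix.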