diff --git a/content/CSE5313/CSE5313_L26.md b/content/CSE5313/CSE5313_L26.md
index 54dfddb..a075a4f 100644
--- a/content/CSE5313/CSE5313_L26.md
+++ b/content/CSE5313/CSE5313_L26.md
@@ -114,7 +114,7 @@ A codeword is a set of $M$ binary strings, each of length $L$.
 
 "Sliced channel":
 
-- The message 𝑚 is encoded to $c\in \{0,1\}^{ML}, and then sliced to 𝑀 equal parts.
+- The message $m$ is encoded to $c\in \{0,1\}^{ML}$, and then sliced into $M$ equal parts.
 - Parts may be noisy (substitutions, deletions, etc.).
 - Also useful in network packet transmission ($M$ packets of length $L$).
 
@@ -127,7 +127,7 @@ How to quantify the **merit** of a given code $\mathcal{C}$?
 Redundance:
 
 - Recall in linear codes,
-  - $redundancy=lengt-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
+  - $redundancy=length-dimension=\log (size\ of\ space)-\log (size\ of\ code)$.
 - In sliced channel:
   - $redundancy=\log (size\ of\ space)-\log (size\ of\ code)=\log \binom{2^L}{M}-\log |\mathcal{C}|$.
 
@@ -393,20 +393,53 @@ Tools:
 
 - Want: Markers not to overlap.
 - Solution: Take markers from a Mutually Uncorrelated Codes (existing notion).
-  - A code $\mathcal{M}$ is called mutually uncorrelated if no suffix of any 𝑚𝑖 ∈ ℳ if a prefix of another
-𝑚𝑗 ∈ ℳ (including 𝑖 = 𝑗).
-– Many constructions exist.
-• Theorem: For any integer ℓ there exists a mutually uncorrelated code 𝐶𝑀𝑈 of length
-ℓ and size 𝐶𝑀𝑈 ≥
-2
-ℓ
-32ℓ
+  - A code $\mathcal{M}$ is called mutually uncorrelated if no proper suffix of any $m_i \in \mathcal{M}$ is a prefix of any $m_j \in \mathcal{M}$ (including $i=j$).
+- Many constructions exist.
 
-Tool: Random encoding.
-• Want: Codewords with many markers from 𝐶𝑀𝑈, that are not too far apart.
-• Problem: Hard to achieve explicitly.
-• Workaround: Show that a uniformly random string has this property.
-• Random encoding:
-– Choose the message at random.
-– Suitable for embedding, say, printer ID.
-– Not suitable for dynamic information.
\ No newline at end of file
+Theorem: For any integer $\ell$ there exists a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\ell$ and size $|\mathcal{C}_{MU}|\geq \frac{2^\ell}{32\ell}$.
+
+#### Tool: Random encoding.
+
+- Want: Codewords with many markers from $\mathcal{C}_{MU}$ that are not too far apart.
+- Problem: Hard to achieve explicitly.
+- Workaround: Show that a uniformly random string has this property.
+
+Random encoding:
+
+- Choose the message at random.
+- Suitable for embedding, say, printer ID.
+- Not suitable for dynamic information.
+
+Let $m>0$ be a parameter.
+
+Fix a mutually uncorrelated code $\mathcal{C}_{MU}$ of length $\Theta(\log m)$.
+
+Fix $m_1,\ldots, m_t$ from $\mathcal{C}_{MU}$ as "special" markers.
+
+Claim: With probability $1-\frac{1}{\operatorname{poly}(m)}$, a uniformly random string $z\in \{0,1\}^m$ satisfies:
+
+- Every window of $O(\log^2 m)$ consecutive bits contains a marker from $\mathcal{C}_{MU}$.
+- Every two non-overlapping substrings of length $c\log m$ are distinct.
+- $z$ does not contain any of the special markers $m_1,\ldots, m_t$.
+
+Proof idea:
+
+- Short substrings are abundant.
+- Long substrings are rare.
+
+#### Sketch of encoding for t-break codes.
+
+Repeatedly sample $z\in \{0,1\}^m$ until it is "good".
+
+Find all markers $m_{i_1},\ldots, m_{i_r}$ in it.
+
+Build a $|\mathcal{C}_{MU}|\times |\mathcal{C}_{MU}|$ matrix $A$ which records order and distances:
+
+- $A_{i,j}=0$ if $m_i,m_j$ are not adjacent.
+- Otherwise, it is the distance between them (in bits).
+
+Append $RS_{2t}(A)$ at the end, and use the special markers $m_1,\ldots, m_t$.
+
+#### Sketch of decoding for t-break codes.
+
+Construct a partial adjacency matrix $A'$ from fragments.
\ No newline at end of file