diff --git a/content/CSE510/CSE510_L11.md b/content/CSE510/CSE510_L11.md
index dcf08ea..4cbfe33 100644
--- a/content/CSE510/CSE510_L11.md
+++ b/content/CSE510/CSE510_L11.md
@@ -198,20 +198,20 @@ $$

Take the softmax policy as an example:

-Weight actions using the linear combination of features $\phi(s,a)^T\theta$:
+Weight actions using the linear combination of features $\phi(s,a)^\top\theta$:

Probability of action is proportional to the exponentiated weights:

$$
-\pi_\theta(s,a) \propto \exp(\phi(s,a)^T\theta)
+\pi_\theta(s,a) \propto \exp(\phi(s,a)^\top\theta)
$$

The score function is

$$
\begin{aligned}
-\nabla_\theta \ln\left[\frac{\exp(\phi(s,a)^T\theta)}{\sum_{a'\in A}\exp(\phi(s,a')^T\theta)}\right] &= \nabla_\theta(\ln \exp(\phi(s,a)^T\theta) - (\ln \sum_{a'\in A}\exp(\phi(s,a')^T\theta))) \\
-&= \nabla_\theta\left(\phi(s,a)^T\theta -\frac{\phi(s,a)\sum_{a'\in A}\exp(\phi(s,a')^T\theta)}{\sum_{a'\in A}\exp(\phi(s,a')^T\theta)}\right) \\
+\nabla_\theta \ln\left[\frac{\exp(\phi(s,a)^\top\theta)}{\sum_{a'\in A}\exp(\phi(s,a')^\top\theta)}\right] &= \nabla_\theta(\ln \exp(\phi(s,a)^\top\theta) - (\ln \sum_{a'\in A}\exp(\phi(s,a')^\top\theta))) \\
+&= \nabla_\theta\left(\phi(s,a)^\top\theta\right) -\frac{\sum_{a'\in A}\phi(s,a')\exp(\phi(s,a')^\top\theta)}{\sum_{a'\in A}\exp(\phi(s,a')^\top\theta)} \\
&=\phi(s,a) - \sum_{a'\in A} \pi_\theta(s,a') \phi(s,a') \\
&= \phi(s,a) - \mathbb{E}_{a'\sim \pi_\theta(s,a')}[\phi(s,a')]
\end{aligned}
$$

@@ -221,7 +221,7 @@ $$

In continuous action spaces, a Gaussian policy is natural

-Mean is a linear combination of state features $\mu(s) = \phi(s)^T\theta$
+Mean is a linear combination of state features $\mu(s) = \phi(s)^\top\theta$

Variance may be fixed $\sigma^2$, or can also be parametrized

diff --git a/content/CSE510/CSE510_L12.md b/content/CSE510/CSE510_L12.md
index 19c76c5..1e3454c 100644
--- a/content/CSE510/CSE510_L12.md
+++ b/content/CSE510/CSE510_L12.md
@@ -53,7 +53,7 @@ Action-Value Actor-Critic

- Simple actor-critic
algorithm based on action-value critic -- Using linear value function approximation $Q_w(s,a)=\phi(s,a)^T w$ +- Using linear value function approximation $Q_w(s,a)=\phi(s,a)^\top w$ Critic: updates $w$ by linear $TD(0)$ Actor: updates $\theta$ by policy gradient diff --git a/content/CSE510/CSE510_L13.md b/content/CSE510/CSE510_L13.md index 43ef0f9..b2f0c84 100644 --- a/content/CSE510/CSE510_L13.md +++ b/content/CSE510/CSE510_L13.md @@ -193,7 +193,7 @@ $$ Make linear approximation to $L_{\pi_{\theta_{old}}}$ and quadratic approximation to KL term. -Maximize $g\cdot(\theta-\theta_{old})-\frac{\beta}{2}(\theta-\theta_{old})^T F(\theta-\theta_{old})$ +Maximize $g\cdot(\theta-\theta_{old})-\frac{\beta}{2}(\theta-\theta_{old})^\top F(\theta-\theta_{old})$ where $g=\frac{\partial}{\partial \theta}L_{\pi_{\theta_{old}}}(\pi_{\theta})\vert_{\theta=\theta_{old}}$ and $F=\frac{\partial^2}{\partial \theta^2}\overline{KL}_{\pi_{\theta_{old}}}(\pi_{\theta})\vert_{\theta=\theta_{old}}$ @@ -201,7 +201,7 @@ where $g=\frac{\partial}{\partial \theta}L_{\pi_{\theta_{old}}}(\pi_{\theta})\ve Taylor Expansion of KL Term $$ -D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\approx D_{KL}(\pi_{\theta_{old}}|\pi_{\theta_{old}})+d^T \nabla_\theta D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\vert_{\theta=\theta_{old}}+\frac{1}{2}d^T \nabla_\theta^2 D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\vert_{\theta=\theta_{old}}d +D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\approx D_{KL}(\pi_{\theta_{old}}|\pi_{\theta_{old}})+d^\top \nabla_\theta D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\vert_{\theta=\theta_{old}}+\frac{1}{2}d^\top \nabla_\theta^2 D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\vert_{\theta=\theta_{old}}d $$ $$ @@ -220,9 +220,9 @@ $$ \begin{aligned} \nabla_\theta^2 D_{KL}(\pi_{\theta_{old}}|\pi_{\theta})\vert_{\theta=\theta_{old}}&=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\nabla_\theta^2 \log P_\theta(x)\vert_{\theta=\theta_{old}}\\ &=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\nabla_\theta \left(\frac{\nabla_\theta 
P_\theta(x)}{P_\theta(x)}\right)\vert_{\theta=\theta_{old}}\\
-&=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\frac{\nabla_\theta^2 P_\theta(x)-\nabla_\theta P_\theta(x)\nabla_\theta P_\theta(x)^T}{P_\theta(x)^2}\right)\vert_{\theta=\theta_{old}}\\
-&=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\frac{\nabla_\theta^2 P_\theta(x)\vert_{\theta=\theta_{old}}}P_{\theta_{old}}(x)\right)+\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\nabla_\theta \log P_\theta(x)\nabla_\theta \log P_\theta(x)^T\right)\vert_{\theta=\theta_{old}}\\
-&=\mathbb{E}_{x\sim \pi_{\theta_{old}}}\nabla_\theta\log P_\theta(x)\nabla_\theta\log P_\theta(x)^T\vert_{\theta=\theta_{old}}\\
+&=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\frac{\nabla_\theta^2 P_\theta(x)P_\theta(x)-\nabla_\theta P_\theta(x)\nabla_\theta P_\theta(x)^\top}{P_\theta(x)^2}\right)\vert_{\theta=\theta_{old}}\\
+&=-\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\frac{\nabla_\theta^2 P_\theta(x)\vert_{\theta=\theta_{old}}}{P_{\theta_{old}}(x)}\right)+\mathbb{E}_{x\sim \pi_{\theta_{old}}}\left(\nabla_\theta \log P_\theta(x)\nabla_\theta \log P_\theta(x)^\top\right)\vert_{\theta=\theta_{old}}\\
+&=\mathbb{E}_{x\sim \pi_{\theta_{old}}}\nabla_\theta\log P_\theta(x)\nabla_\theta\log P_\theta(x)^\top\vert_{\theta=\theta_{old}}\\
\end{aligned}
$$

diff --git a/content/CSE510/CSE510_L14.md b/content/CSE510/CSE510_L14.md
index 96263a9..50e52ce 100644
--- a/content/CSE510/CSE510_L14.md
+++ b/content/CSE510/CSE510_L14.md
@@ -27,7 +27,7 @@ $\theta_{new}=\theta_{old}+d$

First order Taylor expansion for the loss and second order for the KL:

$$
-\approx \arg\max_{d} J(\theta_{old})+\nabla_\theta J(\theta)\mid_{\theta=\theta_{old}}d-\frac{1}{2}\lambda(d^T\nabla_\theta^2 D_{KL}\left[\pi_{\theta_{old}}||\pi_{\theta}\right]\mid_{\theta=\theta_{old}}d)+\lambda \delta
+\approx \arg\max_{d} J(\theta_{old})+\nabla_\theta J(\theta)\mid_{\theta=\theta_{old}}d-\frac{1}{2}\lambda(d^\top\nabla_\theta^2
D_{KL}\left[\pi_{\theta_{old}}||\pi_{\theta}\right]\mid_{\theta=\theta_{old}}d)+\lambda \delta
$$

If you are really interested, try to fill in the section on solving the KL-constrained problem.

@@ -38,7 +38,7 @@ Setting the gradient to zero:

$$
\begin{aligned}
-0&=\frac{\partial}{\partial d}\left(-\nabla_\theta J(\theta)\mid_{\theta=\theta_{old}}d+\frac{1}{2}\lambda(d^T F(\theta_{old})d\right)\\
+0&=\frac{\partial}{\partial d}\left(-\nabla_\theta J(\theta)\mid_{\theta=\theta_{old}}d+\frac{1}{2}\lambda d^\top F(\theta_{old})d\right)\\
&=-\nabla_\theta J(\theta)\mid_{\theta=\theta_{old}}+\frac{1}{2}\lambda F(\theta_{old})d
\end{aligned}
$$

@@ -58,15 +58,15 @@ $$
$$

$$
-D_{KL}(\pi_{\theta_{old}}||\pi_{\theta})\approx \frac{1}{2}(\theta-\theta_{old})^T F(\theta_{old})(\theta-\theta_{old})
+D_{KL}(\pi_{\theta_{old}}||\pi_{\theta})\approx \frac{1}{2}(\theta-\theta_{old})^\top F(\theta_{old})(\theta-\theta_{old})
$$

$$
-\frac{1}{2}(\alpha g_N)^T F(\alpha g_N)=\delta
+\frac{1}{2}(\alpha g_N)^\top F(\alpha g_N)=\delta
$$

$$
-\alpha=\sqrt{\frac{2\delta}{g_N^T F g_N}}
+\alpha=\sqrt{\frac{2\delta}{g_N^\top F g_N}}
$$

However, due to the quadratic approximation, the KL constraint may be violated.

diff --git a/content/CSE510/CSE510_L18.md b/content/CSE510/CSE510_L18.md
index 3f4ef74..61d0033 100644
--- a/content/CSE510/CSE510_L18.md
+++ b/content/CSE510/CSE510_L18.md
@@ -16,7 +16,7 @@ So we can learn $f(s_t,a_t)$ from data, and _then_ plan through it.

Model-based reinforcement learning version **0.5**:

-1. Run base polity $\pi_0$ (e.g. random policy) to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
+1. Run base policy $\pi_0$ (e.g. random policy) to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
2. Learn dynamics model $f(s_t,a_t)$ to minimize $\sum_{i}\|f(s_i,a_i)-s_{i+1}\|^2$
3. Plan through $f(s_t,a_t)$ to choose action $a_t$

@@ -52,10 +52,10 @@ Version 2.0: backpropagate directly into policy

Final version:

-1. Run base polity $\pi_0$ (e.g.
random policy) to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
+1. Run base policy $\pi_0$ (e.g. random policy) to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
2. Learn dynamics model $f(s_t,a_t)$ to minimize $\sum_{i}\|f(s_i,a_i)-s_{i+1}\|^2$
3. Backpropagate through $f(s_t,a_t)$ into the policy to optimize $\pi_\theta(s_t,a_t)$
-4. Run the policy $\pi_\theta(s_t,a_t)$ to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
+4. Run the policy $\pi_\theta(s_t,a_t)$ to collect $\mathcal{D} = \{(s_t, a_t, s_{t+1})\}_{t=0}^T$
5. Goto 2

## Model Learning with High-Dimensional Observations

diff --git a/content/CSE5313/CSE5313_L10.md b/content/CSE5313/CSE5313_L10.md
index db7ce6c..6dff312 100644
--- a/content/CSE5313/CSE5313_L10.md
+++ b/content/CSE5313/CSE5313_L10.md
@@ -40,20 +40,20 @@ Let $G$ and $H$ be the generator and parity-check matrices of (any) linear code

#### Lemma 1

$$
-H G^T = 0
+H G^\top = 0
$$
Proof

-By definition of generator matrix and parity-check matrix, $forall e_i\in H$, $e_iG^T=0$.
+By definition of the generator matrix and parity-check matrix, $\forall$ row $e_i$ of $H$, $e_iG^\top=0$.

-So $H G^T = 0$.

+So $H G^\top = 0$.
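A quick numerical sanity check of Lemma 1 (a sketch; the systematic pair $G=[I\mid A]$, $H=[A^\top\mid I]$ for the binary $[7,4]$ Hamming code is an assumed example, not from the lecture):

```python
import numpy as np

# Hypothetical example: systematic generator/parity-check pair for the
# [7,4] binary Hamming code: G = [I | A], H = [A^T | I].
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), A])
H = np.hstack([A.T, np.eye(3, dtype=int)])

# Lemma 1: H G^T = A^T + A^T = 0 over F_2.
print((H @ G.T) % 2)  # all zeros
```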
#### Lemma 2 -Any matrix $M\in \mathbb{F}_q^{(n-k)\times n}$ such that $\operatorname{rank}(M) = n - k$ and $M G^T = 0$ is a parity-check matrix for $C$ (i.e. $C = \ker M$). +Any matrix $M\in \mathbb{F}_q^{(n-k)\times n}$ such that $\operatorname{rank}(M) = n - k$ and $M G^\top = 0$ is a parity-check matrix for $C$ (i.e. $C = \ker M$).
Proof

@@ -62,7 +62,7 @@

It is sufficient to show the following two statements:

1. $\forall c\in C, c=uG, u\in \mathbb{F}^k$

-$M c^T = M(uG)^T = M(G^T u^T) = 0$ since $M G^T = 0$.
+$M c^\top = M(uG)^\top = M(G^\top u^\top) = 0$ since $M G^\top = 0$.

Thus $C \subseteq \ker M$.

@@ -84,15 +84,15 @@

We proceed by applying Lemma 2.

1. $\operatorname{rank}(H) = n - k$ since any $(n-k)\times(n-k)$ submatrix of $H$ is a Vandermonde matrix times a diagonal matrix with no zero entries, hence invertible.

-2. $H G^T = 0$.

+2. $H G^\top = 0$.

-note that $\forall$ row $i$ of $H$, $0\leq i\leq n-k-1$, $\forall$ column $j$ of $G^T$, $0\leq j\leq k-1$
+Note that $\forall$ row $i$ of $H$, $0\leq i\leq n-k-1$, $\forall$ column $j$ of $G^\top$, $0\leq j\leq k-1$

So

$$
\begin{aligned}
-H G^T &= \begin{bmatrix}
+H G^\top &= \begin{bmatrix}
1 & 1 & \cdots & 1\\
\alpha_1 & \alpha_2 & \cdots & \alpha_n\\
\alpha_1^2 & \alpha_2^2 & \cdots & \alpha_n^2\\

diff --git a/content/CSE5313/CSE5313_L11.md b/content/CSE5313/CSE5313_L11.md
index 287eefc..7591d19 100644
--- a/content/CSE5313/CSE5313_L11.md
+++ b/content/CSE5313/CSE5313_L11.md
@@ -101,7 +101,7 @@

Let $\mathcal{C}=[n,k,d]_q$.

-The dual code of $\mathcal{C}$ is $\mathcal{C}^\perp=\{x\in \mathbb{F}^n_q|xc^T=0\text{ for all }c\in \mathcal{C}\}$.
+The dual code of $\mathcal{C}$ is $\mathcal{C}^\perp=\{x\in \mathbb{F}^n_q|xc^\top=0\text{ for all }c\in \mathcal{C}\}$.
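The dual-code definition can be verified by brute force on a toy code; a sketch using the $[3,1]_2$ repetition code (an assumed example, not from the notes):

```python
import itertools

# C = {000, 111}: the [3,1] binary repetition code.
C = [(0, 0, 0), (1, 1, 1)]

# C-perp = all x in F_2^3 with x . c = 0 (mod 2) for every c in C.
dual = [x for x in itertools.product([0, 1], repeat=3)
        if all(sum(a * b for a, b in zip(x, c)) % 2 == 0 for c in C)]

print(sorted(dual))  # the 2^(n-k) = 4 even-weight vectors
```

As expected, the dual is the $[3,2]_2$ parity-check code, and $\dim(C^\perp)=n-k=2$.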
Example @@ -151,7 +151,7 @@ So $\langle f,h\rangle=0$.
Proof for the theorem

-Recall that the dual code of $\operatorname{RM}(r,m)^\perp=\{x\in \mathbb{F}_2^m|xc^T=0\text{ for all }c\in \operatorname{RM}(r,m)\}$.
+Recall that the dual code of $\operatorname{RM}(r,m)$ is $\operatorname{RM}(r,m)^\perp=\{x\in \mathbb{F}_2^{2^m}|xc^\top=0\text{ for all }c\in \operatorname{RM}(r,m)\}$.

So $\operatorname{RM}(m-r-1,m)\subseteq \operatorname{RM}(r,m)^\perp$.

diff --git a/content/CSE5313/CSE5313_L14.md b/content/CSE5313/CSE5313_L14.md
index 4b8bbc1..218ac23 100644
--- a/content/CSE5313/CSE5313_L14.md
+++ b/content/CSE5313/CSE5313_L14.md
@@ -230,7 +230,7 @@ Step 1: Arrange the $B=\binom{k+1}{2}+k(d-k)$ symbols in a matrix $M$ as follows:

$$
M=\begin{pmatrix}
S & T\\
-T^T & 0
+T^\top & 0
\end{pmatrix}\in \mathbb{F}_q^{d\times d}
$$

@@ -267,15 +267,15 @@ Repair from (any) nodes $H = \{h_1, \ldots, h_d\}$.

Newcomer contacts each $h_j$: “My name is $i$, and I’m lost.”

-Node $h_j$ sends $c_{h_j}M c_i^T$ (inner product).
+Node $h_j$ sends $c_{h_j}M c_i^\top$ (inner product).

-Newcomer assembles $C_H Mc_i^T$.
+Newcomer assembles $C_H Mc_i^\top$.

$C_H$ invertible by construction!

-- Recover $Mc_i^T$.
+- Recover $Mc_i^\top$.

-- Recover $c_i^TM$ ($M$ is symmetric)
+- Recover $c_i^\top M$ ($M$ is symmetric)

#### Reconstruction on Product-Matrix MBR codes

@@ -292,9 +292,9 @@ DC assembles $C_D M$.

$\Psi_D$ invertible by construction.

-- DC computes $\Psi_D^{-1}C_DM = (S+\Psi_D^{-1}\Delta_D^T, T)$
+- DC computes $\Psi_D^{-1}C_DM = (S+\Psi_D^{-1}\Delta_D T^\top, T)$
- DC obtains $T$.
-- Subtracts $\Psi_D^{-1}\Delta_D T^T$ from $S+\Psi_D^{-1}\Delta_D T^T$ to obtain $S$.
+- Subtracts $\Psi_D^{-1}\Delta_D T^\top$ from $S+\Psi_D^{-1}\Delta_D T^\top$ to obtain $S$.
Fill an example here please. diff --git a/content/CSE5313/CSE5313_L19.md b/content/CSE5313/CSE5313_L19.md new file mode 100644 index 0000000..98ba394 --- /dev/null +++ b/content/CSE5313/CSE5313_L19.md @@ -0,0 +1,232 @@ +# CSE5313 Coding and information theory for data science (Lecture 19) + +## Private information retrieval + +### Problem setup + +Premise: + +- Database $X = \{x_1, \ldots, x_m\}$, each $x_i \in \mathbb{F}_q^k$ is a "file" (e.g., medical record). +- $X$ is coded $X \mapsto \{y_1, \ldots, y_n\}$, $y_j$ stored at server $j$. +- The user (physician) wants $x_i$. +- The user sends a query $q_j \sim Q_j$ to server $j$. +- Server $j$ responds with $a_j \sim A_j$. + +Decodability: + +- The user can retrieve the file: $H(X_i | A_1, \ldots, A_n) = 0$. + +Privacy: + +- $i$ is seen as $i \sim U = U_{m}$, reflecting server's lack of knowledge. +- $i$ must be kept private: $I(Q_j; U) = 0$ for all $j \in n$. + +> In short, we want to retrieve $x_i$ from the servers without revealing $i$ to the servers. + +### Private information retrieval from Replicated Databases + +#### Simple case, one server + +Say $n = 1, y_1 = X$. + +- All data is stored in one server. +- Simple solution: +- $q_1 =$ "send everything". +- $a_1 = y_1 = X$. + +Theorem: Information Theoretic PIR with $n = 1$ can only be achieved by downloading the entire database. + +- Can we do better if $n > 1$? + +#### Collusion parameter + +Key question for $n > 1$: Can servers collude? + +- I.e., does server $j$ see any $Q_\ell$, $\ell \neq j$? +- Key assumption: + - Privacy parameter $z$. + - At most $z$ servers can collude. + - $z = 1\implies$ No collusion. +- Requirement for $z = 1$: $I(Q_j; U) = 0$ for all $j \in n$. +- Requirement for a general $z$: + - $I(Q_\mathcal{T}; U) = 0$ for all $\mathcal{T} \in n$, $|\mathcal{T}| \leq z$, where $Q_\mathcal{T} = Q_\ell$ for all $\ell \in \mathcal{T}$. +- Motivation: + - Interception of communication links. + - Data breaches. 
+
+Other assumptions:
+
+- Computational private information retrieval: even if all the servers collude, recovering $i$ requires solving a computationally hard problem.
+- Non-zero mutual information, i.e., allowing a small leakage of $i$.
+
+#### Private information retrieval from 2-replicated databases
+
+First PIR protocol: Chor et al. FOCS ‘95.
+
+- The data $X = \{x_1, \ldots, x_m\}$ is replicated on two servers.
+  - $z = 1$, i.e., no collusion.
+- Protocol: User has $i \sim U_{m}$.
+  - User generates $r \sim U_{\mathbb{F}_q^m}$.
+  - $q_1 = r, q_2 = r + e_i$ ($e_i \in \mathbb{F}_q^m$ is the $i$-th unit vector, $q_2$ is equivalent to one-time pad encryption of $x_i$ with key $r$).
+  - $a_j = q_j X^\top = \sum_{\ell \in [m]} q_{j, \ell} x_\ell$
+    - Linear combination of the files according to the query vector $q_j$.
+- Decoding?
+  - $a_2 - a_1 = (q_2 - q_1) X^\top = e_i X^\top = x_i$.
+- Download?
+  - $a_j =$ size of file $\implies$ downloading **twice** the size of the file.
+- Privacy?
+  - Since $z = 1$, need to show $I(U; Q_j) = 0$.
+  - $I(U; Q_1) = I(U; r) = 0$ since $U$ and $r$ are independent.
+  - $I(U; Q_2) = I(U; r + e_U) = 0$ since this is a one-time pad!
+
+##### Parameters and notations in PIR
+
+Parameters of the system:
+
+- $n =$ # servers (as in storage).
+- $m =$ # files.
+- $k =$ size of each file (as in storage).
+- $z =$ max. collusion (as in secret sharing).
+- $t =$ # of answers required to obtain $x_i$ (as in secret sharing).
+  - $n - t$ servers are “stragglers”, i.e., might not respond.
+
+Figures of merit:
+
+- PIR-rate = $\#$ desired symbols / $\#$ downloaded symbols
+- PIR-capacity = largest possible rate.
+
+Notational conventions:
+
+- The dataset $X = \{x_j\}_{j \in [m]} = \{x_{j, \ell}\}_{(j, \ell) \in [m] \times [k]}$ is seen as a vector in $\mathbb{F}_q^{mk}$.
+- Index $\mathbb{F}_q^{mk}$ using $[m] \times [k]$, i.e., $x_{j, \ell}$ is the $\ell$-th symbol of the $j$-th file.
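The two-server scheme above can be sketched in a few lines; a toy run over $\mathbb{F}_2$ with file size $k=1$ (the parameters `m` and `i` below are made up for illustration):

```python
import random

m = 8
X = [random.randint(0, 1) for _ in range(m)]  # database of m one-bit files
i = 3                                          # desired index, kept private

r = [random.randint(0, 1) for _ in range(m)]   # uniform one-time pad
q1 = r                                         # query to server 1
q2 = [(r[l] + (1 if l == i else 0)) % 2 for l in range(m)]  # r + e_i

# Each server answers with the inner product <q_j, X> over F_2.
a1 = sum(q * x for q, x in zip(q1, X)) % 2
a2 = sum(q * x for q, x in zip(q2, X)) % 2

# a_2 - a_1 = e_i . X = x_i
print((a2 - a1) % 2 == X[i])  # True
```

Each server individually sees a uniformly random vector, so no single server learns anything about `i`.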
+
+#### Private information retrieval from 4-replicated databases
+
+Consider $n = 4$ replicated servers, file size $k = 2$, collusion $z = 1$.
+
+Protocol: User has $i \sim U_{m}$.
+
+- Fix distinct nonzero $\alpha_1, \ldots, \alpha_4 \in \mathbb{F}_q$.
+- Choose $r \sim U_{\mathbb{F}_q^{2m}}$.
+- User sends $q_j = e_{i, 1} + \alpha_j e_{i, 2} + \alpha_j^2 r$ to each server $j$.
+- Server $j$ responds with
+  $$
+  a_j = q_j X^\top = e_{i, 1} X^\top + \alpha_j e_{i, 2} X^\top + \alpha_j^2 r X^\top
+  $$
+  - This is an evaluation at $\alpha_j$ of the polynomial $f_i(w) = x_{i, 1} + x_{i, 2} \cdot w + (r X^\top) \cdot w^2$.
+  - Here $r X^\top$ is some random combination of the entries of $X$.
+- Decoding?
+  - Any 3 responses suffice to interpolate $f_i$ and obtain $x_i = x_{i, 1}, x_{i, 2}$.
+  - $\implies t = 3$, (one straggler is allowed)
+- Privacy?
+  - Does $q_j = e_{i, 1} + \alpha_j e_{i, 2} + \alpha_j^2 r$ look familiar?
+  - This is a share in [ramp scheme](CSE5313_L18.md#scheme-2-ramp-secret-sharing-scheme-mceliece-sarwate-scheme) with vector messages $m_1 = e_{i, 1}, m_2 = e_{i, 2}$, where $m_1, m_2 \in \mathbb{F}_q^{2m}$.
+  - This is equivalent to $2m$ "parallel" ramp schemes over $\mathbb{F}_q$.
+  - Each one reveals nothing to any $z = 1$ shareholders $\implies$ Private!
+
+### Private information retrieval from general replicated databases
+
+$n$ servers, $m$ files, file size $k$, $X \in \mathbb{F}_q^{mk}$.
+
+The user decodes $x_i$ from any $t$ responses.
+
+Any $\leq z$ servers might collude to infer $i$ ($z < t$).
+
+Protocol: User has $i \sim U_{m}$.
+
+- User chooses $r_1, \ldots, r_z \sim U_{\mathbb{F}_q^{mk}}$.
+- User sends $q_j = \sum_{\ell=1}^k e_{i, \ell} \alpha_j^{\ell-1} + \sum_{\ell=1}^z r_\ell \alpha_j^{k+\ell-1}$ to each server $j$.
+- Server $j$ responds with $a_j = q_j X^\top = f_i(\alpha_j)$.
+  - $f_i(w) = \sum_{\ell=1}^k e_{i, \ell} X^\top w^{\ell-1} + \sum_{\ell=1}^z r_\ell X^\top w^{k+\ell-1}$ (random combinations of $X$).
+  - Caveat: must have $t = k + z$.
+ - $\implies \deg f_i = k + z - 1 = t - 1$. +- Decoding? + - Interpolation from any $t$ evaluations of $f_i$. +- Privacy? + - Against any $z = t - k$ colluding servers, immediate from the proof of the ramp scheme. + +PIR-rate? + +- Each $a_j$ is a single field element. +- Download $t = k + z$ elements in $\mathbb{F}_q$ in order to obtain $x_i \in \mathbb{F}_q^k$. +- $\implies$ PIR-rate = $\frac{k}{k+z} = \frac{k}{t}$. + +#### Theorem: PIR-capacity for general replicated databases + +The PIR-capacity for $n$ replicated databases with $z$ colluding servers, $n - t$ unresponsive servers, and $m$ files is $C = \frac{1-\frac{z}{t}}{1-(\frac{z}{t})^m}$. + +- When $m \to \infty$, $C \to 1 - \frac{z}{t} = \frac{t-z}{t} = \frac{k}{t}$. +- The above scheme achieves PIR-capacity as $m \to \infty$ + +### Private information retrieval from coded databases + +#### Problem setup: + +Example: + +- $n = 3$ servers, $m$ files $x_j$, $x_j = x_{j, 1}, x_{j, 2}$, $k = 2$, and $q = 2$. +- Code each file with a parity code: $x_{j, 1}, x_{j, 2} \mapsto x_{j, 1}, x_{j, 2}, x_{j, 1} + x_{j, 2}$. +- Server $j \in 3$ stores all $j$-th symbols of all coded files. + +Queries, answers, decoding, and privacy must be tailored for the code at hand. + +With respect to a code $C$ and parameters $n, k, t, z$, such scheme is called coded-PIR. + +- The content for server $j$ is denoted by $c_j = c_{j, 1}, \ldots, c_{j, m}$. +- $C$ is usually an MDS code. + +#### Private information retrieval from parity-check codes + +Example: + + Say $z = 1$ (no collusion). + +- Protocol: User has $i \sim U_{m}$. +- User chooses $r_1, r_2 \sim U_{\mathbb{F}_2^m}$. +- Two queries to each server: + - $q_{1, 1} = r_1 + e_i$, $q_{1, 2} = r_2$. + - $q_{2, 1} = r_1$, $q_{2, 2} = r_2 + e_i$. + - $q_{3, 1} = r_1$, $q_{3, 2} = r_2$. +- Server $j$ responds with $q_{j, 1} c_j^\top$ and $q_{j, 2} c_j^\top$. +- Decoding? 
+  - $q_{1, 1} c_1^\top + q_{2, 1} c_2^\top + q_{3, 1} c_3^\top = r_1 (c_1 + c_2 + c_3)^\top + e_i c_1^\top = r_1 \cdot 0^\top + x_{i, 1} = x_{i, 1}$.
+  - $q_{1, 2} c_1^\top + q_{2, 2} c_2^\top + q_{3, 2} c_3^\top = r_2 (c_1 + c_2 + c_3)^\top + e_i c_2^\top = x_{i, 2}$.
+- Privacy?
+  - Every server sees two uniformly random vectors in $\mathbb{F}_2^m$.
+
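The decoding identities above can be simulated directly; a sketch over $\mathbb{F}_2$ (the parameters `m` and `i` are made up for illustration):

```python
import random

# n = 3 servers, parity code: each file (f0, f1) is stored as (f0, f1, f0+f1).
m = 4
files = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(m)]
c1 = [f[0] for f in files]                 # server 1: first symbols
c2 = [f[1] for f in files]                 # server 2: second symbols
c3 = [(f[0] + f[1]) % 2 for f in files]    # server 3: parity symbols

i = 2  # desired file index
e = [1 if l == i else 0 for l in range(m)]
r1 = [random.randint(0, 1) for _ in range(m)]
r2 = [random.randint(0, 1) for _ in range(m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % 2

def add(u, v):
    return [(a + b) % 2 for a, b in zip(u, v)]

# Queries: server 1 gets (r1 + e, r2), server 2 gets (r1, r2 + e),
# server 3 gets (r1, r2); c1 + c2 + c3 = 0 makes the random parts cancel.
x_i1 = (dot(add(r1, e), c1) + dot(r1, c2) + dot(r1, c3)) % 2
x_i2 = (dot(r2, c1) + dot(add(r2, e), c2) + dot(r2, c3)) % 2

print((x_i1, x_i2) == files[i])  # True
```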
+
+Proof from coding-theoretic interpretation
+
+Let $G = (g_1^\top, g_2^\top, g_3^\top)$ be the generator matrix, with columns $g_j^\top$.
+
+- For every file $x_j = x_{j, 1}, x_{j, 2}$ we encode $x_j G = (x_j g_1^\top, x_j g_2^\top, x_j g_3^\top) = (c_{j, 1}, c_{j, 2}, c_{j, 3})$.
+- Server $j$ stores $X g_j^\top = (x_1^\top, \ldots, x_m^\top)^\top g_j^\top = (c_{j, 1}, \ldots, c_{j, m})^\top$.
+
+- By multiplying by $r_1$, the servers together store a codeword in $C$:
+  - $r_1 X g_1^\top, r_1 X g_2^\top, r_1 X g_3^\top = r_1 X G$.
+- By replacing one of the $r_1$’s by $r_1 + e_i$, we introduce an error in that entry:
+  - $\left((r_1 + e_i) X g_1^\top, r_1 X g_2^\top, r_1 X g_3^\top\right) = r_1 X G + (e_i X g_1^\top, 0,0)$.
+- Download this “erroneous” word from the servers and multiply by the parity-check matrix $H = (h_1^\top, h_2^\top, h_3^\top)$.
+
+$$
+\begin{aligned}
+\left((r_1 + e_i) X g_1^\top, r_1 X g_2^\top, r_1 X g_3^\top\right) H^\top &= \left(r_1 X G + (e_i X g_1^\top, 0,0)\right) H^\top \\
+&= r_1 X G H^\top + (e_i X g_1^\top, 0,0) H^\top \\
+&= 0 + (x_i g_1^\top, 0, 0) H^\top \\
+&= x_{i, 1}.
+\end{aligned}
+$$
+
+> In homework we will show that this works with any MDS code ($z=1$).
+
+- Say we obtained $x_i g_1^\top, \ldots, x_i g_k^\top$ ($d - 1$ at a time, how?).
+- $(x_i g_1^\top, \ldots, x_i g_k^\top) = x_i B$, where $B$ is a $k \times k$ submatrix of $G$.
+- $B$ is a $k \times k$ submatrix of $G$ $\implies$ invertible! $\implies$ Obtain $x_{i}$.
+
+
+
+> [!TIP]
+>
+> error + known location $\implies$ erasure. $d = 2 \implies$ 1 erasure is correctable.

diff --git a/content/CSE5313/CSE5313_L6.md b/content/CSE5313/CSE5313_L6.md
index 9c19f0b..897b07a 100644
--- a/content/CSE5313/CSE5313_L6.md
+++ b/content/CSE5313/CSE5313_L6.md
@@ -92,10 +92,10 @@ Two equivalent ways of constructing a linear code:

- A **parity check** matrix $H\in \mathbb{F}^{(n-k)\times n}$ with $(n-k)$ rows and $n$ columns.

  $$
- \mathcal{C}=\{c\in \mathbb{F}^n:Hc^T=0\}
+ \mathcal{C}=\{c\in \mathbb{F}^n:Hc^\top=0\}
  $$
  - The right kernel of $H$ is $\mathcal{C}$.
- - Multiplying $c^T$ by $H$ "checks" if $c\in \mathcal{C}$.
+ - Multiplying $c^\top$ by $H$ "checks" if $c\in \mathcal{C}$.

### Encoding of linear codes

@@ -144,7 +144,7 @@ Decoding: $(y+e)\to x$, $y=xG$.

Use the **syndrome** to identify which coset $\mathcal{C}_i=\mathcal{C}+e$ the noisy word belongs to.

$$
-H(y+e)^T=H(y+e)=Hx+He=He
+H(y+e)^\top=Hy^\top+He^\top=He^\top
$$

### Syndrome decoding

@@ -215,7 +215,7 @@ Fourth row is $\mathcal{C}+(00100)$.

Any two elements in a row are of the form $y_1'=y_1+e$ and $y_2'=y_2+e$ for some $e\in \mathbb{F}^n$.

-Same syndrome if $H(y_1'+e)^T=H(y_2'+e)^T$.
+Same syndrome since $H(y_1')^\top=He^\top=H(y_2')^\top$.

Entries in different rows have different syndrome.

diff --git a/content/CSE5313/CSE5313_L7.md b/content/CSE5313/CSE5313_L7.md
index 37b54e8..145769a 100644
--- a/content/CSE5313/CSE5313_L7.md
+++ b/content/CSE5313/CSE5313_L7.md
@@ -7,7 +7,7 @@ Let $\mathcal{C}= [n,k,d]_{\mathbb{F}}$ be a linear code.

There are two equivalent ways to describe a linear code:

1. A generator matrix $G\in \mathbb{F}^{k\times n}_q$ with $k$ rows and $n$ columns, entry taken from $\mathbb{F}_q$. $\mathcal{C}=\{xG|x\in \mathbb{F}^k\}$
-2.
A parity check matrix $H\in \mathbb{F}^{(n-k)\times n}_q$ with $(n-k)$ rows and $n$ columns, entry taken from $\mathbb{F}_q$. $\mathcal{C}=\{c\in \mathbb{F}^n:Hc^\top=0\}$ ### Dual code @@ -21,7 +21,7 @@ $$ Also, the alternative definition is: -1. $C^{\perp}=\{x\in \mathbb{F}^n:Gx^T=0\}$ (only need to check basis of $C$) +1. $C^{\perp}=\{x\in \mathbb{F}^n:Gx^\top=0\}$ (only need to check basis of $C$) 2. $C^{\perp}=\{xH|x\in \mathbb{F}^{n-k}\}$ By rank-nullity theorem, $dim(C^{\perp})=n-dim(C)=n-k$. @@ -87,7 +87,7 @@ Assume minimum distance is $d$. Show that every $d-1$ columns of $H$ are indepen - Fact: In linear codes minimum distance is the minimum weight ($d_H(x,y)=w_H(x-y)$). -Indeed, if there exists a $d-1$ columns of $H$ that are linearly dependent, then we have $Hc^T=0$ for some $c\in \mathcal{C}$ with $w_H(c) @@ -276,7 +276,7 @@ In [The random Matrix Theory of the Classical Compact groups](https://case.edu/a $O(n)$ (the group of all $n\times n$ **orthogonal matrices** over $\mathbb{R}$), $$ -O(n)=\{A\in \mathbb{R}^{n\times n}: AA^T=A^T A=I_n\} +O(n)=\{A\in \mathbb{R}^{n\times n}: AA^\top=A^\top A=I_n\} $$ $U(n)$ (the group of all $n\times n$ **unitary matrices** over $\mathbb{C}$), @@ -296,7 +296,7 @@ $$ $Sp(2n)$ (the group of all $2n\times 2n$ symplectic matrices over $\mathbb{C}$), $$ -Sp(2n)=\{U\in U(2n): U^T J U=UJU^T=J\} +Sp(2n)=\{U\in U(2n): U^\top J U=UJU^\top=J\} $$ where $J=\begin{pmatrix} diff --git a/content/Math401/Freiwald_summer/Math401_P1_2.md b/content/Math401/Freiwald_summer/Math401_P1_2.md index 8f17cf0..e10a222 100644 --- a/content/Math401/Freiwald_summer/Math401_P1_2.md +++ b/content/Math401/Freiwald_summer/Math401_P1_2.md @@ -8,10 +8,10 @@ The page's lemma is a fundamental result in quantum information theory that prov The special orthogonal group $SO(n)$ is the set of all **distance preserving** linear transformations on $\mathbb{R}^n$. 
-It is the group of all $n\times n$ orthogonal matrices ($A^T A=I_n$) on $\mathbb{R}^n$ with determinant $1$. +It is the group of all $n\times n$ orthogonal matrices ($A^\top A=I_n$) on $\mathbb{R}^n$ with determinant $1$. $$ -SO(n)=\{A\in \mathbb{R}^{n\times n}: A^T A=I_n, \det(A)=1\} +SO(n)=\{A\in \mathbb{R}^{n\times n}: A^\top A=I_n, \det(A)=1\} $$
@@ -22,7 +22,7 @@ In [The random Matrix Theory of the Classical Compact groups](https://case.edu/a

$O(n)$ (the group of all $n\times n$ **orthogonal matrices** over $\mathbb{R}$),

$$
-O(n)=\{A\in \mathbb{R}^{n\times n}: AA^T=A^T A=I_n\}
+O(n)=\{A\in \mathbb{R}^{n\times n}: AA^\top=A^\top A=I_n\}
$$

$U(n)$ (the group of all $n\times n$ **unitary matrices** over $\mathbb{C}$),

@@ -42,7 +42,7 @@

$Sp(2n)$ (the group of all $2n\times 2n$ symplectic matrices over $\mathbb{C}$),

$$
-Sp(2n)=\{U\in U(2n): U^T J U=UJU^T=J\}
+Sp(2n)=\{U\in U(2n): U^\top J U=UJU^\top=J\}
$$

where $J=\begin{pmatrix}

diff --git a/content/Math401/Freiwald_summer/Math401_T2.md b/content/Math401/Freiwald_summer/Math401_T2.md
index 1bfddad..5e4ebe5 100644
--- a/content/Math401/Freiwald_summer/Math401_T2.md
+++ b/content/Math401/Freiwald_summer/Math401_T2.md
@@ -74,7 +74,7 @@ $c\in \mathbb{C}$.

The matrix transpose is defined by

$$
-u^T=(a_1,a_2,\cdots,a_n)^T=\begin{pmatrix}
+u^\top=(a_1,a_2,\cdots,a_n)^\top=\begin{pmatrix}
a_1 \\
a_2 \\
\vdots \\
@@ -694,7 +694,7 @@

The unitary group $U(n)$ is the group of all $n\times n$ unitary matrices.

-Such that $A^*=A$, where $A^*$ is the complex conjugate transpose of $A$. $A^*=(\overline{A})^T$.
+Such that $A^*A=AA^*=I_n$, where $A^*$ is the complex conjugate transpose of $A$: $A^*=(\overline{A})^\top$.

#### Cyclic group $\mathbb{Z}_n$

diff --git a/content/Math429/Math429_L12.md b/content/Math429/Math429_L12.md
index d35b456..d5af221 100644
--- a/content/Math429/Math429_L12.md
+++ b/content/Math429/Math429_L12.md
@@ -25,7 +25,7 @@ Let $A$ be an $m \times n$ matrix, then

* The column rank of $A$ is the dimension of the span of the columns in $\mathbb{F}^{m,1}$.
* The row rank of $A$ is the dimension of the span of the rows in $\mathbb{F}^{1,n}$.
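The claim below that column rank equals row rank can be checked numerically; a quick illustration (the matrix is an arbitrary made-up example):

```python
import numpy as np

# Arbitrary example: the second row is twice the first, so the rank is 2.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [0., 1., 1.]])

# Rank of A (column rank) equals rank of A^T (row rank).
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))  # 2 2
```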
-> Transpose: $A^t=A^T$ refers to swapping rows and columns
+> Transpose: $A^t=A^\top$ refers to swapping rows and columns

#### Theorem 3.56 (Column-Row Factorization)

@@ -64,7 +64,7 @@ Proof:

Note that by **Theorem 3.56**, if $A$ is $m\times n$ and has column rank $c$, then $A=CR$ for some $m\times c$ matrix $C$ and $c\times n$ matrix $R$, but the rows of $CR$ are linear combinations of the rows of $R$, and row rank of $R\leq c$. So row rank of $A\leq$ column rank of $A$.

-Taking a transpose of matrix, then row rank of $A^T$ (column rank of $A$) $\leq$ column rank of $A^T$ (row rank $A$).
+Taking the transpose, row rank of $A^\top$ (column rank of $A$) $\leq$ column rank of $A^\top$ (row rank of $A$).

So column rank is equal to row rank.

diff --git a/content/Math429/Math429_L18.md b/content/Math429/Math429_L18.md
index a371cd4..21f16bc 100644
--- a/content/Math429/Math429_L18.md
+++ b/content/Math429/Math429_L18.md
@@ -39,13 +39,13 @@

$T$ is surjective $\iff range\ T=W\iff null\ T'=0\iff T'$ injective

Let $V,W$ be finite dimensional vector spaces, $T\in \mathscr{L}(V,W)$

-Then $M(T')=(M(T))^T$. Where the basis for $M(T)'$ are the dual basis to the ones for $M(T)$
+Then $M(T')=(M(T))^\top$, where the bases for $M(T')$ are the dual bases to the ones for $M(T)$

#### Theorem 3.133

$col\ rank\ A=row\ rank\ A$

-Proof: $col\ rank\ A=col\ rank\ (M(T))=dim\ range\ T=dim\ range\ T'=dim\ range\ T'=col\ rank\ (M(T'))=col\ rank\ (M(T)^T)=row\ rank\ (M(T))$
+Proof: $col\ rank\ A=col\ rank\ (M(T))=dim\ range\ T=dim\ range\ T'=col\ rank\ (M(T'))=col\ rank\ (M(T)^\top)=row\ rank\ (M(T))$

## Chapter V Eigenvalue and Eigenvectors