Trance-0
2025-09-10 11:10:38 -05:00
parent 384e538bc9
commit cbe8e9895d
21 changed files with 644 additions and 21 deletions

@@ -0,0 +1,110 @@
# Math 401 Paper 1: Concentration of measure effects in quantum information (Patrick Hayden)
[Concentration of measure effects in quantum information](https://www.ams.org/books/psapm/068/2762144)
A more comprehensive version of this paper is in [Aspects of generic entanglement](https://arxiv.org/pdf/quant-ph/0407049).
## Quantum codes
### Preliminaries
#### Daniel Gottesman's mathematics of quantum error correction
##### Quantum channels
A quantum code consists of an encoding channel and a decoding channel.
These are two maps that encode and decode the qubits; each of them can be regarded as a quantum channel.
#### Quantum capacity for a quantum channel
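The classical capacity of a quantum channel is governed by the HSW (Holevo-Schumacher-Westmoreland) noisy coding theorem, which is the counterpart of Shannon's noisy coding theorem in the quantum information setting. The quantum capacity is governed by the Lloyd-Shor-Devetak theorem.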
#### Lloyd-Shor-Devetak theorem
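Note that the model of a noisy channel in the quantum setting is a map $\eta$ that sends a state $\rho$ to another state $\eta(\rho)$. This should be a CPTP (completely positive, trace-preserving) map. The statement from the paper writes the same channel as $\mathcal{N}$.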
Let $A'\cong A$ and $|\psi\rangle\in A'\otimes A$.
Then $Q(\mathcal{N})\geq H(B)_\sigma-H(A'B)_\sigma$.
where $\sigma=(I_{A'}\otimes \mathcal{N})\circ|\psi\rangle\langle\psi|$.
(above is the official statement in the paper of Patrick Hayden)
This means that, in the limit of many uses, the optimal rate at which $A$ can reliably send qubits to $B$ ($1/n\log d$) through $\eta$ is given by the regularization of the formula
$$
Q(\eta)=\max_{\phi_{AB}}[-H(A|B)_\phi]
$$
where $H(A|B)_\phi=H(AB)_\phi-H(B)_\phi$ is the conditional entropy of $A$ given $B$ under the state $\phi$, so that $-H(A|B)_\phi$ has the same form as the bound $H(B)_\sigma-H(A'B)_\sigma$ above.
$\phi_{AB}=(I_{A'}\otimes \eta)\circ\omega_{AB}$
(above formula is from the presentation of Patrick Hayden.)
For now we skip this part, since it is only needed for the applications in the following sections. A detailed explanation will be added later (hopefully very soon).
---
### Surprise in high-dimensional quantum systems
#### Levy's lemma
Given an $\eta$-Lipschitz function $f:S^n\to \mathbb{R}$ with median $M$, the probability that the value $f(x)$ at a uniformly random point $x\in_R S^n$ is further than $\epsilon$ from $M$ is bounded above by $\exp(-\frac{C(n-1)\epsilon^2}{\eta^2})$, for some constant $C>0$.
$$
\operatorname{Pr}[|f(x)-M|>\epsilon]\leq \exp(-\frac{C(n-1)\epsilon^2}{\eta^2})
$$
[Decomposing the statement in detail as side note 3](Math401_P1_3.md)
### Random states and random subspaces
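Choose a random pure state $\sigma=|\psi\rangle\langle\psi|$ from $A\otimes B$, where $d_A$ and $d_B$ are the dimensions of $A$ and $B$ and $\psi_A=\operatorname{Tr}_B(|\psi\rangle\langle\psi|)$ is the reduced state on $A$.
The expected value of the entropy of entanglement $H(\psi_A)$ is known and satisfies a concentration inequality.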
$$
\mathbb{E}[H(\psi_A)] \geq \log_2(d_A)-\frac{1}{2\ln(2)}\frac{d_A}{d_B}
$$
[Decomposing the statement in detail as side note 2](Math401_P1_2.md)
From Levy's lemma, if we define $\beta=\frac{1}{\ln(2)}\frac{d_A}{d_B}$ (twice the correction term in the expectation bound above), then we have
$$
\operatorname{Pr}[H(\psi_A) < \log_2(d_A)-\alpha-\beta] \leq \exp\left(-\frac{(d_Ad_B-1)C\alpha^2}{(\log_2(d_A))^2}\right)
$$
where $C$ is a small constant and $d_B\geq d_A\geq 3$.
> In [Aspects of generic entanglement](https://arxiv.org/pdf/quant-ph/0407049), this constant is $C_3=(8\pi^2\ln(2))^{-1}$.
#### ebits and qubits
### Superdense coding of quantum states
It is a procedure defined as follows:
Suppose $A$ and $B$ share a Bell state $|\Phi^+\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$, where $A$ holds the first qubit and $B$ holds the second qubit.
$A$ wishes to send 2 classical bits to $B$.
$A$ applies one of the four Pauli unitaries ($I$, $X$, $Z$, $XZ$), chosen according to the 2 bits, to her own qubit. Then $A$ sends that qubit to $B$.
This operation maps the shared Bell state to one of the four mutually orthogonal Bell states.
$B$ performs a Bell-basis measurement on the pair formed by the received qubit and the qubit he already holds.
$B$ decodes the result and obtains the 2 classical bits sent by $A$.
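The protocol above can be checked numerically. The following is a minimal NumPy sketch, not taken from the paper: the names `PAULIS`, `encode`, and `decode` are hypothetical, and the assignment of bit pairs to Pauli operators is a convention chosen only for this illustration.

```python
import numpy as np

# Pauli matrices; which bit pair selects which Pauli is a convention of this sketch.
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)
PAULIS = {(0, 0): I2, (0, 1): X, (1, 0): Z, (1, 1): X @ Z}

# Shared Bell state |Phi+> = (|00> + |11>)/sqrt(2); A holds the first qubit.
PHI_PLUS = np.array([1, 0, 0, 1]) / np.sqrt(2)

def encode(bits):
    """A applies the Pauli selected by `bits` to her qubit only."""
    return np.kron(PAULIS[bits], I2) @ PHI_PLUS

def decode(state):
    """B's Bell measurement, modeled as finding the Bell state with overlap 1."""
    for bits in PAULIS:
        if np.isclose(abs(np.vdot(encode(bits), state)), 1.0):
            return bits
    raise ValueError("state is not one of the four Bell states")

for bits in PAULIS:
    print(bits, "->", decode(encode(bits)))
```

Because the four encoded states are mutually orthogonal, the overlap test in `decode` identifies the message without ambiguity, which is why one transmitted qubit carries two classical bits.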
### Consequences for mixed state entanglement measures
#### Quantum mutual information
### Multipartite entanglement
> The role of the paper in physics can be found in (15.86) of the book Geometry of Quantum States.

@@ -0,0 +1,154 @@
# Math 401 Paper 1, Side note 1: Quantum information theory and Measure concentration
## Typicality
> The idea of typicality in high dimensions is a very important topic for understanding this paper and taking it to the next level of detail in the language of mathematics. I'm trying to comprehend this material and write down my understanding in this note.
Let $X$ be the alphabet of our source of information.
Let $x^n=x_1,x_2,\cdots,x_n$ be a sequence with $x_i\in X$.
We say that $x^n$ is $\epsilon$-typical with respect to $p(x)$ if
- For all $a\in X$ with $p(a)>0$, we have
$$
\left|\frac{1}{n}N(a|x^n)-p(a)\right|\leq \frac{\epsilon}{|X|}
$$
- For all $a\in X$ with $p(a)=0$, we have
$$
N(a|x^n)=0
$$
Here $N(a|x^n)$ is the number of times $a$ appears in $x^n$. That is basically saying that:
1. The difference between **the empirical frequency of $a$ in $x^n$** and the **probability $p(a)$ of $a$ under the source of information** should be at most $\epsilon$ divided by the size of the alphabet $X$ of the source of information.
2. A symbol $a$ that the source never produces ($p(a)=0$) does not appear in $x^n$.
Here are three easy propositions that can be proved:
For $\epsilon>0$, the probability of a sequence being $\epsilon$-typical goes to 1 as $n$ goes to infinity.
If $x^n$ is $\epsilon$-typical, then the probability that $x^n$ is produced satisfies $2^{-n[H(X)+\epsilon]}\leq p(x^n)\leq 2^{-n[H(X)-\epsilon]}$.
The number of $\epsilon$-typical sequences is at most $2^{n[H(X)+\epsilon]}$, since each of them has probability at least $2^{-n[H(X)+\epsilon]}$ and the probabilities sum to at most 1.
Recall that $H(X)=-\sum_{a\in X}p(a)\log_2 p(a)$ is the entropy of the source of information.
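The two conditions in the definition translate directly into a check on symbol counts. Below is a small sketch in plain Python; the function names `entropy` and `is_typical` and the dictionary representation of $p$ are assumptions made for this illustration.

```python
import math
from collections import Counter

def entropy(p):
    """Shannon entropy H(X) in bits of a distribution given as {symbol: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def is_typical(seq, p, eps):
    """Check the two conditions of epsilon-typicality of `seq` with respect to `p`."""
    n = len(seq)
    counts = Counter(seq)
    # Symbols with p(a) = 0 (or outside the alphabet) must not appear at all.
    if any(p.get(a, 0) == 0 for a in counts):
        return False
    # For p(a) > 0, the empirical frequency must be within eps/|X| of p(a).
    return all(abs(counts[a] / n - q) <= eps / len(p) for a, q in p.items() if q > 0)

p = {"a": 0.5, "b": 0.5}
print(entropy(p), is_typical("abab", p, 0.1), is_typical("aaaa", p, 0.1))
```

Here `len(p)` plays the role of $|X|$, so symbols of probability zero should still be listed in `p` if they belong to the alphabet.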
## Shannon theory in Quantum information theory
Shannon theory provides a way to quantify the amount of information in a message.
- Practically speaking:
  - A holy grail for error-correcting codes
- Conceptually speaking:
  - An operationally-motivated way of thinking about correlations
- What's missing (for a quantum mechanic)?
  - Features from linear structure:
    - Entanglement and non-orthogonality
## Partial trace and purification
### Partial trace
Recall that the bipartite state of a quantum system is a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.
#### Definition of partial trace for arbitrary linear operators
Let $T$ be a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.
An operator $T$ on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$ can be written as (by the definition of [tensor product of linear operators](https://notenextra.trance-0.com/Math401/Math401_T2#tensor-products-of-linear-operators))
$$
T=\sum_{i=1}^n a_i A_i\otimes B_i
$$
where $A_i$ is a linear operator on $\mathscr{A}$ and $B_i$ is a linear operator on $\mathscr{B}$.
The $\mathscr{B}$-partial trace ($\operatorname{Tr}_{\mathscr{B}}:\mathcal{L}(\mathscr{A}\otimes \mathscr{B})\to \mathcal{L}(\mathscr{A})$) sends $T$ to the linear operator on $\mathscr{A}$ defined by
$$
\operatorname{Tr}_{\mathscr{B}}(T)=\sum_{i=1}^n a_i \operatorname{Tr}(B_i) A_i
$$
#### Partial trace for density operators
Let $\rho$ be a density operator in $\mathscr{H}_1\otimes\mathscr{H}_2$. The partial trace of $\rho$ over $\mathscr{H}_2$ is the density operator in $\mathscr{H}_1$ (the reduced density operator for the subsystem $\mathscr{H}_1$) given by:
$$
\rho_1\coloneqq\operatorname{Tr}_2(\rho)
$$
<details>
<summary>Examples</summary>
Let $|\psi\rangle=\frac{1}{\sqrt{2}}(|01\rangle+|10\rangle)$ be a state vector in $\mathscr{H}=\mathbb{C}^2\otimes \mathbb{C}^2$, and let $\rho=|\psi\rangle\langle\psi|$ be the corresponding density operator.
Expand $\rho$ as a linear combination of the basis operators $|ab\rangle\langle cd|$ of $\mathbb{C}^2\otimes\mathbb{C}^2$:
$$
\rho=\frac{1}{2}(|01\rangle\langle 01|+|01\rangle\langle 10|+|10\rangle\langle 01|+|10\rangle\langle 10|)
$$
Note $\operatorname{Tr}_2(|ab\rangle\langle cd|)=|a\rangle\langle c|\cdot \langle d|b\rangle$.
Then, using $\langle 0|0\rangle=\langle 1|1\rangle=1$ and $\langle 0|1\rangle=\langle 1|0\rangle=0$, the reduced density operator of the first qubit is:
$$
\begin{aligned}
\rho_1&=\operatorname{Tr}_2(\rho)\\
&=\frac{1}{2}(\langle 1|1\rangle |0\rangle\langle 0|+\langle 0|1\rangle |0\rangle\langle 1|+\langle 1|0\rangle |1\rangle\langle 0|+\langle 0|0\rangle |1\rangle\langle 1|)\\
&=\frac{1}{2}(|0\rangle\langle 0|+|1\rangle\langle 1|)\\
&=\frac{1}{2}I
\end{aligned}
$$
is a mixed state.
</details>
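The worked example can be reproduced numerically. This is a minimal NumPy sketch; the helper name `partial_trace_2` is hypothetical, and the basis ordering $|00\rangle,|01\rangle,|10\rangle,|11\rangle$ is an assumption of the sketch.

```python
import numpy as np

def partial_trace_2(rho, d1, d2):
    """Trace out the second factor of an operator on C^d1 (x) C^d2."""
    # View rho with indices (a, b, c, d) for |a b><c d| and sum over b = d.
    return np.einsum("abcb->ac", rho.reshape(d1, d2, d1, d2))

# |psi> = (|01> + |10>)/sqrt(2) in the basis order |00>, |01>, |10>, |11>.
psi = np.array([0, 1, 1, 0]) / np.sqrt(2)
rho = np.outer(psi, psi.conj())
rho_1 = partial_trace_2(rho, 2, 2)
print(rho_1)
```

The reshape makes the tensor structure explicit, so the same helper works for any pair of dimensions `d1`, `d2`.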
### Purification
Let $\rho$ be any [state](https://notenextra.trance-0.com/Math401/Math401_T6#pure-states) (not necessarily pure) on the finite-dimensional Hilbert space $\mathscr{H}$. Then there exists a unit vector $w\in \mathscr{H}\otimes \mathscr{H}$ such that the pure state $|w\rangle\langle w|$ satisfies $\rho=\operatorname{Tr}_2(|w\rangle\langle w|)$.
<details>
<summary>Proof</summary>
Let $(u_1,u_2,\cdots,u_n)$ be an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $\rho$ for the eigenvalues $p_1,p_2,\cdots,p_n$. As $\rho$ is a state, $p_i\geq 0$ for all $i$ and $\sum_{i=1}^n p_i=1$.
We can write $\rho$ as
$$
\rho=\sum_{i=1}^n p_i |u_i\rangle\langle u_i|
$$
Let $w=\sum_{i=1}^n \sqrt{p_i} u_i\otimes u_i$, note that $w$ is a unit vector (pure state). Then
$$
\begin{aligned}
\operatorname{Tr}_2(|w\rangle\langle w|)&=\operatorname{Tr}_2(\sum_{i=1}^n \sum_{j=1}^n \sqrt{p_ip_j} |u_i\otimes u_i\rangle \langle u_j\otimes u_j|)\\
&=\sum_{i=1}^n \sum_{j=1}^n \sqrt{p_ip_j} \operatorname{Tr}_2(|u_i\otimes u_i\rangle \langle u_j\otimes u_j|)\\
&=\sum_{i=1}^n \sum_{j=1}^n \sqrt{p_ip_j} \langle u_j|u_i\rangle |u_i\rangle\langle u_j|\\
&=\sum_{i=1}^n \sum_{j=1}^n \sqrt{p_ip_j} \delta_{ij} |u_i\rangle\langle u_j|\\
&=\sum_{i=1}^n p_i |u_i\rangle\langle u_i|\\
&=\rho
\end{aligned}
$$
Hence $\rho$ is the partial trace of the pure state $|w\rangle\langle w|$.
QED
</details>
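The construction in the proof can be carried out numerically. The sketch below assumes NumPy; the names `purify` and `partial_trace_2` are hypothetical helpers, and the test matrix `rho` is an arbitrary example, not taken from the notes.

```python
import numpy as np

def purify(rho):
    """Return w = sum_i sqrt(p_i) u_i (x) u_i from the eigen-decomposition of rho."""
    p, u = np.linalg.eigh(rho)       # columns of u are orthonormal eigenvectors
    p = np.clip(p, 0.0, None)        # remove tiny negative round-off
    return sum(np.sqrt(p_i) * np.kron(u[:, i], u[:, i]) for i, p_i in enumerate(p))

def partial_trace_2(w, d):
    """Tr_2 |w><w| for a vector w in C^d (x) C^d."""
    m = w.reshape(d, d)              # m[a, b] is the coefficient of |a b>
    return m @ m.conj().T

rho = np.array([[0.7, 0.2], [0.2, 0.3]])   # a mixed state: trace 1, positive eigenvalues
w = purify(rho)
print(np.linalg.norm(w), partial_trace_2(w, 2))
```

Writing $w$ as a $d\times d$ coefficient matrix $m$ turns the partial trace into the matrix product $mm^*$, which avoids building the $d^2\times d^2$ operator $|w\rangle\langle w|$.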
## Drawing the connection between the space $S^{2n+1}$, $CP^n$, and $\mathbb{R}$
A pure quantum state of size $N$ can be identified with a **Hopf circle** on the sphere $S^{2N-1}$.
Consider a random pure state $|\psi\rangle$ of a bipartite $N\times K$ system with $K\geq N\geq 3$.
The partial trace of such system produces a mixed state $\rho(\psi)=\operatorname{Tr}_K(|\psi\rangle\langle \psi|)$, with induced measure $\mu_K$. When $K=N$, the induced measure $\mu_K$ is the Hilbert-Schmidt measure.
Consider the function $f:S^{2NK-1}\to \mathbb{R}$ defined on the unit sphere of $\mathbb{C}^{N}\otimes\mathbb{C}^{K}$ by $f(\psi)=S(\rho(\psi))$, where $S(\cdot)$ is the von Neumann entropy. The Lipschitz constant of $f$ is $\sim \ln N$.

@@ -0,0 +1,101 @@
# Math 401 Paper 1, Side note 2: Page's lemma
Page's lemma is a fundamental result in quantum information theory that provides a lower bound on the expected entanglement entropy of a random pure state of a bipartite system.
## Basic definitions
### $SO(n)$
The special orthogonal group $SO(n)$ is the set of all **distance preserving** and orientation-preserving linear transformations on $\mathbb{R}^n$.
It is the group of all $n\times n$ orthogonal matrices ($A^T A=I_n$) on $\mathbb{R}^n$ with determinant $1$.
$$
SO(n)=\{A\in \mathbb{R}^{n\times n}: A^T A=I_n, \det(A)=1\}
$$
<details>
<summary>Extensions</summary>
In [The random Matrix Theory of the Classical Compact groups](https://case.edu/artsci/math/esmeckes/Haar_book.pdf), the author defines the Haar measure more generally on the classical compact groups, which besides $SO(n)$ include
$O(n)$ (the group of all $n\times n$ **orthogonal matrices** over $\mathbb{R}$),
$$
O(n)=\{A\in \mathbb{R}^{n\times n}: AA^T=A^T A=I_n\}
$$
$U(n)$ (the group of all $n\times n$ **unitary matrices** over $\mathbb{C}$),
$$
U(n)=\{A\in \mathbb{C}^{n\times n}: A^*A=AA^*=I_n\}
$$
Recall that $A^*$ is the complex conjugate transpose of $A$.
$SU(n)$ (the group of all $n\times n$ unitary matrices over $\mathbb{C}$ with determinant $1$),
$$
SU(n)=\{A\in \mathbb{C}^{n\times n}: A^*A=AA^*=I_n, \det(A)=1\}
$$
$Sp(2n)$ (the group of all $2n\times 2n$ symplectic matrices over $\mathbb{C}$),
$$
Sp(2n)=\{U\in U(2n): U^T J U=UJU^T=J\}
$$
where $J=\begin{pmatrix}
0 & I_n \\
-I_n & 0
\end{pmatrix}$ is the standard symplectic matrix.
</details>
### Haar measure
Let $(SO(n), \| \cdot \|, \mu)$ be a metric measure space where $\| \cdot \|$ is the [Hilbert-Schmidt norm](https://notenextra.trance-0.com/Math401/Math401_T2#definition-of-hilbert-schmidt-norm) and $\mu$ is the measure function.
The Haar measure on $SO(n)$ is the unique probability measure that is invariant under the action of $SO(n)$ on itself.
This property is also called _translation invariance_.
That is, for every measurable subset $B\subseteq SO(n)$ and every $A\in SO(n)$, $\mu(A\cdot B)=\mu(B\cdot A)=\mu(B)$, where $A\cdot B=\{Ab: b\in B\}$ and $B\cdot A=\{bA: b\in B\}$.
_The existence and uniqueness of the Haar measure is a theorem in compact Lie group theory. For this research topic, we will not prove it._
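For experiments it helps to be able to draw samples from the Haar measure. The sketch below uses the QR decomposition of a Gaussian matrix with a sign correction, which is a standard recipe and not something taken from these notes; the function name `haar_so` is hypothetical.

```python
import numpy as np

def haar_so(n, rng):
    """Sample a matrix from SO(n) via QR of a Gaussian matrix with sign correction."""
    g = rng.standard_normal((n, n))
    q, r = np.linalg.qr(g)
    q = q * np.sign(np.diag(r))   # fix column signs so that q is Haar distributed on O(n)
    if np.linalg.det(q) < 0:      # move from the det = -1 component to SO(n)
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
q = haar_so(4, rng)
print(np.round(q.T @ q, 10), np.linalg.det(q))
```

The sign correction matters: without it the output of `np.linalg.qr` depends on the sign convention of the factorization and is not guaranteed to be invariant under translations.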
### Random sampling on the $\mathbb{C}P^n$
Note that the space of pure state in bipartite system
## Statement
Choosing a random pure quantum state $\rho$ from the bipartite pure state space $\mathcal{H}_A\otimes\mathcal{H}_B$ with the uniform distribution, the expected entropy (in bits) of the reduced state $\rho_A$ satisfies:
$$
\mathbb{E}[H(\rho_A)]\geq \log_2 d_A -\frac{1}{2\ln 2} \frac{d_A}{d_B}
$$
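The bound can be probed with a small Monte Carlo experiment. The following is a sketch under stated assumptions: uniform pure states are generated by normalizing complex Gaussian vectors (a standard method), entropies are measured in bits, and all function names are hypothetical.

```python
import numpy as np

def random_pure_state(d_a, d_b, rng):
    """Uniform random unit vector in C^(d_a*d_b), stored as a d_a x d_b coefficient matrix."""
    m = rng.standard_normal((d_a, d_b)) + 1j * rng.standard_normal((d_a, d_b))
    return m / np.linalg.norm(m)

def entanglement_entropy(m):
    """Entropy in bits of rho_A = m m^dagger, computed from the singular values of m."""
    p = np.linalg.svd(m, compute_uv=False) ** 2
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

def mean_entropy(d_a, d_b, samples, seed=0):
    """Monte Carlo estimate of the expected entropy of the reduced state."""
    rng = np.random.default_rng(seed)
    return float(np.mean([entanglement_entropy(random_pure_state(d_a, d_b, rng))
                          for _ in range(samples)]))

def page_lower_bound(d_a, d_b):
    """Right-hand side of the statement: log2(d_A) - d_A / (2 ln(2) d_B)."""
    return float(np.log2(d_a) - d_a / (2 * np.log(2) * d_b))

print(mean_entropy(4, 8, 2000), page_lower_bound(4, 8))
```

The estimate should sit slightly above the bound and below the maximum $\log_2 d_A$, which illustrates that a typical state is close to maximally entangled.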
## Page's conjecture
A quantum system $AB$ with Hilbert space dimension $mn$ in a pure state $\rho_{AB}$ has entropy $0$. However, the entropy of the reduced state $\rho_A$ (assume $m\leq n$), which equals the entropy of $\rho_B$, is greater than $0$
unless $\rho_{AB}$ is a product state, that is, unless $A$ and $B$ are not entangled.
In the original paper, the average entropy of the reduced state, taken under the unitarily invariant Haar measure, is:
$$
S_{m,n}=\sum_{k=n+1}^{mn}\frac{1}{k}-\frac{m-1}{2n}\simeq \ln m-\frac{m}{2n}
$$
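To see how good the approximation is, the exact sum and its large-$n$ form can be compared directly. This is a short sketch in plain Python; the function names are hypothetical and the values are in nats (divide by $\ln 2$ to compare with the bound in bits from the statement).

```python
import math

def page_entropy(m, n):
    """Exact average entropy S_{m,n} in nats, for m <= n."""
    return sum(1.0 / k for k in range(n + 1, m * n + 1)) - (m - 1) / (2 * n)

def page_approx(m, n):
    """Approximation ln(m) - m/(2n)."""
    return math.log(m) - m / (2 * n)

for m, n in [(2, 2), (4, 8), (10, 100)]:
    print(m, n, page_entropy(m, n), page_approx(m, n))
```

For two qubits ($m=n=2$) the exact sum is $\frac{1}{3}+\frac{1}{4}-\frac{1}{4}=\frac{1}{3}$ nat, a convenient sanity check.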
## References
- [The random Matrix Theory of the Classical Compact groups](https://case.edu/artsci/math/esmeckes/Haar_book.pdf)
- [Page's conjecture](https://journals.aps.org/prl/pdf/10.1103/PhysRevLett.71.1291)
- [Page's conjecture simple proof](https://journals.aps.org/pre/pdf/10.1103/PhysRevE.52.5653)
- [Geometry of quantum states an introduction to quantum entanglement second edition](https://www.cambridge.org/core/books/geometry-of-quantum-states/46B62FE3F9DA6E0B4EDDAE653F61ED8C)

@@ -0,0 +1,299 @@
# Math 401 Paper 1, Side note 3: Levy's concentration theorem
Our goal is to prove the generalized version of Levy's concentration theorem used in Hayden's work for $\eta$-Lipschitz functions.
Let $f:S^{n-1}\to \mathbb{R}$ be an $\eta$-Lipschitz function. Let $M_f$ denote the median of $f$ and $\langle f\rangle$ denote the mean of $f$. (Note this can be generalized to many other manifolds.)
Select a random point $x\in S^{n-1}$ with $n>2$ according to the uniform measure (Haar measure). Then the probability of observing a value of $f$ much different from the reference value is exponentially small.
$$
\operatorname{Pr}[|f(x)-M_f|>\epsilon]\leq \exp(-\frac{n\epsilon^2}{2\eta^2})
$$
$$
\operatorname{Pr}[|f(x)-\langle f\rangle|>\epsilon]\leq 2\exp(-\frac{(n-1)\epsilon^2}{2\eta^2})
$$
> This version of Levy's concentration theorem can be found in [Geometry of Quantum states](https://www.cambridge.org/core/books/geometry-of-quantum-states/46B62FE3F9DA6E0B4EDDAE653F61ED8C) 15.84 and 15.85.
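A quick numerical illustration of the mean version of the bound, as a sketch only: it uses the coordinate function $f(x)=x_1$, which is 1-Lipschitz with mean $0$ on the sphere, samples uniform points by normalizing Gaussian vectors, and all function names are hypothetical.

```python
import numpy as np

def uniform_sphere(dim, samples, rng):
    """Uniform points on the unit sphere S^(dim-1) in R^dim, by normalizing Gaussian vectors."""
    g = rng.standard_normal((samples, dim))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def empirical_tail(dim, eps, samples=20000, seed=0):
    """Estimate Pr[|f(x) - <f>| > eps] for f(x) = x_1, whose mean is 0."""
    x = uniform_sphere(dim, samples, np.random.default_rng(seed))
    return float(np.mean(np.abs(x[:, 0]) > eps))

def levy_bound(dim, eps, eta=1.0):
    """Right-hand side of the mean version of the bound, with n = dim."""
    return float(2 * np.exp(-(dim - 1) * eps**2 / (2 * eta**2)))

for eps in (0.1, 0.2, 0.3):
    print(eps, empirical_tail(200, eps), levy_bound(200, eps))
```

Even for a moderate dimension such as $n=200$, the observed tail is far below the bound, which reflects that the inequality is not tight for this particular $f$.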
## Basic definitions
### Lipschitz function
#### $\eta$-Lipschitz function
Let $(X,\operatorname{dist}_X)$ and $(Y,\operatorname{dist}_Y)$ be two metric spaces. A function $f:X\to Y$ is said to be Lipschitz if there exists a constant $L\geq 0$ such that
$$
\operatorname{dist}_Y(f(x),f(y))\leq L\operatorname{dist}_X(x,y)
$$
for all $x,y\in X$. The smallest such constant, $\eta=\|f\|_{\operatorname{Lip}}=\inf L$, is the Lipschitz constant of $f$, and $f$ is then said to be $\eta$-Lipschitz.
That basically means that the function $f$ does not stretch the distance between any two points of $X$ by more than a factor of $\eta$.
## Levy's concentration theorem in _High-dimensional probability_ by Roman Vershynin
### Levy's concentration theorem (Vershynin's version)
> This theorem is exactly 5.1.4 in _High-dimensional probability_ by Roman Vershynin.
#### Isoperimetric inequality on $\mathbb{R}^n$
Among all subsets $A\subset \mathbb{R}^n$ with a given volume, the Euclidean ball has the minimal surface area.
That is, for any $\epsilon>0$, Euclidean balls minimize the volume of the $\epsilon$-neighborhood of $A$,
where the $\epsilon$-neighborhood of $A$ is defined as
$$
A_\epsilon\coloneqq \{x\in \mathbb{R}^n: \exists y\in A, \|x-y\|_2\leq \epsilon\}=A+\epsilon B_2^n
$$
Here $\|\cdot\|_2$ is the Euclidean norm and $B_2^n$ is the Euclidean unit ball. (The analogous statement on the sphere holds with respect to both the geodesic metric and the Euclidean metric.)
#### Isoperimetric inequality on the sphere
Let $\sigma_n(A)$ denote the normalized area of $A$ on the $n$-dimensional sphere $S^n$. That is $\sigma_n(A)\coloneqq\frac{\operatorname{Area}(A)}{\operatorname{Area}(S^n)}$.
Let $\epsilon>0$. Then among all subsets $A\subset S^n$ with a given area $\sigma_n(A)$, the spherical caps minimize the area of the $\epsilon$-neighborhood of $A$.
> The above two inequalities are not proved in the book _High-dimensional probability_. But you can find them in Appendix C of Gromov's book _Metric Structures for Riemannian and Non-Riemannian Spaces_.
To continue proving the theorem, we use sub-Gaussian concentration *(Chapter 3 of _High-dimensional probability_ by Roman Vershynin)* on the sphere $\sqrt{n}S^n$.
This leads to a constant $c>0$ such that the following lemma holds:
#### The "Blow-up" lemma
Let $A$ be a subset of the sphere $\sqrt{n}S^n$, and let $\sigma$ denote the normalized area measure on it. If $\sigma(A)\geq \frac{1}{2}$, then for every $t\geq 0$,
$$
\sigma(A_t)\geq 1-2\exp(-ct^2)
$$
where $A_t=\{x\in \sqrt{n}S^n: \operatorname{dist}(x,A)\leq t\}$ and $c$ is some positive constant.
#### Proof of the Levy's concentration theorem
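Proof:
Without loss of generality, we can assume that $\eta=1$. Let $X$ be uniformly distributed on $\sqrt{n}S^n$ and let $M$ denote the median of $f(X)$.
So $\operatorname{Pr}[f(X)\leq M]\geq \frac{1}{2}$, and $\operatorname{Pr}[f(X)\geq M]\geq \frac{1}{2}$.
Consider the sub-level set $A\coloneqq \{x\in \sqrt{n}S^n: f(x)\leq M\}$.
Since $\operatorname{Pr}[X\in A]\geq \frac{1}{2}$, by the blow-up lemma, we have
$$
\operatorname{Pr}[X\in A_t]\geq 1-2\exp(-ct^2)
$$
Every $x\in A_t$ lies within distance $t$ of some $y\in A$, so $f(x)\leq f(y)+t\leq M+t$ because $f$ is 1-Lipschitz. Hence
$$
\operatorname{Pr}[X\in A_t]\leq \operatorname{Pr}[f(X)\leq M+t]
$$
Combining the above two inequalities, we have
$$
\operatorname{Pr}[f(X)\leq M+t]\geq 1-2\exp(-ct^2)
$$
Applying the same argument to $-f$ bounds $\operatorname{Pr}[f(X)\geq M-t]$ in the same way, and a union bound gives $\operatorname{Pr}[|f(X)-M|>t]\leq 4\exp(-ct^2)$.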
## Levy's concentration theorem in _Metric Structures for Riemannian and Non-Riemannian Spaces_ by M. Gromov
### Levy's concentration theorem (Gromov's version)
> The Levy's lemma can also be found in _Metric Structures for Riemannian and Non-Riemannian Spaces_ by M. Gromov. $3\frac{1}{2}.19$ The Levy concentration theory.
#### Theorem $3\frac{1}{2}.19$ Levy concentration theorem:
An arbitrary 1-Lipschitz function $f:S^n\to \mathbb{R}$ concentrates near a single value $a_0\in \mathbb{R}$ as strongly as the distance function does.
That is
$$
\mu\{x\in S^n: |f(x)-a_0|\geq\epsilon\} < \kappa_n(\epsilon)\leq 2\exp(-\frac{(n-1)\epsilon^2}{2})
$$
where
$$
\kappa_n(\epsilon)=\frac{\int_\epsilon^{\frac{\pi}{2}}\cos^{n-1}(t)dt}{\int_0^{\frac{\pi}{2}}\cos^{n-1}(t)dt}
$$
$a_0$ is the **Levy mean** of the function $f$, that is, the level set $f^{-1}(a_0)$ divides the sphere into two halves of equal measure. It is characterized by the following inequalities:
$$
\mu(f^{-1}(-\infty,a_0])\geq \frac{1}{2} \text{ and } \mu(f^{-1}[a_0,\infty))\geq \frac{1}{2}
$$
A direct computation may produce the bound, but M. Gromov does not give a detailed explanation here.
> Detailed proof by Takashi Shioya.
>
> The central idea is to draw the connection between the given three topological spaces, $S^{2n+1}$, $CP^n$ and $\mathbb{R}$.
First, we need to introduce the following distribution and lemmas/theorems:
**OBSERVATION**
Consider the orthogonal projection from $\mathbb{R}^{n+1}$, the space in which $S^n$ is embedded, to $\mathbb{R}^k$. We denote the restriction of the projection to the sphere as $\pi_{n,k}:S^n(\sqrt{n})\to \mathbb{R}^k$. Note that $\pi_{n,k}$ is a 1-Lipschitz function (a projection never increases the distance between two points).
We denote the normalized Riemannian volume measure on $S^n(\sqrt{n})$ as $\sigma^n(\cdot)$, and $\sigma^n(S^n(\sqrt{n}))=1$.
#### Definition of Gaussian measure on $\mathbb{R}^k$
We denote the Gaussian measure on $\mathbb{R}^k$ as $\gamma^k$.
$$
d\gamma^k(x)\coloneqq\frac{1}{\sqrt{2\pi}^k}\exp(-\frac{1}{2}\|x\|^2)dx
$$
Here $x\in \mathbb{R}^k$, $\|x\|^2=\sum_{i=1}^k x_i^2$ is the squared Euclidean norm, and $dx$ is the Lebesgue measure on $\mathbb{R}^k$.
Basically, you can consider the Gaussian measure as the distribution of a random vector in $\mathbb{R}^k$ whose coordinates are independent normal variables with mean $0$ and standard deviation $1$.
#### Maxwell-Boltzmann distribution law
> It is such a wonderful fact for me that the projection of the sphere $S^n(\sqrt{n})\subset \mathbb{R}^{n+1}$ of radius $\sqrt{n}$ onto $\mathbb{R}^k$ converges to a Gaussian distribution as $n\to \infty$.
For any natural number $k$,
$$
\frac{d(\pi_{n,k})_*\sigma^n(x)}{dx}\to \frac{d\gamma^k(x)}{dx}
$$
where $(\pi_{n,k})_*\sigma^n$ is the push-forward measure of $\sigma^n$ by $\pi_{n,k}$.
In other words,
$$
(\pi_{n,k})_*\sigma^n\to \gamma^k\text{ weakly as }n\to \infty
$$
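The convergence can be observed numerically. The sketch below is an illustration only: it samples uniform points on $S^n(\sqrt{n})$ by rescaling normalized Gaussian vectors, keeps the first $k$ coordinates, and the function name `projected_sphere_samples` is hypothetical.

```python
import numpy as np

def projected_sphere_samples(n, k, samples, seed=0):
    """Uniform points on S^n(sqrt(n)) in R^(n+1), projected onto the first k coordinates."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((samples, n + 1))
    x = np.sqrt(n) * g / np.linalg.norm(g, axis=1, keepdims=True)
    return x[:, :k]

y = projected_sphere_samples(n=400, k=2, samples=10000)
# For a standard Gaussian the coordinate variances are 1
# and about 68.3% of the mass of one coordinate lies in (-1, 1).
print("coordinate variances:", y.var(axis=0))
print("fraction with |y_1| < 1:", np.mean(np.abs(y[:, 0]) < 1))
```

For finite $n$ each coordinate has variance $\frac{n}{n+1}$, so the agreement with the Gaussian improves as $n$ grows.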
<details>
<summary>Proof</summary>
We denote the $k$-dimensional volume measure as $\operatorname{vol}_k$.
Observe that $\pi_{n,k}^{-1}(x),x\in \mathbb{R}^k$ is isometric to $S^{n-k}(\sqrt{n-\|x\|^2})$, that is, for any $x\in \mathbb{R}^k$, $\pi_{n,k}^{-1}(x)$ is a sphere with radius $\sqrt{n-\|x\|^2}$ (by the definition of $\pi_{n,k}$).
So,
$$
\begin{aligned}
\frac{d(\pi_{n,k})_*\sigma^n(x)}{dx}&=\frac{\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(x))}{\operatorname{vol}_n(S^n(\sqrt{n}))}\\
&=\frac{(n-\|x\|^2)^{\frac{n-k}{2}}}{\int_{\|x\|\leq \sqrt{n}}(n-\|x\|^2)^{\frac{n-k}{2}}dx}\\
\end{aligned}
$$
Note that $\lim_{n\to \infty}{(1-\frac{a}{n})^n}=e^{-a}$ for any $a>0$, so for fixed $k$ and $x$,
$(n-\|x\|^2)^{\frac{n-k}{2}}=n^{\frac{n-k}{2}}\left(1-\frac{\|x\|^2}{n}\right)^{\frac{n-k}{2}}$ with $\left(1-\frac{\|x\|^2}{n}\right)^{\frac{n-k}{2}}\to \exp(-\frac{\|x\|^2}{2})$ as $n\to \infty$.
The factor $n^{\frac{n-k}{2}}$ is common to the numerator and the denominator and cancels, so as $n\to \infty$
$$
\begin{aligned}
\frac{(n-\|x\|^2)^{\frac{n-k}{2}}}{\int_{\|x\|\leq \sqrt{n}}(n-\|x\|^2)^{\frac{n-k}{2}}dx}&\to\frac{e^{-\frac{\|x\|^2}{2}}}{\int_{x\in \mathbb{R}^k}e^{-\frac{\|x\|^2}{2}}dx}\\
&=\frac{1}{(2\pi)^{\frac{k}{2}}}e^{-\frac{\|x\|^2}{2}}\\
&=\frac{d\gamma^k(x)}{dx}
\end{aligned}
$$
QED
</details>
#### Proof of the Levy's concentration theorem via the Maxwell-Boltzmann distribution law
We use the Maxwell-Boltzmann distribution law and Levy's isoperimetric inequality to prove the Levy's concentration theorem.
The goal is the same as in Gromov's version: first we bound the measure of the set where $f$ deviates from $a_0$ by the function $\kappa_n(\epsilon)$, using Levy's isoperimetric inequality. Then we claim that $\kappa_n(\epsilon)$ is bounded by a Gaussian tail.
Note, this section is not yet rigorous in the mathematical sense; sections about Levy families and observable diameter should be added to make the proof more rigorous and understandable.
<details>
<summary>Proof</summary>
Let $f:S^n\to \mathbb{R}$ be a 1-Lipschitz function.
Consider the two sets of points on the sphere $S^n$ with radius $\sqrt{n}$:
$$
\Omega_+=\{x\in S^n: f(x)\leq a_0-\epsilon\}, \Omega_-=\{x\in S^n: f(x)\geq a_0+\epsilon\}
$$
Note that $\Omega_+\cup \Omega_-=\{x\in S^n(\sqrt{n}): |f(x)-a_0|\geq\epsilon\}$ is exactly the set whose measure we want to bound.
By the Levy's isoperimetric inequality, we have
$$
\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\epsilon))\leq \operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\Omega_+))+\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\Omega_-))
$$
We define $\kappa_n(\epsilon)$ as the following:
$$
\kappa_n(\epsilon)=\frac{\operatorname{vol}_{n-k}(\pi_{n,k}^{-1}(\epsilon))}{\operatorname{vol}_n(S^n(\sqrt{n}))}=\frac{\int_\epsilon^{\frac{\pi}{2}}\cos^{n-1}(t)dt}{\int_0^{\frac{\pi}{2}}\cos^{n-1}(t)dt}
$$
By the Levy's isoperimetric inequality, and the Maxwell-Boltzmann distribution law, we have
$$
\mu\{x\in S^n: |f(x)-a_0|\geq\epsilon\} < \kappa_n(\epsilon)\leq 2\exp(-\frac{(n-1)\epsilon^2}{2})
$$
</details>
## Levy's Isoperimetric inequality
> This section is from the Appendix $C_+$ of Gromov's book _Metric Structures for Riemannian and Non-Riemannian Spaces_.
Not very digestible for undergraduates.
## Crash course on Riemannian manifolds
> This part might be extended to a separate note, let's check how far we can go from this part.
>
> References:
>
> - [Riemannian Geometry by John M. Lee](https://www.amazon.com/Introduction-Riemannian-Manifolds-Graduate-Mathematics/dp/3319917544?dib=eyJ2IjoiMSJ9.88u0uIXulwPpi3IjFn9EdOviJvyuse9V5K5wZxQEd6Rto5sCIowzEJSstE0JtQDW.QeajvjQEbsDmnEMfPzaKrfVR9F5BtWE8wFscYjCAR24&dib_tag=se&keywords=riemannian+manifold+by+john+m+lee&qid=1753238983&sr=8-1)
### Riemannian manifolds
A Riemannian manifold is a smooth manifold equipped with a **Riemannian metric**, which is a smooth assignment of an inner product to each tangent space $T_pM$ of the manifold.
An example of a Riemannian manifold is the complex projective space $\mathbb{C}P^n$ (with the Fubini-Study metric).
### Riemannian metric
A Riemannian metric is a smooth assignment of an inner product to each tangent space $T_pM$ of the manifold.
An example of Riemannian metric is the Euclidean metric on $\mathbb{R}^n$.
### Notion of Connection
A connection is a way to define the directional derivative of a vector field along a curve on a Riemannian manifold.
Suppose the manifold is $M=\mathbb{R}^n$, and let $X=(f_1,\cdots,f_n)$ be a vector field on $M$. For every $p\in M$ and every direction $V\in \mathbb{R}^n$, the directional derivative of $X$ at the point $p$ in the direction $V$ is defined as
$$
D_VX(p)=\lim_{h\to 0}\frac{X(p+hV)-X(p)}{h}
$$
### Nabla notation and Levi-Civita connection
### Ricci curvature
## References
- [High-dimensional probability by Roman Vershynin](https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-2.pdf)
- [Metric Structures for Riemannian and Non-Riemannian Spaces by M. Gromov](https://www.amazon.com/Structures-Riemannian-Non-Riemannian-Progress-Mathematics/dp/0817638989/ref=tmm_hrd_swatch_0?_encoding=UTF8&dib_tag=se&dib=eyJ2IjoiMSJ9.Tp8dXvGbTj_D53OXtGj_qOdqgCgbP8GKwz4XaA1xA5PGjHj071QN20LucGBJIEps.9xhBE0WNB0cpMfODY5Qbc3gzuqHnRmq6WZI_NnIJTvc&qid=1750973893&sr=8-1)
- [Metric Measure Geometry by Takashi Shioya](https://arxiv.org/pdf/1410.0428)

@@ -0,0 +1,276 @@
# Math401 Topic 1: Probability under language of measure theory
## Section 1: Uniform Random Numbers
### Basic Definitions
#### Definition of Random Variables
A random variable is a function $f:[0,1]\to S$, where $[0,1]\subset \mathbb{R}$ and $S$ is a set of potential outcomes of a random phenomenon.
#### Definition of Uniform Distribution
The uniform distribution on $[0,1]$ uses the length of subsets of $[0,1]$ as the measure of probability ([Lebesgue measure](https://notenextra.trance-0.com/Math4121/Math4121_L30#lebesgue-measure) by default).
Let $X$ be a random number taken from $[0,1]$ and having the uniform distribution. The probability that $X$ lies in a measurable subset $A\subset [0,1]$ is the length of $A$:
$$
\operatorname{Prob}(X\in A) =\lambda(A)=\text{length of }A
$$
#### Definition of Expectation
Let $f:[0,1]\to \mathbb{R}$ be a random variable (with nice properties such that it is integrable). Then the expectation of $f$ is defined as
$$
\mathbb{E}[f]=\mathbb{E}[f(X)]=\int_0^1 f(x)dx
$$
#### Definition of Indicator Function
The indicator function of an event $A$ is defined as
$$
\mathbb{I}_A(x)=\begin{cases}
1 & \text{if } x\in A \\
0 & \text{if } x\notin A
\end{cases}
$$
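The definitions of expectation and indicator function fit together: the expectation of $\mathbb{I}_A(X)$ is $\operatorname{Prob}(X\in A)=\lambda(A)$. The following Monte Carlo sketch illustrates this with NumPy; the function name `expectation`, the sample size, and the test functions are assumptions chosen for the example.

```python
import numpy as np

def expectation(f, samples=200_000, seed=0):
    """Monte Carlo estimate of E[f(X)] = integral of f over [0, 1] for uniform X."""
    x = np.random.default_rng(seed).random(samples)
    return float(np.mean(f(x)))

# E[X^2] is the integral of x^2 over [0, 1], which equals 1/3.
print(expectation(lambda x: x**2))
# The expectation of the indicator of A = [0, 0.25) is the length of A.
print(expectation(lambda x: (x < 0.25).astype(float)))
```

The estimates fluctuate with the seed, with an error that shrinks like one over the square root of the number of samples.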
#### Definition of Law of variable X
The law of a random variable $X$ is the probability distribution of $X$.
Let $Y$ be the outcome of $f(X)$. Then the law of $Y$ is the probability distribution of $Y$.
$$
\mu_Y(A)=\lambda(f^{-1}(A))=\lambda(\{x\in [0,1]: f(x)\in A\})
$$
### 1.1 Mathematical Coin Flip model
A coin flip is a random experiment with two possible outcomes, $S=\{0,1\}$, with probability $p$ for $0$ and $1-p$ for $1$, where $p\in (0,1)\subset \mathbb{R}$.
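In the language of Section 1, such a coin can be realized as a function on $[0,1]$ whose law is computed from the Lebesgue measure of preimages. The sketch below makes one possible choice, $f(x)=0$ on $[0,p)$ and $f(x)=1$ on $[p,1]$; this choice and the names `coin` and `empirical_law` are assumptions for illustration.

```python
import numpy as np

def coin(x, p):
    """Random variable f:[0,1] -> {0,1}: outcome 0 on [0, p), outcome 1 on [p, 1]."""
    return np.where(x < p, 0, 1)

def empirical_law(p, samples=100_000, seed=0):
    """Estimate mu_Y({0}) and mu_Y({1}) by sampling X uniformly from [0, 1)."""
    y = coin(np.random.default_rng(seed).random(samples), p)
    return {0: float(np.mean(y == 0)), 1: float(np.mean(y == 1))}

print(empirical_law(0.3))
```

With this choice $f^{-1}(\{0\})=[0,p)$ has length $p$, so the law of $Y=f(X)$ assigns probability $p$ to $0$ and $1-p$ to $1$.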
#### Definition of Independent Events
Two events $A$ and $B$ are independent if
$$
\lambda(A\cap B)=\lambda(A)\lambda(B)
$$
or equivalently,
$$
\operatorname{Prob}(X\in A\cap B)=\operatorname{Prob}(X\in A)\operatorname{Prob}(X\in B)
$$
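Generalization to $n$ events: $A_1,A_2,\cdots,A_n$ are independent if the product rule holds for every subcollection of them; in particular,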
$$
\lambda(A_1\cap A_2\cap \cdots \cap A_n)=\lambda(A_1)\lambda(A_2)\cdots \lambda(A_n)
$$
#### Definition of Outcome selecting function
Let the set of all possible outcomes be represented by the Cartesian product $S=\{0,1\}^{\mathbb{N}}$. An element $(a_1,a_2,a_3,\cdots)\in S$ is an infinite sequence of coin flips.
$\pi_i:S\to \{0,1\}$ is the $i$-th projection function defined as $\pi_i((a_1,a_2,a_3,\cdots))=a_i$.
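> Note, this representation corresponds to binary expansions of numbers in the interval $[0,1]$: the sequence $(a_1,a_2,a_3,\cdots)$ is sent to $\sum_{i=1}^{\infty}a_i2^{-i}$. The correspondence is one-to-one except at the dyadic rationals (i.e., numbers that can be written as a fraction whose denominator is a power of 2), which have two binary expansions.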
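The link between coin flips and $[0,1]$ can be tested empirically: the binary digits of a uniform random number behave like independent fair coin flips. This is a sketch with NumPy, and the helper name `binary_digit` is hypothetical.

```python
import numpy as np

def binary_digit(x, i):
    """i-th binary digit (i = 1, 2, ...) of x in [0, 1)."""
    return np.floor(x * 2**i).astype(int) % 2

x = np.random.default_rng(0).random(200_000)
d1, d2 = binary_digit(x, 1), binary_digit(x, 2)
# Compare the joint frequency with the product of the marginal frequencies.
print(np.mean(d1), np.mean(d2), np.mean((d1 == 1) & (d2 == 1)))
```

The joint frequency of $\{d_1=1\}\cap\{d_2=1\}$ should be close to the product of the two marginal frequencies, matching the definition of independent events.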
## Section 2: Formal definitions
> Recall, a $\sigma$-algebra (denoted as $\mathcal{A}$ in Math4121) is a collection of subsets of a set $S$ satisfying the following properties:
>
> 1. $\emptyset\in \mathcal{A}$ (empty set is in the $\sigma$-algebra)
> 2. If $A\in \mathcal{A}$, then $A^c\in \mathcal{A}$ (if a set is in the $\sigma$-algebra, then its complement is in the $\sigma$-algebra)
> 3. If $A_1,A_2,A_3,\cdots\in \mathcal{A}$, then $\bigcup_{i=1}^{\infty}A_i\in \mathcal{A}$ (if a countable sequence of sets is in the $\sigma$-algebra, then their union is in the $\sigma$-algebra)
### Event, probability, and random variable
Let $\Omega$ be a non-empty set.
Let $\mathscr{F}$ be a $\sigma$-algebra on $\Omega$ (Note, $\mathscr{F}$ is a collection of subsets of $\Omega$ that satisfies the properties of a $\sigma$-algebra).
#### Definition of Event
An event is an element of $\mathscr{F}$.
#### Definition of Probability Measure
A probability measure $P$ is a function $P:\mathscr{F}\to [0,1]$ satisfying the following properties:
1. $P(\Omega)=1$
2. If $A_1,A_2,A_3,\cdots\in \mathscr{F}$ are pairwise disjoint ($\forall i\neq j, A_i\cap A_j=\emptyset$), then $P(\bigcup_{i=1}^{\infty}A_i)=\sum_{i=1}^{\infty}P(A_i)$
#### Definition of Probability Space
A probability space is a triple $(\Omega, \mathscr{F}, P)$ defined above.
An event $A$ is said to occur almost surely (a.s.) if $P(A)=1$.
#### Definition of Random Variable
A random variable is a function $X:\Omega\to \mathbb{R}$ that is measurable with respect to the $\sigma$-algebra $\mathscr{F}$.
That is, for any Borel set $B\subset \mathbb{R}$, the preimage $X^{-1}(B)\in \mathscr{F}$.
$$
X^{-1}(B)=\{\omega\in \Omega: X(\omega)\in B\}\in \mathscr{F}
$$
#### Definition of sigma-algebra generated by a random variable
Let $\{f_\alpha:\Omega\to \mathbb{R},\alpha\in I\}$ be a family of functions where $I$ is an index set which is not necessarily finite or countable. The $\sigma$-algebra generated by the family of functions $\{f_\alpha:\alpha\in I\}$, denoted as $\sigma\{f_\alpha:\alpha\in I\}$, is the smallest $\sigma$-algebra containing all the subsets of $\Omega$ of the form
$$
f_\alpha^{-1}(B)=\{\omega\in \Omega: f_\alpha(\omega)\in B\}\in \mathscr{F}
$$
for all $\alpha\in I$ and $B\in \mathscr{B}(\mathbb{R})$.
Equivalently,
$$
\sigma\{f_\alpha:\alpha\in I\}=\sigma\left(\{f_\alpha^{-1}(B): \alpha\in I, B\in \mathscr{B}(\mathbb{R})\}\right)
$$
that is, it is the intersection of all $\sigma$-algebras on $\Omega$ containing the sets $f_\alpha^{-1}(B)$ for all $\alpha\in I$ and $B\in \mathscr{B}(\mathbb{R})$. For a single random variable $X$, take the family consisting of $X$ alone.
#### Definition of distribution of random variable
Let $f:\Omega\to \mathbb{R}$ be a random variable. The distribution of $f$ is the probability measure $P_f$ on $\mathbb{R}$ defined by
$$
P_f(B)=P(f^{-1}(B))=P(\{x\in \Omega: f(x)\in B\})
$$
also noted as $f_*P$.
#### Definition of joint distribution of random variables
Let $f_1,f_2,\cdots,f_n:\Omega\to \mathbb{R}$ be random variables. The joint distribution of $f_1,f_2,\cdots,f_n$ is the probability measure $P_{f_1,f_2,\cdots,f_n}$ on $\mathbb{R}^n$ defined by
$$
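P_{f_1,f_2,\cdots,f_n}(B)=P(\{\omega\in \Omega: (f_1(\omega),f_2(\omega),\cdots,f_n(\omega))\in B\})
$$
for every Borel set $B\subset \mathbb{R}^n$. In particular, for a product set $B=B_1\times B_2\times \cdots \times B_n$ this equals $P(f_1^{-1}(B_1)\cap f_2^{-1}(B_2)\cap \cdots \cap f_n^{-1}(B_n))$.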
### Expectation of a random variable
Let $f:\Omega\to \mathbb{R}$ be a random variable. The expectation of $f$ is defined, whenever the integral exists, as
$$
\mathbb{E}[f]=\int_\Omega f(\omega)\,dP(\omega)
$$
Note, $P$ is the probability measure on $\Omega$.
#### Definition of variance
The variance of a random variable $f$ is defined as
$$
\operatorname{Var}(f)=\mathbb{E}[(f-\mathbb{E}[f])^2]=\mathbb{E}[f^2]-(\mathbb{E}[f])^2
$$
#### Definition of covariance
The covariance of two random variables $f,g:\Omega\to \mathbb{R}$ is defined as
$$
\operatorname{Cov}(f,g)=\mathbb{E}[(f-\mathbb{E}[f])(g-\mathbb{E}[g])]
$$
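On a finite sample space the integral defining the expectation reduces to a weighted sum, which makes the definitions of expectation, variance and covariance easy to check by hand. The following is a minimal sketch under that assumption; the sample space (two fair coin flips) and the names `expectation`, `variance`, `covariance`, `first` and `total` are made up for illustration.

```python
from fractions import Fraction

# Hypothetical finite probability space: two independent fair coin flips.
omega = [(a, b) for a in (0, 1) for b in (0, 1)]
P = {w: Fraction(1, 4) for w in omega}

def expectation(f):
    """E[f] as a finite sum of f(w) * P({w})."""
    return sum(f(w) * P[w] for w in omega)

def variance(f):
    """Var(f) = E[(f - E[f])^2]."""
    m = expectation(f)
    return expectation(lambda w: (f(w) - m) ** 2)

def covariance(f, g):
    """Cov(f, g) = E[(f - E[f]) (g - E[g])]."""
    mf, mg = expectation(f), expectation(g)
    return expectation(lambda w: (f(w) - mf) * (g(w) - mg))

first = lambda w: w[0]          # outcome of the first flip
total = lambda w: w[0] + w[1]   # number of heads

print(expectation(total))        # 1
print(variance(total))           # 1/2
print(covariance(first, total))  # 1/4
```

Exact rational arithmetic is used so that the identity $\operatorname{Var}(f)=\mathbb{E}[f^2]-(\mathbb{E}[f])^2$ can be compared with `==` rather than up to rounding.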
### Point measures
#### Definition of Dirac measure
Given a point $\omega\in \Omega$, the Dirac measure $\delta_\omega$ at $\omega$ is the probability measure on $\Omega$ defined as
$$
\delta_\omega(A)=\begin{cases}
1 & \text{if } \omega\in A \\
0 & \text{if } \omega\notin A
\end{cases}
$$
Note that $\int_\Omega f(x)d\delta_\omega(x)=f(\omega)$.
### Infinite sequence of independent coin flips
> Side notes from basic topology:
>
> **Definition of product set** (the set underlying the product topology):
>
> It is the Cartesian product of the sets. Suppose $X_i$ is a set for each $i\in I$. An element of the product set $\prod_{i\in I}X_i$ is a tuple $(x_i)_{i\in I}$ where $x_i\in X_i$ for all $i\in I$.
>
> For example, if $X_i=[0,1]$ for all $i\in \mathbb{N}$, then the product set is $[0,1]^{\mathbb{N}}$. An element of such product set is $(1,0.5,0.25,\cdots)$.
The set of outcomes of an infinite sequence of coin flips is the product of the sets of outcomes of the individual coin flips.
$$
S=\{0,1\}^{\mathbb{N}}
$$
### Conditional probability
#### Definition of conditional probability
The conditional probability of an event $A$ given an event $B$ with $P(B)>0$ is defined as
$$
P(A|B)=\frac{P(A\cap B)}{P(B)}
$$
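The law of total probability: if $\{B_i\}_{i\geq 1}$ is a countable partition of $\Omega$ into events with $P(B_i)>0$, then for any event $A$,
$$
P(A)=\sum_{i=1}^{\infty}P(A|B_i)P(B_i)
$$
Bayes' theorem: for a partition $\{B_i\}_{i\geq 1}$ as above and any event $A$ with $P(A)>0$,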
$$
P(B_i|A)=\frac{P(A|B_i)P(B_i)}{\sum_{j=1}^{\infty}P(A|B_j)P(B_j)}
$$
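A small numerical instance shows how the law of total probability supplies the denominator in Bayes' theorem. This is only a sketch: the three-set partition and the numbers in `prior` and `likelihood` are invented for illustration.

```python
from fractions import Fraction

# Hypothetical partition B_1, B_2, B_3 with priors P(B_i) and likelihoods P(A|B_i).
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihood = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]

# Law of total probability: P(A) = sum_i P(A|B_i) P(B_i).
p_a = sum(l * p for l, p in zip(likelihood, prior))

# Bayes' theorem: P(B_i|A) = P(A|B_i) P(B_i) / P(A).
posterior = [l * p / p_a for l, p in zip(likelihood, prior)]

print(p_a)              # 19/50
print(sum(posterior))   # 1
```

The posteriors sum to one because the $B_i$ form a partition, which is a quick sanity check on any hand computation.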
#### Definition of independence of random variables
Two random variables $f,g:\Omega\to \mathbb{R}$ are independent if for any Borel sets $A,B\in \mathscr{B}(\mathbb{R})$ the events
$$
\{\omega\in \Omega: f(\omega)\in A\}\text{ and } \{\omega\in \Omega: g(\omega)\in B\}
$$
are independent.
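In general, a finite or infinite family of random variables $\{f_\alpha:\Omega\to \mathbb{R}\}_{\alpha\in I}$ is independent if, for every finite collection of distinct indices $\alpha_1,\cdots,\alpha_n\in I$ and all Borel sets $B_1,\cdots,B_n$, $P\left(\bigcap_{k=1}^n f_{\alpha_k}^{-1}(B_k)\right)=\prod_{k=1}^n P\left(f_{\alpha_k}^{-1}(B_k)\right)$.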
#### Definition of independence of sigma-algebras
Let $\mathscr{G}$ and $\mathscr{H}$ be two sub-$\sigma$-algebras of $\mathscr{F}$ on $\Omega$. They are independent if for all events $A\in \mathscr{G}$ and $B\in \mathscr{H}$, $P(A\cap B)=P(A)P(B)$.
## Section 3: Further definitions in measure theory and integration
### $L^2$ space
#### Definition of $L^2$ space
Let $(\Omega, \mathscr{F}, P)$ be a measure space. The $L^2$ space, denoted by $L^2(\Omega, \mathscr{F}, P)$, is the space of all square integrable, complex-valued measurable functions on $\Omega$, where functions that agree almost everywhere are identified.
The square integrable functions are the functions $f:\Omega\to \mathbb{C}$ such that
$$
\int_\Omega |f(\omega)|^2 dP(\omega)<\infty
$$
With inner product defined by
$$
\langle f,g\rangle=\int_\Omega \overline{f(\omega)}g(\omega)dP(\omega)
$$
The $L^2(\Omega, \mathscr{F}, P)$ space is a Hilbert space.

View File

@@ -0,0 +1,812 @@
# Math401 Topic 2: Finite-dimensional Hilbert spaces
Recall that a complex number is a pair of real numbers, $z=(a,b)=a+bi$, with addition and multiplication defined by
$$
(a,b)+(c,d)=(a+c,b+d)
$$
$$
(a,b)\cdot(c,d)=(ac-bd,ad+bc)
$$
or in polar form,
$$
z=re^{i\theta}=r(\cos\theta+i\sin\theta)
$$
where $r=\sqrt{a^2+b^2}=\sqrt{z\overline{z}}$ and, for $a>0$, $\theta=\tan^{-1}(b/a)$ (in general $\theta$ is the argument of $z$, determined up to multiples of $2\pi$).
The complex conjugate of $z$ is $\overline{z}=(a,-b)$.
## Section 1: Finite-dimensional Complex Vector Spaces
Here, the field $\mathbb{F}$ is either the field $\mathbb{C}$ of complex numbers or the field $\mathbb{R}$ of real numbers.
### Definition of vector space
A vector space $\mathscr{V}$ over a field $\mathbb{F}$ is a set equipped with an **addition** and a **scalar multiplication**, satisfying the following axioms:
1. Addition is associative and commutative. For all $u,v,w\in \mathscr{V}$,
Associativity:
$$
(u+v)+w=u+(v+w)
$$
Commutativity:
$$
u+v=v+u
$$
2. Additive identity: There exists an element $0\in \mathscr{V}$ such that $v+0=v$ for all $v\in \mathscr{V}$.
3. Additive inverse: For each $v\in \mathscr{V}$, there exists an element $-v\in \mathscr{V}$ such that $v+(-v)=0$.
4. Multiplicative identity: There exists an element $1\in \mathbb{F}$ such that $v\cdot 1=v$ for all $v\in \mathscr{V}$.
5. Compatibility of scalar multiplication: For all $v\in \mathscr{V}$ and $c,d\in \mathbb{F}$, $(cd)v=c(dv)$.
6. Distributivity: For all $u,v\in \mathscr{V}$ and $c,d\in \mathbb{F}$,
$$
c(u+v)=cu+cv,\qquad (c+d)v=cv+dv
$$
A vector is an element of a vector space.
For example, if $\mathscr{V}=\mathbb{C}^n$, $n\in \mathbb{N}$, then $u=(a_1,a_2,\cdots,a_n), v=(b_1,b_2,\cdots,b_n)\in \mathbb{C}^n$ are vectors.
The addition and scalar multiplication are defined by
$$
u+v=(a_1+b_1,a_2+b_2,\cdots,a_n+b_n)
$$
$$
cu=(ca_1,ca_2,\cdots,ca_n)
$$
where $c\in \mathbb{C}$.
The matrix transpose is defined by
$$
u^T=(a_1,a_2,\cdots,a_n)^T=\begin{pmatrix}
a_1 \\
a_2 \\
\vdots \\
a_n
\end{pmatrix}
$$
The complex conjugate transpose is defined by
$$
u^*=(a_1,a_2,\cdots,a_n)^*=\begin{pmatrix}
\overline{a_1} \\
\overline{a_2} \\
\vdots \\
\overline{a_n}
\end{pmatrix}
$$
> In physics, the complex conjugate is sometimes denoted by $z^*$ instead of $\overline{z}$.
> The complex conjugate transpose is sometimes denoted by $u^\dagger$ instead of $u^*$.
### Hermitian inner product and norms
On $\mathbb{C}^n$, the Hermitian inner product is defined by
$$
\langle u,v\rangle=\sum_{i=1}^n \overline{u_i}v_i
$$
The norm is defined by
$$
\|u\|=\sqrt{\langle u,u\rangle}
$$
#### Definition of Inner product
Let $\mathscr{H}$ be a complex vector space. An inner product on $\mathscr{H}$ is a function $\langle \cdot, \cdot \rangle: \mathscr{H}\times \mathscr{H}\to \mathbb{C}$ satisfying the following axioms:
1. For each $u\in \mathscr{H}$, $v\mapsto \langle u,v\rangle$ is a linear map.
$$
\langle u,av+bw\rangle=a\langle u,v\rangle+b\langle u,w\rangle
$$
For all $u,v,w\in \mathscr{H}$ and $a,b\in \mathbb{C}$.
2. For all $u,v\in \mathscr{H}$, $\langle u,v\rangle=\overline{\langle v,u\rangle}$.
$u\mapsto \langle u,v\rangle$ is a conjugate linear map.
3. $\langle u,u\rangle\geq 0$ and $\langle u,u\rangle=0$ if and only if $u=0$.
#### Definition of norm
Let $\mathscr{H}$ be a complex vector space. A norm on $\mathscr{H}$ is a function $\|\cdot\|: \mathscr{H}\to \mathbb{R}$ satisfying the following axioms:
1. For all $u\in \mathscr{H}$, $\|u\|\geq 0$ and $\|u\|=0$ if and only if $u=0$.
2. For all $u\in \mathscr{H}$ and $c\in \mathbb{C}$, $\|cu\|=|c|\|u\|$.
3. Triangle inequality: For all $u,v\in \mathscr{H}$, $\|u+v\|\leq \|u\|+\|v\|$.
#### Definition of inner product space
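A complex vector space $\mathscr{H}$ with an inner product is called an inner product space. In finite dimensions every inner product space is complete, so it is also called a **Hilbert space**.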
#### Cauchy-Schwarz inequality
For all $u,v\in \mathscr{H}$,
$$
|\langle u,v\rangle|\leq \|u\|\|v\|
$$
#### Parallelogram law
For all $u,v\in \mathscr{H}$,
$$
\|u+v\|^2+\|u-v\|^2=2(\|u\|^2+\|v\|^2)
$$
#### Polarization identity
For all $u,v\in \mathscr{H}$,
$$
\langle u,v\rangle=\frac{1}{4}(\|u+v\|^2-\|u-v\|^2-i\|u+iv\|^2+i\|u-iv\|^2)
$$
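The signs in the polarization identity depend on which slot of the inner product is conjugate linear, so a numerical check is a useful guard. The sketch below assumes the convention used in these notes (conjugate linear in the first argument, linear in the second), for which the identity reads $\langle u,v\rangle=\frac{1}{4}(\|u+v\|^2-\|u-v\|^2-i\|u+iv\|^2+i\|u-iv\|^2)$. The helper names and the sample vectors are illustrative only.

```python
def inner(u, v):
    """Hermitian inner product on C^n, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def add(u, v, c=1):
    """Return u + c*v componentwise."""
    return [a + c * b for a, b in zip(u, v)]

def polarization(u, v):
    """Recover <u, v> from norms only."""
    return (norm_sq(add(u, v)) - norm_sq(add(u, v, -1))
            - 1j * norm_sq(add(u, v, 1j)) + 1j * norm_sq(add(u, v, -1j))) / 4

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

print(inner(u, v))         # (-1-2j)
print(polarization(u, v))  # (-1-2j)
```

With the opposite convention (linear in the first argument) the two imaginary terms swap signs, which is the form found in many mathematics textbooks.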
#### Additional definitions
Let $u,v\in \mathscr{H}$.
$\|v\|$ is the length of $v$.
$v$ is a unit vector if $\|v\|=1$.
$u,v$ are orthogonal if $\langle u,v\rangle=0$.
#### Definition of orthonormal basis
A set of vectors $\{e_1,e_2,\cdots,e_n\}$ in a Hilbert space $\mathscr{H}$ is called an orthonormal basis if
1. $\langle e_i,e_j\rangle=\delta_{ij}$ for all $i,j\in \{1,2,\cdots,n\}$.
$$
\delta_{ij}=\begin{cases}
1 & \text{if } i=j \\
0 & \text{if } i\neq j
\end{cases}
$$
2. $n=\dim \mathscr{H}$.
### Subspaces and orthonormal bases
#### Definition of subspace
A nonempty subset $\mathscr{W}$ of a vector space $\mathscr{V}$ is a subspace if it is closed under addition and scalar multiplication.
#### Definition of orthogonal complement
Let $E$ be a subset of a Hilbert space $\mathscr{H}$. The orthogonal complement of $E$ is the set of all vectors in $\mathscr{H}$ that are orthogonal to every vector in $E$.
$$
E^\perp=\{v\in \mathscr{H}: \langle v,w\rangle=0 \text{ for all } w\in E\}
$$
#### Definition of orthogonal projection
Let $E$ be an $m$-dimensional subspace of a Hilbert space $\mathscr{H}$ with orthonormal basis $\{e_1,e_2,\cdots,e_m\}$. The orthogonal projection onto $E$ is the linear map $P_E: \mathscr{H}\to E$ given by
$$
P_E(v)=\sum_{i=1}^m \langle e_i,v\rangle e_i
$$
#### Definition of orthonormal direct sum
An inner product space $\mathscr{H}$ is the orthogonal direct sum of subspaces $E_1,E_2,\cdots,E_n$, written
$$
\mathscr{H}=E_1\oplus E_2\oplus \cdots \oplus E_n
$$
if $E_i\perp E_j$ for all $i\neq j$ and every $v\in \mathscr{H}$ can be written as $v=v_1+v_2+\cdots+v_n$ with $v_i\in E_i$.
In that case the components $v_i\in E_i$ are unique.
#### Definition of meet and join of subspaces
Let $E$ and $F$ be two subspaces of a Hilbert space $\mathscr{H}$. The meet of $E$ and $F$ is the subspace of $\mathscr{H}$ given by
$$
E\land F=E\cap F
$$
The join of $E$ and $F$ is the subspace of $\mathscr{H}$ given by
$$
E\lor F=\{u+v: u\in E, v\in F\}
$$
### Null space and range
#### Definition of null space
Let $A$ be a linear map from a vector space $\mathscr{V}$ to a vector space $\mathscr{W}$. The null space of $A$ is the set of all vectors in $\mathscr{V}$ that are mapped to the zero vector in $\mathscr{W}$.
$$
\text{Null}(A)=\{v\in \mathscr{V}: Av=0\}
$$
#### Definition of range
Let $A$ be a linear map from a vector space $\mathscr{V}$ to a vector space $\mathscr{W}$. The range of $A$ is the set of all vectors in $\mathscr{W}$ that are mapped from $\mathscr{V}$.
$$
\text{Range}(A)=\{w\in \mathscr{W}: \exists v\in \mathscr{V}, Av=w\}
$$
### Dual spaces and adjoints of linear maps
#### Definition of linear map
A linear map $T: \mathscr{V}\to \mathscr{W}$ is a function that satisfies the following axioms:
1. Additivity: For all $u,v\in \mathscr{V}$,
$$
T(u+v)=T(u)+T(v)
$$
2. Homogeneity: For all $u\in \mathscr{V}$ and $a\in \mathbb{F}$,
$$
T(au)=aT(u)
$$
#### Definition of linear functionals
A linear functional $f: \mathscr{V}\to \mathbb{F}$ is a linear map from $\mathscr{V}$ to $\mathbb{F}$.
Here, $\mathbb{F}$ is the field of complex numbers.
#### Definition of dual space
Let $\mathscr{V}$ be a vector space over a field $\mathbb{F}$. The dual space of $\mathscr{V}$ is the set of all linear functionals on $\mathscr{V}$.
$$
\mathscr{V}^*=\{f:\mathscr{V}\to \mathbb{F}: f\text{ is linear}\}
$$
If $\mathscr{H}$ is a finite-dimensional Hilbert space, then $\mathscr{H}^*$ is isomorphic to $\mathscr{H}$.
Note $v\in \mathscr{H}\mapsto \langle v,\cdot\rangle\in \mathscr{H}^*$ is a conjugate linear isomorphism.
#### Definition of adjoint of a linear map
Let $T: \mathscr{V}\to \mathscr{W}$ be a linear map between finite-dimensional Hilbert spaces. The adjoint of $T$ is the unique linear map $T^*: \mathscr{W}\to \mathscr{V}$ such that
$$
\langle Tv,w\rangle=\langle v,T^*w\rangle
$$
for all $v\in \mathscr{V}$ and $w\in \mathscr{W}$.
#### Definition of self-adjoint operators
A linear operator $T: \mathscr{V}\to \mathscr{V}$ is self-adjoint if $T^*=T$.
#### Definition of unitary operators
A linear map $T: \mathscr{V}\to \mathscr{V}$ is unitary if $T^*T=TT^*=I$.
### Dirac's bra-ket notation
#### Definition of bra and ket
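Let $\mathscr{H}$ be a Hilbert space. The bra-ket notation is a notation for vectors in $\mathscr{H}$, linear functionals on $\mathscr{H}$, and operators built from them.
$$
|v\rangle
$$
(a ket) is the vector $v$, or equivalently the linear map $\mathbb{C}\to \mathscr{H}$, $c\mapsto cv$.
$$
\langle v|
$$
(a bra) is the linear functional $\mathscr{H}\to \mathbb{C}$, $w\mapsto \langle v,w\rangle$.
$$
\langle v|w\rangle
$$
is the inner product of $v$ and $w$, that is, the bra $\langle v|$ applied to the ket $|w\rangle$.
$$
|u\rangle\langle v|
$$
is the linear map from $\mathscr{H}$ to $\mathscr{H}$ sending $w$ to $\langle v,w\rangle u$.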
### The spectral theorem for self-adjoint operators
#### Definition of spectral theorem
Let $\mathscr{H}$ be a finite-dimensional Hilbert space and let $T: \mathscr{H}\to \mathscr{H}$ be a self-adjoint operator, that is, a linear operator equal to its adjoint.
Then all the eigenvalues of $T$ are real and there exists an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $T$.
#### Definition of spectrum
The spectrum of a linear operator on finite-dimensional Hilbert space $T: \mathscr{H}\to \mathscr{H}$ is the set of all distinct eigenvalues of $T$.
$$
\operatorname{sp}(T)=\{\lambda: \lambda\text{ is an eigenvalue of } T\}\subset \mathbb{C}
$$
#### Definition of Eigenspace
If $\lambda$ is an eigenvalue of $T$, the eigenspace of $T$ corresponding to $\lambda$ is the set of all eigenvectors of $T$ corresponding to $\lambda$, together with the zero vector.
$$
E_\lambda(T)=\{v\in \mathscr{H}: Tv=\lambda v\}
$$
We denote $P_\lambda(T):\mathscr{H}\to E_\lambda(T)$ the orthogonal projection onto $E_\lambda(T)$.
#### Definition of Operator norm
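The operator norm of a linear operator $T: \mathscr{H}\to \mathscr{H}$ is the largest factor by which $T$ stretches a unit vector (when $T$ is self-adjoint, this is the largest absolute value of an eigenvalue of $T$).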
$$
\|T\|=\max_{\|v\|=1} \|Tv\|
$$
We say $T$ is **bounded** if $\|T\|<\infty$.
We denote $B(\mathscr{H})$ the set of all bounded linear operators on $\mathscr{H}$.
### Partial trace
#### Definition of trace
Let $T$ be a linear operator on $\mathscr{H}$, $(e_1,e_2,\cdots,e_n)$ be a basis of $\mathscr{H}$ and $(\epsilon_1,\epsilon_2,\cdots,\epsilon_n)$ be the corresponding dual basis of $\mathscr{H}^*$ (so $\epsilon_i(e_j)=\delta_{ij}$). Then the trace of $T$ is defined by
$$
\operatorname{Tr}(T)=\sum_{i=1}^n \epsilon_i(T(e_i))
$$
When $(e_i)$ is orthonormal, $\epsilon_i=\langle e_i,\cdot\rangle$ and $\operatorname{Tr}(T)=\sum_{i=1}^n \langle e_i,T(e_i)\rangle$. This is the sum of the diagonal elements of the matrix of $T$ in the basis $(e_i)$.
> Note, I changed the order of the definitions for the trace to pack similar concepts together. Check the rest of the section defining the partial trace by viewing the [tensor product section](https://notenextra.trance-0.com/Math401/Math401_T2#tensor-products-of-finite-dimensional-hilbert-spaces) first, and return to this section after reading the tensor product of linear operators.
#### Definition of partial trace
Let $T$ be a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.
An operator $T$ on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$ can be written as (by the definition of [tensor product of linear operators](https://notenextra.trance-0.com/Math401/Math401_T2#tensor-products-of-linear-operators))
$$
T=\sum_{i=1}^n a_i A_i\otimes B_i
$$
where $A_i$ is a linear operator on $\mathscr{A}$ and $B_i$ is a linear operator on $\mathscr{B}$.
The $\mathscr{B}$-partial trace is the linear map $\operatorname{Tr}_{\mathscr{B}}:\mathcal{L}(\mathscr{A}\otimes \mathscr{B})\to \mathcal{L}(\mathscr{A})$ sending $T$ to the linear operator on $\mathscr{A}$ defined by
$$
\operatorname{Tr}_{\mathscr{B}}(T)=\sum_{i=1}^n a_i \operatorname{Tr}(B_i) A_i
$$
Or we can define the map $L_v: \mathscr{A}\to \mathscr{A}\otimes \mathscr{B}$ by
$$
L_v(u)=u\otimes v
$$
Note that $\langle u,L_v^*(u'\otimes v')\rangle=\langle L_v(u),u'\otimes v'\rangle=\langle u\otimes v,u'\otimes v'\rangle=\langle u,u'\rangle \langle v,v'\rangle$.
Therefore, $L_v^*\left(\sum_{j} u_j\otimes v_j\right)=\sum_{j} \langle v,v_j\rangle u_j$.
**Let $\{v_j\}$ be an orthonormal basis of $\mathscr{B}$.** Then the partial trace of $T$ can also be defined by
$$
\operatorname{Tr}_{\mathscr{B}}(T)=\sum_{j} L^*_{v_j}TL_{v_j}
$$
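In matrix terms the partial trace sums the diagonal blocks belonging to the factor that is traced out. The sketch below is one way to compute it under an assumed index convention (the basis vector $e_i\otimes f_k$ sits at position $i\cdot d_B+k$); the names `kron` and `partial_trace_B` and the sample matrices are hypothetical.

```python
def kron(A, B):
    """Matrix of A (x) B, with e_i (x) f_k stored at index i*dB + k."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def partial_trace_B(T, dA, dB):
    """Trace out the second factor of an operator T on C^dA (x) C^dB."""
    return [[sum(T[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]

# On a simple tensor A (x) B the partial trace is Tr(B) * A.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(partial_trace_B(kron(A, B), 2, 2))  # [[13, 26], [39, 52]]

# Density matrix of the state (|00> + |11>)/sqrt(2); its reduced state is I/2.
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
print(partial_trace_B(bell, 2, 2))  # [[0.5, 0], [0, 0.5]]
```

The second example is the standard illustration that a pure entangled state can have a maximally mixed reduced state.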
#### Definition of partial trace with respect to a state
Let $T$ be a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.
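Let $\rho$ be a state on $\mathscr{B}$ with spectral decomposition $\rho=\sum_j \lambda_j |v_j\rangle\langle v_j|$, where $\{v_j\}$ is an orthonormal basis of eigenvectors and $\{\lambda_j\}$ are the corresponding eigenvalues.
The partial trace of $T$ with respect to $\rho$ is the linear operator on $\mathscr{A}$ defined by
$$
\operatorname{Tr}_{\rho}(T)=\sum_{j} \lambda_j L^*_{v_j}TL_{v_j}=\operatorname{Tr}_{\mathscr{B}}\left(T(I_{\mathscr{A}}\otimes \rho)\right)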
$$
### Space of Bounded Linear Operators
> Recall the trace of a matrix is the sum of its diagonal elements.
#### Hilbert-Schmidt inner product
Let $T,S\in B(\mathscr{H})$. The Hilbert-Schmidt inner product of $T$ and $S$ is defined by
$$
\langle T,S\rangle=\operatorname{Tr}(T^*S)
$$
> Note here, $T^*$ is the complex conjugate transpose of $T$.
If we introduce an orthonormal basis $\{e_i\}$ in $\mathscr{H}$, then we can identify the space of bounded linear operators with the $n\times n$ complex-valued matrices $M_n(\mathbb{C})$.
For $T=(a_{ij})$, $S=(b_{ij})$, we have
$$
\operatorname{Tr}(T^*S)=\sum_{i=1}^n \sum_{j=1}^n \overline{a_{ij}}b_{ij}
$$
The inner product is the standard Hermitian inner product in $\mathbb{C}^{n\times n}$.
#### Definition of Hilbert-Schmidt norm (also called Frobenius norm)
The Hilbert-Schmidt norm of a linear operator $T: \mathscr{H}\to \mathscr{H}$ with matrix $(a_{ij})$ in an orthonormal basis is defined by
$$
\|T\|_{HS}=\sqrt{\operatorname{Tr}(T^*T)}=\sqrt{\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2}
$$
**[The trace of operator does not depend on the basis.](https://notenextra.trance-0.com/Math429/Math429_L38#theorem-850)**
### Tensor products of finite-dimensional Hilbert spaces
Let $X=X_1\times X_2\times \cdots \times X_n$ be a Cartesian product of $n$ finite sets.
Let $x=(x_1,x_2,\cdots,x_n)$ be an element of $X$, with $x_j\in X_j$ for $j=1,2,\cdots,n$.
Denote the space of all functions from $X$ to $\mathbb{C}$ by $\mathscr{H}$ and the space of all functions from $X_j$ to $\mathbb{C}$ by $\mathscr{H}_j$.
For each $j$ and each $a\in X_j$, define the indicator function $\epsilon_a^{(j)}\in \mathscr{H}_j$ by
$$
\epsilon_{a}^{(j)}(x_j)=\begin{cases}
1 & \text{if } x_j=a \\
0 & \text{if } x_j\neq a
\end{cases}
$$
Then we can define a basis of $\mathscr{H}_j$ by $\{\epsilon_{a}^{(j)}(x_j)\}_{a\in X_j}$.
_Any function $f:X_j\to \mathbb{C}$ can be written as a linear combination of the basis vectors._
$$
f(x_j)=\sum_{a\in X_j} f(a)\epsilon_{a}^{(j)}(x_j)
$$
<details>
<summary>Proof</summary>
Fix $x_j\in X_j$. For each $a\in X_j$, $\epsilon_{a}^{(j)}(x_j)=1$ if $a=x_j$ and $0$ otherwise, so only the term $a=x_j$ survives in the sum:
$$
f(x_j)=\sum_{a\in X_j} f(a)\epsilon_{a}^{(j)}(x_j)=f(x_j)
$$
QED.
</details>
Now, let $a=(a_1,a_2,\cdots,a_n)$ be a vector in $X$, and $x=(x_1,x_2,\cdots,x_n)$ be a vector in $X$. Note that $a_j,x_j\in X_j$ for $j=1,2,\cdots,n$.
Define
$$
\epsilon_a(x)=\prod_{j=1}^n \epsilon_{a_j}^{(j)}(x_j)=\begin{cases}
1 & \text{if } a_j=x_j \text{ for all } j=1,2,\cdots,n \\
0 & \text{otherwise}
\end{cases}
$$
Then we can define a basis of $\mathscr{H}$ by $\{\epsilon_a\}_{a\in X}$.
_Any function $f:X\to \mathbb{C}$ can be written as a linear combination of the basis vectors._
$$
f(x)=\sum_{a\in X} f(a)\epsilon_a(x)
$$
<details>
<summary>Proof</summary>
This follows by the same reasoning as the previous proof. This time, the epsilon function only returns $1$ when $x_j=a_j$ for all $j=1,2,\cdots,n$.
$$
f(x)=\sum_{a\in X} f(a)\epsilon_a(x)=f(x)
$$
QED.
</details>
#### Definition of tensor product of basis elements
**The tensor product of basis elements** is defined by
$$
\epsilon_a\coloneqq\epsilon_{a_1}^{(1)}\otimes \epsilon_{a_2}^{(2)}\otimes \cdots \otimes \epsilon_{a_n}^{(n)}
$$
This is a basis of $\mathscr{H}$, here $\mathscr{H}$ is the set of all functions from $X=X_1\times X_2\times \cdots \times X_n$ to $\mathbb{C}$.
#### Definition of tensor product of two finite-dimensional Hilbert spaces
Let $\mathscr{H}_1$ and $\mathscr{H}_2$ be two finite-dimensional Hilbert spaces, and let $u_1\in \mathscr{H}_1$ and $v_1\in \mathscr{H}_2$. **The tensor product of two finite-dimensional Hilbert spaces** is built from the tensor products of their vectors:
$$
u_1\otimes v_1
$$
is a bi-anti-linear map from $\mathscr{H}_1\times \mathscr{H}_2$ (the Cartesian product of $\mathscr{H}_1$ and $\mathscr{H}_2$, a tuple of two elements where first element is in $\mathscr{H}_1$ and second element is in $\mathscr{H}_2$) to $\mathbb{F}$ (in this case, $\mathbb{C}$). And $\forall u\in \mathscr{H}_1, v\in \mathscr{H}_2$,
$$
(u_1\otimes v_1)(u, v)=\langle u,u_1\rangle \langle v,v_1\rangle
$$
We call such forms **decomposable**. The tensor product of two finite-dimensional Hilbert spaces, denoted by $\mathscr{H}_1\otimes \mathscr{H}_2$, is the set of all linear combinations of decomposable forms, which act by
$$
\left(\sum_{i=1}^n a_i u_i\otimes v_i\right)(u, v) \coloneqq \sum_{i=1}^n a_i(u_i\otimes v_i)(u,v)=\sum_{i=1}^n a_i \langle u,u_i\rangle \langle v,v_i\rangle
$$
Note that $a_i\in \mathbb{C}$ for complex-vector spaces.
This is a linear space of dimension $\dim \mathscr{H}_1\cdot \dim \mathscr{H}_2$.
We define the inner product of two decomposable elements $u_1\otimes v_1, u_2\otimes v_2\in \mathscr{H}_1\otimes \mathscr{H}_2$ by
$$
\langle u_1\otimes v_1, u_2\otimes v_2\rangle\coloneqq\langle u_1,u_2\rangle \langle v_1,v_2\rangle=(u_2\otimes v_2)(u_1,v_1)
$$
### Tensor products of linear operators
Let $T_1$ be a linear operator on $\mathscr{H}_1$ and $T_2$ be a linear operator on $\mathscr{H}_2$, where $\mathscr{H}_1$ and $\mathscr{H}_2$ are finite-dimensional Hilbert spaces. The tensor product of $T_1$ and $T_2$ (denoted by $T_1\otimes T_2$) is the linear operator on $\mathscr{H}_1\otimes \mathscr{H}_2$ that is defined **on decomposable elements** by
$$
(T_1\otimes T_2)(v_1\otimes v_2)=T_1(v_1)\otimes T_2(v_2)
$$
for all $v_1\in \mathscr{H}_1$ and $v_2\in \mathscr{H}_2$.
It is extended to all of $\mathscr{H}_1\otimes \mathscr{H}_2$ by linearity:
$$
(T_1\otimes T_2)\left(\sum_{i=1}^n a_i u_i\otimes v_i\right)=\sum_{i=1}^n a_i T_1(u_i)\otimes T_2(v_i)
$$
for all $a_i\in \mathbb{C}$, $u_i\in \mathscr{H}_1$ and $v_i\in \mathscr{H}_2$.
Such tensor product of linear operators is well defined.
<details>
<summary>Proof</summary>
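Suppose $\sum_{i=1}^n a_i u_i\otimes v_i=\sum_{j=1}^m b_j u'_j\otimes v'_j$ as bi-anti-linear forms. For all $u\in \mathscr{H}_1$ and $v\in \mathscr{H}_2$, $\left(\sum_{i=1}^n a_i T_1(u_i)\otimes T_2(v_i)\right)(u,v)=\sum_{i=1}^n a_i\langle u,T_1u_i\rangle \langle v,T_2v_i\rangle=\sum_{i=1}^n a_i\langle T_1^*u,u_i\rangle \langle T_2^*v,v_i\rangle=\left(\sum_{i=1}^n a_i u_i\otimes v_i\right)(T_1^*u,T_2^*v)$, which depends only on the form $\sum_{i=1}^n a_i u_i\otimes v_i$ and not on the chosen representation.
Then $\sum_{i=1}^n a_i T_1(u_i)\otimes T_2(v_i)=\sum_{j=1}^m b_j T_1(u'_j)\otimes T_2(v'_j)$.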
QED
</details>
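In coordinates, the tensor product of operators is represented by the Kronecker product of their matrices, and the defining rule on decomposable elements can be checked directly. The sketch below assumes the index convention in which $e_i\otimes f_k$ is stored at position $i\cdot d_2+k$; the helper names and the sample matrices and vectors are illustrative only.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_vec(u, v):
    """Coordinates of u (x) v in the same index convention."""
    return [a * b for a in u for b in v]

def apply(M, x):
    """Matrix-vector product."""
    return [sum(m * c for m, c in zip(row, x)) for row in M]

T1 = [[1, 2], [0, 1]]
T2 = [[0, 1], [1, 0]]
v1 = [1, 1j]
v2 = [2, -1]

lhs = apply(kron(T1, T2), kron_vec(v1, v2))        # (T1 (x) T2)(v1 (x) v2)
rhs = kron_vec(apply(T1, v1), apply(T2, v2))       # T1(v1) (x) T2(v2)
print(lhs == rhs)  # True
```

The entries here are small integers and Gaussian integers, so the comparison is exact; with general floating-point data a tolerance would be needed.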
An example of
#### Extended Dirac notation
Suppose $\mathscr{H}=\mathbb{C}^n$ with the standard basis $\{e_i\}$.
$e_j=|j\rangle$ and
$$
|j_1\dots j_n\rangle=e_{j_1}\otimes e_{j_2}\otimes \cdots \otimes e_{j_n}=|j_1\rangle\otimes |j_2\rangle\otimes \cdots \otimes |j_n\rangle
$$
#### The Hadamard Transform
Let $\mathscr{H}=\mathbb{C}^2$ with the standard basis $\{e_1,e_2\}=\{|0\rangle,|1\rangle\}$.
The linear operator $H_2$ is defined by
$$
H_2=\frac{1}{\sqrt{2}}\begin{pmatrix}
1 & 1 \\
1 & -1
\end{pmatrix}=\frac{1}{\sqrt{2}}(|0\rangle\langle 0|+|1\rangle\langle 0|+|0\rangle\langle 1|-|1\rangle\langle 1|)
$$
The Hadamard transform is the linear operator $H_2$ on $\mathbb{C}^2$.
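The two expressions for $H_2$ (the matrix and the sum of outer products) can be compared numerically, together with the fact that $H_2$ is its own inverse. The following is a minimal sketch; the names `ket`, `outer`, `matmul`, `signs` and `H_outer` are introduced here for illustration.

```python
import math

ket = {0: [1, 0], 1: [0, 1]}   # |0> and |1> in the standard basis

def outer(u, v):
    """Matrix of |u><v| for real vectors u, v (no conjugation needed here)."""
    return [[a * b for b in v] for a in u]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

# Coefficient of |i><j| in the outer-product expansion of sqrt(2) * H_2.
signs = {(0, 0): 1, (1, 0): 1, (0, 1): 1, (1, 1): -1}
H_outer = [[0, 0], [0, 0]]
for (i, j), sign in signs.items():
    term = outer(ket[i], ket[j])
    H_outer = [[H_outer[r][c] + s * sign * term[r][c] for c in range(2)]
               for r in range(2)]

HH = matmul(H, H)   # should be the identity up to rounding
print(H_outer == H)  # True
```

Since $H_2$ is real and symmetric, $H_2^*=H_2$, so $H_2^2=I$ also shows that $H_2$ is unitary.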
### Singular value and Schmidt decomposition
#### Definition of SVD (Singular Value Decomposition)
Let $T:\mathscr{U}\to \mathscr{V}$ be a linear operator between two finite-dimensional Hilbert spaces $\mathscr{U}$ and $\mathscr{V}$.
We denote the inner product of $\mathscr{U}$ and $\mathscr{V}$ by $\langle \cdot, \cdot \rangle$.
Then there exists a decomposition of $T$
$$
T=d_1 T_1+d_2 T_2+\cdots +d_n T_n
$$
with $d_1>d_2>\cdots >d_n>0$ and $T_i:\mathscr{U}\to \mathscr{V}$ such that:
1. $T_iT_j^*=0$, $T_i^*T_j=0$ for $i\neq j$.
2. $T_i|_{\mathscr{R}(T_i^*)}:\mathscr{R}(T_i^*)\to \mathscr{R}(T_i)$ is an isomorphism with inverse $T_i^*$ where $\mathscr{R}(\cdot)$ is the range of the operator.
The $d_i$ are called the singular values of $T$.
[Gram-Schmidt Decomposition](https://notenextra.trance-0.com/Math429/Math429_L27#theorem-632-gram-schmidt)
## Basic Group Theory
### Finite groups
#### Definition of group
A group is a set $G$ with a binary operation $\cdot$ that satisfies the following axioms:
1. **Closure**: For all $a,b\in G$, $a\cdot b\in G$.
2. **Associativity**: For all $a,b,c\in G$, $(a\cdot b)\cdot c=a\cdot (b\cdot c)$.
3. **Identity**: There exists an element $e\in G$ such that for all $a\in G$, $a\cdot e=e\cdot a=a$.
4. **Inverses**: For all $a\in G$, there exists an element $b\in G$ such that $a\cdot b=b\cdot a=e$.
#### Symmetric group $S_n$
The symmetric group $S_n$ is the group of all permutations of $n$ elements.
$$
S_n=\{f: \{1,2,\cdots,n\}\to \{1,2,\cdots,n\} \text{ is a bijection}\}
$$
#### Unitary group $U(n)$
The unitary group $U(n)$ is the group of all $n\times n$ unitary matrices.
These are the matrices $A$ such that $A^*A=AA^*=I$, where $A^*$ is the complex conjugate transpose of $A$. $A^*=(\overline{A})^T$.
#### Cyclic group $\mathbb{Z}_n$
The cyclic group $\mathbb{Z}_n$ is the group of integers modulo $n$ under addition.
$$
\mathbb{Z}_n=\{0,1,2,\cdots,n-1\}
$$
#### Definition of group homomorphism
A group homomorphism is a function $\varPhi:G\to H$ between two groups $G$ and $H$ that satisfies the following axiom:
$$
\varPhi(a\cdot b)=\varPhi(a)\cdot \varPhi(b)
$$
A bijective group homomorphism is called group isomorphism.
#### Homomorphism sends identity to identity, inverses to inverses
Let $\varPhi:G\to H$ be a group homomorphism. $e_G$ and $e_H$ are the identity elements of $G$ and $H$ respectively. Then
1. $\varPhi(e_G)=e_H$
2. $\varPhi(a^{-1})=\varPhi(a)^{-1}$. $\forall a\in G$
### More on the symmetric group
#### General linear group over $\mathbb{C}$
The general linear group over $\mathbb{C}$ is the group of all $n\times n$ invertible complex matrices.
$$
GL(n,\mathbb{C})=\{A\in M_n(\mathbb{C}) \text{ is invertible}\}
$$
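Let $T: S_n\to GL(n,\mathbb{C})$ send a permutation $\sigma$ to its permutation matrix, defined on the standard basis by $T(\sigma)e_j=e_{\sigma(j)}$. The map $T$ is a group homomorphism.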
#### Definition of sign of a permutation
Let $T:S_n\to GL(n,\mathbb{C})$ be the group homomorphism. The sign of a permutation $\sigma\in S_n$ is defined by
$$
\operatorname{sgn}(\sigma)=\det(T(\sigma))
$$
We say $\sigma$ is even if $\operatorname{sgn}(\sigma)=1$ and odd if $\operatorname{sgn}(\sigma)=-1$.
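To see the definition in action, one can build permutation matrices and compare their determinants with the parity of the number of inversions. The sketch assumes that $T(\sigma)$ is the permutation matrix with $T(\sigma)e_j=e_{\sigma(j)}$ and indexes $\{1,\cdots,n\}$ from $0$; the function names are illustrative.

```python
from itertools import permutations

def perm_matrix(sigma):
    """Matrix sending e_j to e_{sigma(j)}; sigma is a tuple on {0, ..., n-1}."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def sign_by_inversions(sigma):
    n = len(sigma)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def compose(s, t):
    """(s o t)(j) = s(t(j))."""
    return tuple(s[t[j]] for j in range(len(t)))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

print(det(perm_matrix((1, 0, 2))))  # -1  (a transposition is odd)
print(det(perm_matrix((1, 2, 0))))  # 1   (a 3-cycle is even)
```

The helper `compose` together with `matmul` can be used to confirm the homomorphism property $T(\sigma\tau)=T(\sigma)T(\tau)$ on small examples.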
### Fourier Transform in $\mathbb{Z}_N$.
The vector space $L^2(\mathbb{Z}_N)$ is the set of all complex-valued functions on $\mathbb{Z}_N$ with the inner product
$$
\langle f,g\rangle=\sum_{k=0}^{N-1} \overline{f(k)}g(k)
$$
An orthonormal basis of $L^2(\mathbb{Z}_N)$ is given by $\delta_y,y\in \mathbb{Z}_N$.
$$
\delta_y(k)=\begin{cases}
1 & \text{if } k=y \\
0 & \text{otherwise}
\end{cases}
$$
in Dirac notation, we have
$$
\delta_y=|y\rangle=|y+N\rangle
$$
#### Definition of Fourier transform
Define $\varphi_k(x)=\frac{1}{\sqrt{N}}e^{2\pi i kx/N}$ for $k\in \mathbb{Z}_N$. $\varphi_k:\mathbb{Z}\to \mathbb{C}$ is a function.
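The Fourier transform is the linear operator $F$ on $L^2(\mathbb{Z}_N)$ defined by $(Ff)(k)=\langle \varphi_k,f\rangle$ for $f\in L^2(\mathbb{Z}_N)$. Since the inner product is conjugate linear in its first argument, in Dirac notation
$$
F=\frac{1}{\sqrt{N}}\sum_{j=0}^{N-1} \sum_{k=0}^{N-1} e^{-2\pi i kj/N}|k\rangle\langle j|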
$$
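The matrix of $F$ can be written down directly from $(Ff)(k)=\langle \varphi_k,f\rangle$ and checked for unitarity. The sketch below assumes the inner product is conjugate linear in its first argument, which gives the entries $e^{-2\pi i kj/N}/\sqrt{N}$; the names `fourier_matrix` and `gram` and the choice $N=4$ are illustrative.

```python
import cmath
import math

def fourier_matrix(N):
    """Entry (k, j) is conj(phi_k(j)) = exp(-2 pi i k j / N) / sqrt(N)."""
    return [[cmath.exp(-2j * math.pi * k * j / N) / math.sqrt(N)
             for j in range(N)] for k in range(N)]

def gram(F):
    """Entries of F^* F, which should be the identity when F is unitary."""
    n = len(F)
    return [[sum(F[k][i].conjugate() * F[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

N = 4
F = fourier_matrix(N)
G = gram(F)
max_error = max(abs(G[i][j] - (1 if i == j else 0))
                for i in range(N) for j in range(N))
print(max_error < 1e-12)  # True
```

Column $0$ of `F` is the transform of $\delta_0$, the constant function $1/\sqrt{N}$, which is a quick way to spot a wrong normalization.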
### Symmetric and anti-symmetric tensors
Let $\mathscr{H}^{\otimes n}$ be the $n$-fold tensor product of a Hilbert space $\mathscr{H}$.
We define the action of $S_n$ on $\mathscr{H}^{\otimes n}$ as follows.
Let $\eta\in S_n$ be a permutation. On decomposable tensors, set
$$
\prod(\eta)v_1\otimes v_2\otimes \cdots \otimes v_n=v_{\eta^{-1}(1)}\otimes v_{\eta^{-1}(2)}\otimes \cdots \otimes v_{\eta^{-1}(n)}
$$
And extend to $\mathscr{H}^{\otimes n}$ by linearity.
This action satisfies $\prod(\zeta\eta)=\prod(\zeta)\prod(\eta)$ for all $\zeta,\eta\in S_n$.
#### Definition of symmetric and anti-symmetric tensors
Let $\mathscr{H}$ be a finite-dimensional Hilbert space.
An element $\alpha\in \mathscr{H}^{\otimes n}$ is called symmetric if it is invariant under the action of $S_n$, that is,
$$\prod(\eta)\alpha=\alpha \text{ for all } \eta\in S_n.$$
It is called anti-symmetric if
$$
\prod(\eta)\alpha=\operatorname{sgn}(\eta)\alpha \text{ for all } \eta\in S_n.
$$

View File

@@ -0,0 +1,351 @@
# Math401 Topic 3: Separable Hilbert spaces
## Infinite-dimensional Hilbert spaces
Recall from Topic 1.
[$L^2$ space](https://notenextra.trance-0.com/Math401/Math401_T1#section-3-further-definitions-in-measure-theory-and-integration)
Let $\lambda$ be a measure on $\mathbb{R}$ (or on any other measurable space you are interested in).
A function is square integrable if
$$
\int_\mathbb{R} |f(x)|^2 d\lambda(x)<\infty
$$
### $L^2$ space and general Hilbert spaces
#### Definition of $L^2(\mathbb{R},\lambda)$
The space $L^2(\mathbb{R},\lambda)$ is the space of all square integrable, measurable functions on $\mathbb{R}$ with respect to the measure $\lambda$ (The Lebesgue measure).
The Hermitian inner product is defined by
$$
\langle f,g\rangle=\int_\mathbb{R} \overline{f(x)}g(x) d\lambda(x)
$$
The norm is defined by
$$
\|f\|=\sqrt{\int_\mathbb{R} |f(x)|^2 d\lambda(x)}
$$
The space $L^2(\mathbb{R},\lambda)$ is complete.
[Proof ignored here]
> Recall the definition of [complete metric space](https://notenextra.trance-0.com/Math4111/Math4111_L17#definition-312).
That is, every Cauchy sequence in the inner product space $L^2(\mathbb{R},\lambda)$ converges in the $L^2$ norm to an element of $L^2(\mathbb{R},\lambda)$.
> Note that **by some general result in point-set topology**, a normed vector space can always be enlarged so as to become complete. This process is called completion of the normed space.
>
> The following exercise hints at this result:
>
> Show that the subspace of $L^2(\mathbb{R},\lambda)$ consisting of square integrable continuous functions is not closed.
>
> Suggestion: consider the sequence of continuous functions $f_1(x), f_2(x),\cdots$, where $f_n(x)$ is defined by the following graph:
>
> ![function.png](https://notenextra.trance-0.com/Math401/L2_square_integrable_problem.png)
>
> Show that $f_n$ converges in the $L^2$ norm to a function $f\in L^2(\mathbb{R},\lambda)$ but the limit function $f$ is not continuous. Draw the graph of $f_n$ to make this clear.
#### Definition of general Hilbert space
A Hilbert space is a complete inner product vector space.
#### General Pythagorean theorem
Let $u_1,u_2,\cdots,u_N$ be an orthonormal set in an inner product space $\mathscr{V}$ (may not be complete). Then for all $v\in \mathscr{V}$,
$$
\|v\|^2=\sum_{i=1}^N |\langle u_i,v\rangle|^2+\left\|v-\sum_{i=1}^N \langle u_i,v\rangle u_i\right\|^2
$$
[Proof ignored here]
#### Bessel's inequality
Let $u_1,u_2,\cdots,u_N$ be an orthonormal set in an inner product space $\mathscr{V}$ (may not be complete). Then for all $v\in \mathscr{V}$,
$$
\sum_{i=1}^N |\langle v,u_i\rangle|^2\leq \|v\|^2
$$
Immediate from the general Pythagorean theorem.
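A small numerical example in $\mathbb{C}^3$ makes the two statements concrete. The sketch assumes the inner product is conjugate linear in its first argument, so the coefficient of $u_i$ in the projection is $\langle u_i,v\rangle$ (its modulus is the same as that of $\langle v,u_i\rangle$). The vectors and helper names are chosen for illustration.

```python
import math

def inner(u, v):
    """Hermitian inner product on C^n, conjugate linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

s = 1 / math.sqrt(2)
basis = [[1, 0, 0], [0, s, 1j * s]]   # an orthonormal set u_1, u_2 in C^3
v = [1 + 1j, 2, -1j]

coeffs = [inner(u, v) for u in basis]                 # <u_i, v>
bessel_sum = sum(abs(c) ** 2 for c in coeffs)         # sum of |<u_i, v>|^2
projection = [sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(3)]
residual = [a - p for a, p in zip(v, projection)]

# Pythagorean identity: the two pieces add up to ||v||^2 (here 7) up to rounding.
print(abs(bessel_sum + norm_sq(residual) - norm_sq(v)) < 1e-12)  # True
# Bessel's inequality follows because the residual term is nonnegative.
print(bessel_sum <= norm_sq(v))  # True
```

Here the orthonormal set is not a basis, so the residual is nonzero and the inequality is strict.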
### Orthonormal bases
An orthonormal subset $S$ of a Hilbert space $\mathscr{H}$ is a set all of whose elements have norm 1 and are mutually orthogonal. ($\forall u,v\in S$ with $u\neq v$, $\langle u,v\rangle=0$)
#### Definition of orthonormal basis
An orthonormal subset $S$ of a Hilbert space $\mathscr{H}$ is an orthonormal basis of $\mathscr{H}$ if there are no other orthonormal subsets of $\mathscr{H}$ that contain $S$ as a proper subset.
#### Theorem of existence of orthonormal basis
Every separable Hilbert space has an orthonormal basis.
[Proof ignored here]
#### Theorem of Fourier series
Let $\mathscr{H}$ be a separable Hilbert space with an orthonormal basis $\{e_n\}$. Then for any $f\in \mathscr{H}$,
$$
f=\sum_{n=1}^\infty \langle e_n,f\rangle e_n
$$
The series converges to $f$ in the norm of $\mathscr{H}$.
[Proof ignored here]
#### Fourier series in $L^2([0,2\pi],\lambda)$
Let $f\in L^2([0,2\pi],\lambda)$.
$$
f_N(x)=\sum_{n:|n|\leq N} c_n\frac{e^{inx}}{\sqrt{2\pi}}
$$
where $c_n=\left\langle \frac{e^{inx}}{\sqrt{2\pi}},f\right\rangle=\frac{1}{\sqrt{2\pi}}\int_0^{2\pi} f(x)e^{-inx} dx$.
The partial sums $f_N$ converge to $f$ in the $L^2([0,2\pi],\lambda)$ norm as $N\to \infty$.
This is the Fourier series of $f$.
#### Hermite polynomials
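Polynomials are not square integrable with respect to the Lebesgue measure, so a Gaussian weight is needed: the subspace spanned by the functions $x^ne^{-x^2/2}$, $n=0,1,2,\cdots$, is dense in $L^2(\mathbb{R},\lambda)$.
An orthonormal basis of $L^2(\mathbb{R},\lambda)$ can be obtained by applying the Gram-Schmidt process to $\{e^{-x^2/2},xe^{-x^2/2},x^2e^{-x^2/2},\cdots\}$.
The resulting functions have the form $h_n(x)e^{-x^2/2}$, where the polynomials $h_n$ are, up to normalization, the Hermite polynomials.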
### Isomorphism and $\ell_2$ space
#### Definition of isomorphic Hilbert spaces
Let $\mathscr{H}_1$ and $\mathscr{H}_2$ be two Hilbert spaces.
$\mathscr{H}_1$ and $\mathscr{H}_2$ are isomorphic if there exists a surjective linear map $U:\mathscr{H}_1\to \mathscr{H}_2$ that preserves the inner product (such a map is automatically injective, hence bijective).
$$
\langle Uf,Ug\rangle=\langle f,g\rangle
$$
for all $f,g\in \mathscr{H}_1$.
When $\mathscr{H}_1=\mathscr{H}_2$, the map $U$ is called unitary.
#### $\ell_2$ space
The space $\ell_2$ is the space of all square summable sequences.
$$
\ell_2=\left\{(a_n)_{n=1}^\infty: \sum_{n=1}^\infty |a_n|^2<\infty\right\}
$$
An example of an element of $\ell_2$ is $(1,0,0,\cdots)$.
The inner product on $\ell_2$ is
$$
\langle (a_n)_{n=1}^\infty, (b_n)_{n=1}^\infty\rangle=\sum_{n=1}^\infty \overline{a_n}b_n
$$
It is a Hilbert space (every Cauchy sequence in $\ell_2$ converges to some element in $\ell_2$).
### Bounded operators and continuity
Let $T:\mathscr{V}\to \mathscr{W}$ be a linear map between two vector spaces $\mathscr{V}$ and $\mathscr{W}$.
Assume $\mathscr{V}$ and $\mathscr{W}$ are normed spaces, and write $\|\cdot\|$ for the norm on each.
Then $T$ is continuous if for all $u\in \mathscr{V}$ and every sequence $u_n\to u$ in $\mathscr{V}$, we have $T(u_n)\to T(u)$ in $\mathscr{W}$.
In epsilon-delta language, $T$ is continuous if for all $\epsilon>0$, there exists a $\delta>0$ such that $\|u-v\|<\delta$ implies $\|T(u)-T(v)\|<\epsilon$. (For a linear map, continuity at a single point is equivalent to this uniform statement.)
#### Definition of bounded operator
A linear map $T:\mathscr{V}\to \mathscr{W}$ is bounded if
$$
\|T\|=\sup_{\|u\|=1}\|T(u)\|< \infty
$$
#### Theorem of continuity and boundedness
A linear map $T:\mathscr{V}\to \mathscr{W}$ is continuous if and only if it is bounded.
[Proof ignored here]
#### Definition of the space of bounded operators
The set of all bounded linear operators on $\mathscr{V}$ is denoted by $\mathscr{B}(\mathscr{V})$.
### Direct sum of Hilbert spaces
Suppose $\mathscr{H}_1$ and $\mathscr{H}_2$ are two Hilbert spaces.
The direct sum of $\mathscr{H}_1$ and $\mathscr{H}_2$ is the Hilbert space of all pairs $(u_1,u_2)$ with $u_1\in\mathscr{H}_1$ and $u_2\in\mathscr{H}_2$, equipped with the inner product
$$
\langle (u_1,u_2),(v_1,v_2)\rangle=\langle u_1,v_1\rangle_{\mathscr{H}_1}+\langle u_2,v_2\rangle_{\mathscr{H}_2}
$$
Such space is denoted by $\mathscr{H}_1\oplus \mathscr{H}_2$.
A countable direct sum of Hilbert spaces can be defined similarly, as long as the norms of the components are square summable.
That is, its elements are sequences $(u_n)_{n=1}^\infty$ with $u_n\in\mathscr{H}_n$ and $\sum_{n=1}^\infty \|u_n\|^2<\infty$.
The inner product in such countable direct sum is defined by
$$
\langle (u_n)_{n=1}^\infty, (v_n)_{n=1}^\infty\rangle=\sum_{n=1}^\infty \langle u_n,v_n\rangle_{\mathscr{H}_n}
$$
Such space is denoted by $\mathscr{H}=\bigoplus_{n=1}^\infty \mathscr{H}_n$.
### Closed subspaces of Hilbert spaces
#### Definition of closed subspace
A subspace $\mathscr{M}$ of a Hilbert space $\mathscr{H}$ is closed if the limit of every sequence in $\mathscr{M}$ that converges in $\mathscr{H}$ is again an element of $\mathscr{M}$.
#### Definition of pairwise orthogonal subspaces
Two subspaces $\mathscr{M}_1$ and $\mathscr{M}_2$ of a Hilbert space $\mathscr{H}$ are pairwise orthogonal if $\langle u,v\rangle=0$ for all $u\in \mathscr{M}_1$ and $v\in \mathscr{M}_2$.
### Orthogonal projections
#### Definition of orthogonal complement
The orthogonal complement of a subspace $\mathscr{M}$ of a Hilbert space $\mathscr{H}$ is the set of all elements in $\mathscr{H}$ that are orthogonal to every element in $\mathscr{M}$.
It is denoted by $\mathscr{M}^\perp=\{u\in \mathscr{H}: \langle u,v\rangle=0,\forall v\in \mathscr{M}\}$.
#### Projection theorem
Let $\mathscr{H}$ be a Hilbert space and $\mathscr{M}$ be a closed subspace of $\mathscr{H}$. Then any $v\in \mathscr{H}$ can be uniquely decomposed as $v=u+w$ where $u\in \mathscr{M}$ and $w\in \mathscr{M}^\perp$.
[Proof ignored here]
### Dual Hilbert spaces
#### Norm of linear functionals
Let $\mathscr{H}$ be a Hilbert space.
The norm of a linear functional $f\in \mathscr{H}^*$ is defined by
$$
\|f\|=\sup_{\|u\|=1}|f(u)|
$$
#### Definition of dual Hilbert space
The dual Hilbert space of $\mathscr{H}$ is the space of all bounded linear functionals on $\mathscr{H}$.
It is denoted by $\mathscr{H}^*$.
$$
\mathscr{H}^*=\mathscr{B}(\mathscr{H},\mathbb{C})=\{f: \mathscr{H}\to \mathbb{C}: f\text{ is linear and }\|f\|<\infty\}
$$
If $\mathscr{H}$ is a real Hilbert space, replace $\mathbb{C}$ by $\mathbb{R}$.
#### The Riesz lemma
For each $f\in \mathscr{H}^*$, there exists a unique $v_f\in \mathscr{H}$ such that $f(u)=\langle u,v_f\rangle$ for all $u\in \mathscr{H}$. And $\|f\|=\|v_f\|$.
[Proof ignored here]
#### Definition of bounded sesquilinear form
A bounded sesquilinear form on $\mathscr{H}$ is a function $B: \mathscr{H}\times \mathscr{H}\to \mathbb{C}$ satisfying
1. $B(u,av+bw)=aB(u,v)+bB(u,w)$ for all $u,v,w\in \mathscr{H}$ and $a,b\in \mathbb{C}$.
2. $B(av+bw,u)=\overline{a}B(v,u)+\overline{b}B(w,u)$ for all $u,v,w\in \mathscr{H}$ and $a,b\in \mathbb{C}$.
3. $|B(u,v)|\leq C\|u\|\|v\|$ for all $u,v\in \mathscr{H}$ and some constant $C>0$.
For every bounded sesquilinear form $B$, there exists a unique bounded linear operator $A\in \mathscr{B}(\mathscr{H})$ such that $B(u,v)=\langle Au,v\rangle$ for all $u,v\in \mathscr{H}$. The norm of $A$ is the smallest constant $C$ such that $|B(u,v)|\leq C\|u\|\|v\|$ for all $u,v\in \mathscr{H}$.
[Proof ignored here]
### The adjoint of a bounded operator
Let $A\in \mathscr{B}(\mathscr{H})$. Then $B(u,v)=\langle u,Av\rangle$ defines a bounded sesquilinear form $B: \mathscr{H}\times \mathscr{H}\to \mathbb{C}$, so there exists a unique bounded linear operator $A^*\in \mathscr{B}(\mathscr{H})$, the adjoint of $A$, such that $\langle u,Av\rangle=\langle A^*u,v\rangle$ for all $u,v\in \mathscr{H}$.
[Proof ignored here]
And $\|A^*\|=\|A\|$.
Additional properties of the adjoint:
Let $A,B\in \mathscr{B}(\mathscr{H})$ and $a,b\in \mathbb{C}$. Then
1. $(aA+bB)^*=\overline{a}A^*+\overline{b}B^*$.
2. $(AB)^*=B^*A^*$.
3. $(A^*)^*=A$.
4. $\|A^*\|=\|A\|$.
5. $\|A^*A\|=\|A\|^2$.
#### Definition of self-adjoint operator
An operator $A\in \mathscr{B}(\mathscr{H})$ is self-adjoint if $A^*=A$.
#### Definition of normal operator
An operator $N\in \mathscr{B}(\mathscr{H})$ is normal if $NN^*=N^*N$.
#### Definition of unitary operator
An operator $U\in \mathscr{B}(\mathscr{H})$ is unitary if $U^*U=UU^*=I$.
where $I$ is the identity operator on $\mathscr{H}$.
#### Definition of orthogonal projection
An operator $P\in \mathscr{B}(\mathscr{H})$ is an orthogonal projection if $P^*=P$ and $P^2=P$.
### Tensor product of (infinite-dimensional) Hilbert spaces
#### Definition of tensor product
Let $\mathscr{H}_1$ and $\mathscr{H}_2$ be two Hilbert spaces, and let $u_1\in \mathscr{H}_1$ and $u_2\in \mathscr{H}_2$. Then $u_1\otimes u_2$ is the conjugate bilinear functional on $\mathscr{H}_1\times \mathscr{H}_2$ defined by
$$
(u_1\otimes u_2)(v_1,v_2)=\langle u_1,v_1\rangle_{\mathscr{H}_1}\langle u_2,v_2\rangle_{\mathscr{H}_2}
$$
Let $\mathscr{V}$ be the set of all finite linear combinations of such conjugate bilinear functionals. We define the inner product on $\mathscr{V}$ by
$$
\langle u\otimes v,u'\otimes v'\rangle=\langle u,u'\rangle_{\mathscr{H}_1}\langle v,v'\rangle_{\mathscr{H}_2}
$$
The infinite-dimensional tensor product of $\mathscr{H}_1$ and $\mathscr{H}_2$ is the completion (extension of those bilinear functionals to make the set closed) of $\mathscr{V}$ with respect to the norm induced by the inner product.
Denoted by $\mathscr{H}_1\otimes \mathscr{H}_2$.
An orthonormal basis of $\mathscr{H}_1\otimes \mathscr{H}_2$ is $\{u_i\otimes v_j:i=1,2,\cdots,j=1,2,\cdots\}$, where $\{u_i\}$ is an orthonormal basis of $\mathscr{H}_1$ and $\{v_j\}$ is an orthonormal basis of $\mathscr{H}_2$.
### Fock space
#### Definition of Fock space
Let $\mathscr{H}^{\otimes n}$ be the $n$-fold tensor product of $\mathscr{H}$.
Set $\mathscr{H}^{\otimes 0}=\mathbb{C}$.
The Fock space of $\mathscr{H}$ is the direct sum of all $\mathscr{H}^{\otimes n}$.
$$
\mathscr{F}(\mathscr{H})=\bigoplus_{n=0}^\infty \mathscr{H}^{\otimes n}
$$
For example, if $\mathscr{H}=L^2(\mathbb{R},\lambda)$, then an element in $\mathscr{F}(\mathscr{H})$ is a sequence of functions $\psi=(\psi_0,\psi_1(x_1),\psi_2(x_1,x_2),\cdots)$ such that $|\psi_0|^2+\sum_{n=1}^\infty \int|\psi_n(x_1,\cdots,x_n)|^2dx_1\cdots dx_n<\infty$.

@@ -0,0 +1,461 @@
# Math401 Topic 4: The quantum version of probabilistic concepts
> In mathematics, one often speaks of non-commutative instead of quantum constructions.
**Note: in this section, terminology from physics and from mathematics is mixed. I will use _italic_ to denote the terminology used in physics. It is safe to ignore it if you only care about the mathematics.**
## Section 1: Generalities about classical versus quantum systems
In classical physics, we assume that a system's properties have well-defined values regardless of how we choose to measure them.
### Basic terminology
#### Set of states
The preparation of a system builds a convex set of states as our initial condition for the system.
Consider a collection of $N$ systems. Suppose $N_1=\lambda N$ of them are prepared by a procedure yielding state $P_1$, and $N_2=(1-\lambda) N$ of them by a procedure yielding state $P_2$, with $0\leq\lambda\leq 1$. The state of the collection is then the convex combination $P=\lambda P_1+(1-\lambda) P_2$.
#### Set of effects
The set of effects is built from the set of all possible outcomes of a measurement, $\Omega=\{\omega_1, \omega_2, \ldots, \omega_n\}$. Each $\omega_i$ has an associated effect $E_i$, which can be read as a yes/no question about the system. (For example, is outcome $\omega_i$ observed?)
#### Registration of outcomes
A pair of state and effect determines a probability $E_i(P)=p(\omega_i|P)$. By the law of large numbers, the relative frequency $N(\omega_i)/N$ of outcome $\omega_i$ converges to this probability as $N$ increases.
**Quantum states, _observables_ (random variables), and effects can be represented mathematically by linear operators on a Hilbert space.**
## Section 2: Examples of physical experiment in language of mathematics
### Stern-Gerlach experiment
_**State preparation:**_ Silver atoms are emitted from a thermal source and collimated to form a beam.
_**Measurement:**_ Silver atoms interact with the field produced by the magnet and impinge on the glass plate.
_**Registration:**_ The impression left on the glass plate by the condensed silver atoms.
## Section 3: Finite probability spaces in the language of Hilbert space and operators
> Superposition is a linear combination of two or more states.
A quantum coin can be represented mathematically by a linear combination of $|0\rangle$ and $|1\rangle$: $\alpha|0\rangle+\beta|1\rangle\in\mathscr{H}\cong\mathbb{C}^2$.
> For the rest of the material, we shall take $\mathscr{H}$ to be a vector space over $\mathbb{C}$.
### Definitions in classical probability under generalized probability theory
#### Definition of states (classical probability)
A state in classical probability is a probability distribution on the set of all possible outcomes. We can list as $(p_1,p_2,\cdots,p_n)$.
To each event $A\subseteq \Omega$, we associate the operator on $\mathscr{H}$ of multiplication by the indicator function, $P_A\coloneqq M_{\mathbb{I}_A}:f\mapsto \mathbb{I}_A f$, which projects onto the subspace of $\mathscr{H}$ corresponding to the event $A$.
$$
P_A=\sum_{k=1}^n a_k|k\rangle\langle k|
$$
where $a_k\in\{0,1\}$, and $a_k=1$ if and only if $k\in A$. Note that $P_A^*=P_A$ and $P_A^2=P_A$.
#### Definition of density operator (classical probability)
Let $(p_1,p_2,\cdots,p_n)$ be a probability distribution on $X$, where $p_k\geq 0$ and $\sum_{k=1}^n p_k=1$. The density operator $\rho$ is defined by
$$
\rho=\sum_{k=1}^n p_k|k\rangle\langle k|
$$
The probability of event $A$ relative to the probability distribution $(p_1,p_2,\cdots,p_n)$ becomes the trace of the product of $\rho$ and $P_A$.
$$
\operatorname{Prob}_\rho(A)\coloneqq\text{Tr}(\rho P_A)
$$
#### Definition of random variables (classical probability)
A random variable is a function $f:X\to\mathbb{R}$. It can also be written in operator form:
$$
F=\sum_{k=1}^n f(k)P_{\{k\}}
$$
The expectation of $f$ relative to the probability distribution $(p_1,p_2,\cdots,p_n)$ is given by
$$
\mathbb{E}_\rho(f)=\sum_{k=1}^n p_k f(k)=\operatorname{Tr}(\rho F)
$$
Note that the operators $F$, $P_A$, and $\rho$ defined above are all diagonal, so they commute among themselves. This does not hold in general in the non-commutative (_quantum_) theory.
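
To make the operator form of classical probability concrete, here is a small sketch for a fair six-sided die. The die, the event "even outcome", and the helper names `diag`, `matmul`, `trace` are illustrative assumptions, not part of the notes; exact fractions are used so the traces can be compared without rounding.

```python
from fractions import Fraction

n = 6                                   # outcomes 1..6 of a fair die (assumed example)
p = [Fraction(1, n)] * n                # probability distribution (p_1, ..., p_n)

def diag(entries):
    """Diagonal matrix, stored as a list of rows."""
    size = len(entries)
    return [[entries[i] if i == j else 0 for j in range(size)] for i in range(size)]

def matmul(a, b):
    """Plain matrix product of two square matrices of equal size."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

rho = diag(p)                                                      # density operator
P_even = diag([1 if k % 2 == 0 else 0 for k in range(1, n + 1)])   # event A = {2, 4, 6}
F = diag(list(range(1, n + 1)))                                    # random variable f(k) = k

prob_even = trace(matmul(rho, P_even))   # Prob_rho(A) = Tr(rho P_A)
mean = trace(matmul(rho, F))             # E_rho(f) = Tr(rho F)

# Diagonal operators commute, and P_A is a projection.
commute = matmul(rho, F) == matmul(F, rho)
is_projection = matmul(P_even, P_even) == P_even
```

Because every operator here is diagonal, the traces reduce to the usual sums $\sum_k p_k a_k$ and $\sum_k p_k f(k)$.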
## Section 4: Why we need generalized probability theory to study quantum systems
Story of light polarization and violation of Bell's inequality.
### Classical picture of light polarization and Bell's inequality
> An interesting story will be presented here.
#### Polarization of light
The light which comes through a polarizer is polarized in a certain direction. If we fix the first filter and rotate the second filter, we observe that the intensity of the transmitted light changes.
The light intensity decreases as $\alpha$ (the angle between the two filters) grows from $0$ to $\pi/2$, and the light vanishes when $\alpha=\pi/2$.
![Filter figure](https://notenextra.trance-0.com/Math401/Filter_figure.png)
Experimentally, the intensity of the light passing the first filter is half the beam intensity (assuming the original beam is completely unpolarized).
Then $I_1=I_0/2$, and
$$
I_2=I_1\cos^2\alpha=\frac{I_0}{2}\cos^2\alpha
$$
Claim: there exists a smallest package of monochromatic light, which is a photon.
We can model the behavior of each individual photon passing through the filter with direction $\alpha$ with random variable $P_\alpha$. The $P_\alpha(\omega)=1$ if the photon passes through the filter, and $P_\alpha(\omega)=0$ if the photon does not pass through the filter.
Then, the probability of the photon passing through the two filters with direction $\alpha$ and $\beta$ is given by
$$
\mathbb{E}(P_\alpha P_\beta)=\operatorname{Prob}(P_\alpha=1 \text{ and } P_\beta=1)=\frac{1}{2}\cos^2(\alpha-\beta)
$$
However, consider a system of 3 polarizing filters $F_1,F_2,F_3$ with directions $\alpha_1,\alpha_2,\alpha_3$. If we put them on the optical bench in pairs, then we obtain three random variables $P_1,P_2,P_3$.
#### Bell's 3 variable inequality
$$
\operatorname{Prob}(P_1=1,P_3=0)\leq \operatorname{Prob}(P_1=1,P_2=0)+\operatorname{Prob}(P_2=1,P_3=0)
$$
<details>
<summary>Proof</summary>
By the law of total probability, the event that the photon passes the first filter but not the third is the disjoint union of the event that it also fails the second filter and the event that it also passes the second filter. Since $\{P_1=1,P_2=0,P_3=0\}\subseteq\{P_1=1,P_2=0\}$ and $\{P_1=1,P_2=1,P_3=0\}\subseteq\{P_2=1,P_3=0\}$, monotonicity of probability gives
$$
\begin{aligned}
\operatorname{Prob}(P_1=1,P_3=0)&=\operatorname{Prob}(P_1=1,P_2=0,P_3=0)+\operatorname{Prob}(P_1=1,P_2=1,P_3=0)\\
&\leq\operatorname{Prob}(P_1=1,P_2=0)+\operatorname{Prob}(P_2=1,P_3=0)
\end{aligned}
$$
However, according to our experimental measurement, for any pair of polarizers $F_i,F_j$, by the complement rule, we have
$$
\begin{aligned}
\operatorname{Prob}(P_i=1,P_j=0)&=\operatorname{Prob}(P_i=1)-\operatorname{Prob}(P_i=1,P_j=1)\\
&=\frac{1}{2}-\frac{1}{2}\cos^2(\alpha_i-\alpha_j)\\
&=\frac{1}{2}\sin^2(\alpha_i-\alpha_j)
\end{aligned}
$$
This leads to a contradiction if we apply the inequality to the experimental data.
$$
\frac{1}{2}\sin^2(\alpha_1-\alpha_3)\leq\frac{1}{2}\sin^2(\alpha_1-\alpha_2)+\frac{1}{2}\sin^2(\alpha_2-\alpha_3)
$$
If $\alpha_1=0,\alpha_2=\frac{\pi}{6},\alpha_3=\frac{\pi}{3}$, then
$$
\begin{aligned}
\frac{1}{2}\sin^2(-\frac{\pi}{3})&\leq\frac{1}{2}\sin^2(-\frac{\pi}{6})+\frac{1}{2}\sin^2(\frac{\pi}{6}-\frac{\pi}{3})\\
\frac{3}{8}&\leq\frac{1}{8}+\frac{1}{8}\\
\frac{3}{8}&\leq\frac{1}{4}
\end{aligned}
$$
This is a contradiction, so Bell's inequality is violated.
QED
</details>
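
The violation can be reproduced with a few lines of arithmetic. This sketch plugs the measured pair probability $\frac{1}{2}\sin^2(\alpha_i-\alpha_j)$ into both sides of Bell's inequality for the angles used above; the function name `prob_pass_then_block` is a hypothetical helper.

```python
import math

def prob_pass_then_block(a_i, a_j):
    """Measured Prob(P_i = 1, P_j = 0) = (1/2) sin^2(a_i - a_j)."""
    return 0.5 * math.sin(a_i - a_j) ** 2

# Filter directions alpha_1, alpha_2, alpha_3 from the text.
a1, a2, a3 = 0.0, math.pi / 6, math.pi / 3

lhs = prob_pass_then_block(a1, a3)                                  # 3/8
rhs = prob_pass_then_block(a1, a2) + prob_pass_then_block(a2, a3)   # 1/8 + 1/8

# Bell's inequality demands lhs <= rhs; the measured values break it.
violated = lhs > rhs
```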
Other refined experiments (e.g. Aspect's experiment and the calcium entangled-photon experiment) have also been conducted, and the inequality is still violated.
#### The true model of light polarization
The full description of light polarization is given below:
State of polarization of a photon: $\psi=\alpha|0\rangle+\beta|1\rangle$, where $|0\rangle$ and $|1\rangle$ are the two orthogonal polarization states in $\mathbb{C}^2$.
Polarization filter (generalized 0,1 valued random variable): orthogonal projection $P_\alpha$ on $\mathbb{C}^2$ corresponding to the direction $\alpha$. (operator satisfies $P_\alpha^*=P_\alpha=P_\alpha^2$.)
The matrix representation of $P_\alpha$ is given by
$$
P_\alpha=\begin{pmatrix}
\cos^2(\alpha) & \cos(\alpha)\sin(\alpha)\\
\cos(\alpha)\sin(\alpha) & \sin^2(\alpha)
\end{pmatrix}
$$
Probability of a photon passing through the filter $P_\alpha$ is given by $\langle P_\alpha\psi,\psi\rangle$, this is $\cos^2(\alpha)$ if we set $\psi=|0\rangle$.
Since the projections $P_{\alpha_1},P_{\alpha_2},P_{\alpha_3}$ do not commute in general, the three filters cannot be modeled as random variables on one classical probability space, and it is impossible to discuss $\operatorname{Prob}(P_1=1,P_3=0)$ in the classical setting.
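
The properties of the filter projections can be verified directly. The sketch below builds $P_\alpha$ from the matrix above for two illustrative angles (an assumption made for the example) and checks idempotence, the passing probability for $\psi=|0\rangle$, and the failure of commutativity.

```python
import math

def projector(angle):
    """Matrix of P_alpha in the basis |0>, |1>."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c * c, c * s], [c * s, s * s]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

alpha, beta = math.pi / 6, math.pi / 3      # example directions
P_a, P_b = projector(alpha), projector(beta)

is_idempotent = close(matmul(P_a, P_a), P_a)    # P^2 = P
is_symmetric = P_a[0][1] == P_a[1][0]           # real symmetric, so P^* = P

# For psi = |0>, <P_alpha psi, psi> is the top-left entry, cos^2(alpha).
pass_prob = P_a[0][0]

# Projections for different directions do not commute in general.
commute = close(matmul(P_a, P_b), matmul(P_b, P_a))
```

With these angles `commute` is false, which is the algebraic reason no joint classical distribution for the filters exists.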
## Section 5: The non-commutative (_quantum_) probability theory
Let $\mathscr{H}$ be the Hilbert space of complex-valued functions on a finite set $\Omega=\{1,2,\cdots,n\}$, and let the functions $(e_1,e_2,\cdots,e_n)$ form an orthonormal basis of $\mathscr{H}$. We use Dirac notation $|k\rangle$ to denote the basis vector $e_k$.
The algebra of multiplication operators used in the classical setting is now replaced by the full space of bounded linear operators on $\mathscr{H}$ (denoted by $\mathscr{B}(\mathscr{H})$).
Let $\mathscr{F}$ be the set of all events in the classical probability setting, and let $A,B\in\mathscr{F}$. $X$ denotes the set of all possible outcomes.
> An orthogonal projection on a Hilbert space is an operator $P$ satisfying $P^*=P$ and $P^2=P$. We denote the set of all orthogonal projections on $\mathscr{H}$ by $\mathscr{P}$.
>
> This can be found in linear algebra. [Orthogonal projection](https://notenextra.trance-0.com/Math429/Math429_L28#definition-655)
Let $P,Q\in\mathscr{P}$ be events in the non-commutative (_quantum_) probability space. $R(\cdot)$ denotes the range of an operator, and $P^\perp=I-P$ is the projection onto the orthogonal complement of $R(P)$.
| Classical | Classical interpretation | Non-commutative (_Quantum_) | Non-commutative (_Quantum_) interpretation |
| --------- | ------- | -------- | -------- |
| $A\subset B$| Event $A$ implies event $B$ | $P\leq Q$| $R(P)\subseteq R(Q)$: the range of $P$ is a subspace of the range of $Q$ |
| $A\cap B$| Both events $A$ and $B$ happened | $P\land Q$| Projection onto $R(P)\cap R(Q)$ |
| $A\cup B$| At least one of the events $A$, $B$ happened | $P\lor Q$| Projection onto the closed linear span of $R(P)\cup R(Q)$ (the union itself is not a subspace) |
| $X\setminus A$ or $A^c$| Event $A$ did not happen | $P^\perp$| Projection onto $R(P)^\perp$, that is $P^\perp=I-P$ |
In such setting, some rules of classical probability theory are not valid in quantum probability theory.
In classical probability theory, $A\cap(B\cup C)=(A\cap B)\cup(A\cap C)$.
In quantum probability theory, $P\land(Q\lor R)\neq(P\land Q)\lor(P\land R)$ in general.
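
As a concrete instance of this failure, here is a small worked example in $\mathscr{H}=\mathbb{C}^2$. The specific projections are chosen for illustration and are not from the source; $\lor$ is read as the projection onto the linear span of the two ranges and $\land$ as the projection onto their intersection.

$$
P=|0\rangle\langle 0|,\quad Q=|+\rangle\langle +|,\quad R=|-\rangle\langle -|,\qquad |\pm\rangle=\frac{1}{\sqrt{2}}(|0\rangle\pm|1\rangle)
$$

Since $|+\rangle$ and $|-\rangle$ span $\mathbb{C}^2$, we have $Q\lor R=I$, so

$$
P\land(Q\lor R)=P\land I=P
$$

On the other hand, the range of $P$ meets the range of $Q$ (and the range of $R$) only in the zero vector, so

$$
(P\land Q)\lor(P\land R)=O\lor O=O\neq P
$$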
### Definitions of non-commutative (_quantum_) probability theory under generalized probability theory
#### Definition of states (non-commutative (_quantum_) probability theory)
A state on $(\mathscr{B}(\mathscr{H}),\mathscr{P})$ is a map $\mu:\mathscr{P}\to[0,1]$ such that:
1. $\mu(O)=0$, where $O$ is the zero projection, and $\mu(I)=1$, where $I$ is the identity.
2. If $P_1,P_2,\cdots,P_n$ are pairwise disjoint orthogonal projections, then $\mu(P_1\lor P_2\lor\cdots\lor P_n)=\sum_{i=1}^n\mu(P_i)$.
Here two projections $P_i,P_j$ ($i\neq j$) are disjoint if $P_iP_j=P_jP_i=O$.
#### Definition of density operator (non-commutative (_quantum_) probability theory)
A density operator $\rho$ on the finite-dimensional Hilbert space $\mathscr{H}$ is an operator that is:
1. self-adjoint ($\rho^*=\rho$, that is $\langle \rho x,y\rangle=\langle x,\rho y\rangle$ for all $x,y\in\mathscr{H}$)
2. positive semi-definite (all eigenvalues are non-negative)
3. of unit trace: $\operatorname{Tr}(\rho)=1$.
If $(|\psi_1\rangle,|\psi_2\rangle,\cdots,|\psi_n\rangle)$ is an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $\rho$ with eigenvalues $p_1,p_2,\cdots,p_n$, then $p_j\geq 0$ and $\sum_{j=1}^n p_j=1$.
We can write $\rho$ as
$$
\rho=\sum_{j=1}^n p_j|\psi_j\rangle\langle\psi_j|
$$
(under basis $|\psi_j\rangle$, it is a diagonal matrix with $p_j$ on the diagonal)
By the spectral theorem, every density operator on $\mathscr{H}$ can be written in this form.
#### Theorem: Born's rule
Let $\rho$ be a density operator on $\mathscr{H}$. Then
$$
\mu(P)\coloneqq\operatorname{Tr}(\rho P)=\sum_{j=1}^n p_j\langle\psi_j|P|\psi_j\rangle
$$
defines a probability measure on the space $\mathscr{P}$.
[Proof ignored here]
#### Theorem: Gleason's theorem (very important)
Let $\mathscr{H}$ be a Hilbert space over $\mathbb{C}$ or $\mathbb{R}$ of dimension $n\geq 3$. Let $\mu$ be a state on the space $\mathscr{P}$ of projections on $\mathscr{H}$. Then there exists a unique density operator $\rho$ such that
$$
\mu(P)=\operatorname{Tr}(\rho P)
$$
for all $P\in\mathscr{P}$. $\mathscr{P}$ is the space of all orthogonal projections on $\mathscr{H}$.
[Proof ignored here]
#### Definition of random variable _or Observables_ (non-commutative (_quantum_) probability theory)
_It is the physical measurement of a system that we are interested in. (kinetic energy, position, momentum, etc.)_
$\mathscr{B}(\mathbb{R})$ is the set of all Borel sets on $\mathbb{R}$.
A random variable on the Hilbert space $\mathscr{H}$ is a projection-valued map $P:\mathscr{B}(\mathbb{R})\to\mathscr{P}$
with the following properties:
1. $P(\emptyset)=O$ (the zero projection)
2. $P(\mathbb{R})=I$ (the identity projection)
3. For any $A,A_1,A_2,\cdots,A_n\in \mathscr{B}(\mathbb{R})$, the following hold:
(a) $P(\bigcup_{i=1}^n A_i)=\bigvee_{i=1}^n P(A_i)$
(b) $P(\bigcap_{i=1}^n A_i)=\bigwedge_{i=1}^n P(A_i)$
(c) $P(A^c)=I-P(A)$
(d) If the $A_j$ are mutually disjoint (so that $P(A_i)P(A_j)=P(A_j)P(A_i)=O$ for $i\neq j$), then $P(\bigcup_{j=1}^n A_j)=\sum_{j=1}^n P(A_j)$
#### Definition of probability of a random variable
For a system prepared in state $\rho$, the probability that the random variable described by the projection-valued measure $P$ takes a value in the Borel set $A$ is $\operatorname{Tr}(\rho P(A))$.
### Expectation of an random variable and projective measurement
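Let $T=\sum_{\lambda\in sp(T)}\lambda P_\lambda$ be a self-adjoint operator with eigenvalues $\lambda\in sp(T)$ and eigenprojections $P_\lambda$. In a projective measurement of $T$, the value $\lambda$ is _observed_ with probability $p_\lambda=\operatorname{Tr}(\rho P_\lambda)$. Notice that $\rho'\coloneqq\sum_{\lambda\in sp(T)}P_\lambda \rho P_\lambda$ is again a density operator.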
#### Definition of expectation of operators
Let $T$ be a self-adjoint operator on $\mathscr{H}$. The expectation of $T$ relative to the density operator $\rho$ is given by
$$
\mathbb{E}_\rho(T)=\operatorname{Tr}(\rho T)=\sum_{\lambda\in sp(T)}\lambda \operatorname{Tr}(\rho P(\lambda))
$$
Here $T=\sum_{\lambda\in sp(T)}\lambda P(\lambda)$ is the spectral decomposition of $T$, $sp(T)$ is the set of eigenvalues of $T$, and $P(\lambda)=P_\lambda$ is the orthogonal projection onto the eigenspace of $\lambda$.
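
The two expressions for the expectation can be compared on a qubit. This is a minimal sketch, assuming $T=\sigma_x$ (whose eigenprojections are known in closed form) and an arbitrarily chosen density matrix `rho`; neither is taken from the notes.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

T = [[0, 1], [1, 0]]                       # sigma_x, eigenvalues +1 and -1
spectrum = {
    +1: [[0.5, 0.5], [0.5, 0.5]],          # projection onto the +1 eigenspace
    -1: [[0.5, -0.5], [-0.5, 0.5]],        # projection onto the -1 eigenspace
}
rho = [[0.75, 0.25], [0.25, 0.25]]         # example density matrix (trace 1, positive)

# Outcome probabilities p_lambda = Tr(rho P_lambda).
probabilities = {lam: trace(matmul(rho, P)) for lam, P in spectrum.items()}

expectation_trace = trace(matmul(rho, T))                                 # Tr(rho T)
expectation_spectral = sum(lam * p for lam, p in probabilities.items())   # sum of lambda * p_lambda

# Non-selective projective measurement: rho' = sum over lambda of P_lambda rho P_lambda.
rho_post = [[0.0, 0.0], [0.0, 0.0]]
for P in spectrum.values():
    term = matmul(matmul(P, rho), P)
    rho_post = [[rho_post[i][j] + term[i][j] for j in range(2)] for i in range(2)]
```

The post-measurement matrix `rho_post` keeps trace 1, consistent with the remark that $\rho'$ is a density operator.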
### The uncertainty principle
Let $A,B$ be two self-adjoint operators on $\mathscr{H}$. Then we define the following two self-adjoint operators:
$$
i[A,B]\coloneqq i(AB-BA)
$$
$$
A\circ B\coloneqq \frac{AB+BA}{2}
$$
Note that $A\circ B$ satisfies Jordan's identity.
$$
(A\circ B)\circ (A\circ A)=A\circ (B\circ (A\circ A))
$$
#### Definition of variance
Given a state $\rho$, the variance of $A$ is given by
$$
\operatorname{Var}_\rho(A)\coloneqq\mathbb{E}_\rho(A^2)-\mathbb{E}_\rho(A)^2=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2
$$
#### Definition of covariance
Given a state $\rho$, the covariance of $A$ and $B$ is defined using the Jordan product $A\circ B$:
$$
\operatorname{Cov}_\rho(A,B)\coloneqq\mathbb{E}_\rho(A\circ B)-\mathbb{E}_\rho(A)\mathbb{E}_\rho(B)=\operatorname{Tr}(\rho A\circ B)-\operatorname{Tr}(\rho A)\operatorname{Tr}(\rho B)
$$
#### Robertson-Schrödinger uncertainty relation in finite dimensional Hilbert space
Let $\rho$ be a state on $\mathscr{H}$, $A,B$ be two self-adjoint operators on $\mathscr{H}$. Then
$$
\operatorname{Var}_\rho(A)\operatorname{Var}_\rho(B)\geq\operatorname{Cov}_\rho(A,B)^2+\frac{1}{4}|\mathbb{E}_\rho([A,B])|^2
$$
If $\rho$ is a pure state ($\rho=|\psi\rangle\langle\psi|$ for some unit vector $|\psi\rangle\in\mathscr{H}$) and equality holds, then the centered vectors $(A-\mathbb{E}_\rho(A))\psi$ and $(B-\mathbb{E}_\rho(B))\psi$ are collinear (i.e. one is a complex multiple of the other).
When $A$ and $B$ commute, the classical inequality is recovered (Cauchy-Schwarz inequality).
$$
\operatorname{Var}_\rho(A)\operatorname{Var}_\rho(B)\geq\operatorname{Cov}_\rho(A,B)^2
$$
[Proof ignored here]
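
A numerical sanity check of the Robertson-Schrödinger relation on a qubit: the sketch below assumes $A=\sigma_x$, $B=\sigma_y$ and the pure state $\rho=|0\rangle\langle 0|$, all chosen for illustration. For this choice both sides equal 1, so the bound is attained.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def expect(rho, a):
    """E_rho(A) = Tr(rho A)."""
    return trace(matmul(rho, a))

def variance(rho, a):
    return (expect(rho, matmul(a, a)) - expect(rho, a) ** 2).real

sx = [[0, 1], [1, 0]]          # Pauli sigma_x
sy = [[0, -1j], [1j, 0]]       # Pauli sigma_y
rho = [[1, 0], [0, 0]]         # pure state |0><0|

ab, ba = matmul(sx, sy), matmul(sy, sx)
jordan = [[(ab[i][j] + ba[i][j]) / 2 for j in range(2)] for i in range(2)]   # A o B
commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]     # [A, B]

cov = (expect(rho, jordan) - expect(rho, sx) * expect(rho, sy)).real

lhs = variance(rho, sx) * variance(rho, sy)
rhs = cov ** 2 + 0.25 * abs(expect(rho, commutator)) ** 2
```

Here the covariance term vanishes and the whole bound comes from the commutator $[\sigma_x,\sigma_y]=2i\sigma_z$, a purely non-commutative contribution.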
### The uncertainty relation for unbounded symmetric operators
#### Definition of symmetric operator
An operator $A$ with domain $\mathscr{D}(A)\subseteq\mathscr{H}$ is symmetric if for all $\phi,\psi\in\mathscr{D}(A)$, we have
$$
\langle A\phi,\psi\rangle=\langle\phi,A\psi\rangle
$$
An example of a symmetric operator is $T(\psi)=i\hbar\frac{d\psi}{dx}$ on $L^2(\mathbb{R},\lambda)$ with domain $\mathscr{D}(T)$, where $\hbar$ is the reduced Planck constant.
Here $\mathscr{D}(T)$ is the space of all square integrable, differentiable functions on $\mathbb{R}$ whose derivative is also square integrable.
#### Definition of joint domain of two operators
Let $(A,\mathscr{D}(A)),(B,\mathscr{D}(B))$ be two symmetric operators on their corresponding domains. The domain of $AB$ is defined as
$$
\mathscr{D}(AB)\coloneqq\{\psi\in\mathscr{D}(B):B\psi\in\mathscr{D}(A)\}
$$
Since $(AB)\psi=A(B\psi)$, the variance of an operator $A$ relative to a pure state $\rho=|\psi\rangle\langle\psi|$ is given by
$$
\operatorname{Var}_\rho(A)=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2=\langle\psi,A^2\psi\rangle-\langle\psi,A\psi\rangle^2
$$
If $A$ is symmetric, then $\operatorname{Var}_\rho(A)=\langle A\psi,A\psi\rangle-\langle \psi, A\psi\rangle^2$.
#### Robertson-Schrödinger uncertainty relation for unbounded symmetric operators
Let $(A,\mathscr{D}(A)),(B,\mathscr{D}(B))$ be two symmetric operators on their corresponding domains. Then
$$
\operatorname{Var}_\rho(A)\operatorname{Var}_\rho(B)\geq\operatorname{Cov}_\rho(A,B)^2+\frac{1}{4}|\mathbb{E}_\rho([A,B])|^2
$$
If $\rho$ is a pure state ($\rho=|\psi\rangle\langle\psi|$ for some unit vector $|\psi\rangle\in\mathscr{H}$) and equality holds, then the centered vectors $(A-\mathbb{E}_\rho(A))\psi$ and $(B-\mathbb{E}_\rho(B))\psi$ are collinear (i.e. one is a complex multiple of the other).
[Proof ignored here]
### Summary of analog of classical probability theory and non-commutative (_quantum_) probability theory
| Classical probability | Non-commutative (_Quantum_) probability |
| --------- | ------- |
| Sample space $\Omega$, cardinality $\vert\Omega\vert=n$, example: $\Omega=\{0,1\}$ | Complex Hilbert space $\mathscr{H}$, dimension $\dim\mathscr{H}=n$, example: $\mathscr{H}=\mathbb{C}^2$ |
|Commutative algebra of $\mathbb{C}$-valued functions| Algebra of bounded operators $\mathscr{B}(\mathscr{H})$|
|$f\mapsto \bar{f}$ complex conjugation| $P\mapsto P^*$ adjoint|
|Events: indicator functions of sets| Projections: space of orthogonal projections $\mathscr{P}\subseteq\mathscr{B}(\mathscr{H})$|
|functions $f$ such that $f^2=f=\overline{f}$| orthogonal projections $P$ such that $P^*=P=P^2$|
|$\mathbb{R}$-valued functions $f=\overline{f}$| self-adjoint operators $A=A^*$|
| $\mathbb{I}_{f^{-1}(\{\lambda\})}$ is the indicator function of the set $f^{-1}(\{\lambda\})$| $P(\lambda)$ is the orthogonal projection to eigenspace|
|$f=\sum_{\lambda\in \operatorname{Range}(f)}\lambda \mathbb{I}_{f^{-1}(\{\lambda\})}$|$A=\sum_{\lambda\in \operatorname{sp}(A)}\lambda P(\lambda)$|
|Probability measure $\mu$ on $\Omega$| Density operator $\rho$ on $\mathscr{H}$|
|Delta measure $\delta_\omega$| Pure state $\rho=\vert\psi\rangle\langle\psi\vert$|
|$\mu$ is non-negative measure and $\sum_{i=1}^n\mu(\{i\})=1$| $\rho$ is positive semi-definite and $\operatorname{Tr}(\rho)=1$|
|Expected value of random variable $f$ is $\mathbb{E}_{\mu}(f)=\sum_{i=1}^n f(i)\mu(\{i\})$| Expected value of operator $A$ is $\mathbb{E}_\rho(A)=\operatorname{Tr}(\rho A)$|
|Variance of random variable $f$ is $\operatorname{Var}_\mu(f)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))^2\mu(\{i\})$| Variance of operator $A$ is $\operatorname{Var}_\rho(A)=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2$|
|Covariance of random variables $f$ and $g$ is $\operatorname{Cov}_\mu(f,g)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))(g(i)-\mathbb{E}_\mu(g))\mu(\{i\})$| Covariance of operators $A$ and $B$ is $\operatorname{Cov}_\rho(A,B)=\operatorname{Tr}(\rho A\circ B)-\operatorname{Tr}(\rho A)\operatorname{Tr}(\rho B)$|
|Composite system is given by Cartesian product of the sample spaces $\Omega_1\times\Omega_2$| Composite system is given by tensor product of the Hilbert spaces $\mathscr{H}_1\otimes\mathscr{H}_2$|
|Product measure $\mu_1\times\mu_2$ on $\Omega_1\times\Omega_2$| Tensor product state $\rho_1\otimes\rho_2$ on $\mathscr{H}_1\otimes\mathscr{H}_2$|
|Marginal distribution $\pi_*v$|Partial trace $\operatorname{Tr}_2(\rho)$|
### States of two dimensional system and the complex projective space (_Bloch sphere_)
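Two unit vectors $u$ and $v=e^{i\theta}u$ define the same pure state, since $|v\rangle\langle v|=|u\rangle\langle u|$. Hence the space of pure states ($\rho=|u\rangle\langle u|$) is the complex projective space $\mathbb{C}P^1$.
Write $u=\alpha|0\rangle+\beta|1\rangle$ with $\alpha=x_1+iy_1,\beta=x_2+iy_2$. These must satisfy $|\alpha|^2+|\beta|^2=1$, that is $x_1^2+x_2^2+y_1^2+y_2^2=1$.
The set of unit vectors in $\mathbb{C}^2$ is therefore the unit sphere $S^3$ in $\mathbb{R}^4$.
After identifying vectors that differ by a phase $e^{i\theta}$, the space of pure states is the unit sphere $S^2$ in $\mathbb{R}^3$ (the _Bloch sphere_).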
#### Mapping between the space of pure states and the complex projective space
Every pure state of a two dimensional system is a rank-one projection $|u\rangle\langle u|$, where $u$ is a unit vector in $\mathbb{C}^2$ determined up to a phase $e^{i\theta}$. There exists a bijective map $P:S^2\to\mathscr{P}_1\subseteq M_2(\mathbb{C})$ from the unit sphere onto the set $\mathscr{P}_1$ of rank-one projections, given for $\vec{a}=(a_x,a_y,a_z)\in S^2$ by
$$
P(\vec{a})=\frac{1}{2}(I+\vec{a}\cdot\vec{\sigma})=\frac{1}{2}\begin{pmatrix}
1&0\\
0&1
\end{pmatrix}+\frac{a_x}{2}\begin{pmatrix}
0&1\\
1&0
\end{pmatrix}+\frac{a_y}{2}\begin{pmatrix}
0&-i\\
i&0
\end{pmatrix}+\frac{a_z}{2}\begin{pmatrix}
1&0\\
0&-1
\end{pmatrix}
$$

@@ -0,0 +1,203 @@
# Math401 Topic 5: Introducing dynamics: classical and non-commutative
## Section 1: Dynamics in classical probability
### Basic definitions
#### Definition of orbit
Let $T:\Omega\to\Omega$ be a map (may not be invertible) generating a dynamical system on $\Omega$. Given $\omega\in \Omega$, the (forward) orbit of $\omega$ is the set $\mathscr{O}(\omega)=\{T^n(\omega)\}_{n\geq 0}$.
The theory of dynamics is the study of properties of orbits.
#### Definition of measure-preserving map
Let $P$ be a probability measure on a $\sigma$-algebra $\mathscr{F}$ of subsets of $\Omega$ (that is, $P:\mathscr{F}\to[0,1]$). A measurable transformation $T:\Omega\to\Omega$ is said to be measure-preserving if for all integrable random variables $\psi:\Omega\to\mathbb{R}$, we have $\mathbb{E}(\psi\circ T)=\mathbb{E}(\psi)$, that is:
$$
\int_\Omega (\psi\circ T)(\omega)dP(\omega)=\int_\Omega \psi(\omega)dP(\omega)
$$
Example:
The doubling map $T:\Omega\to\Omega$, defined by $T(x)=2x \bmod 1$, is a Lebesgue measure-preserving map on $\Omega=[0,1)$.
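
The measure-preserving property of the doubling map can be illustrated on a finite grid. This sketch approximates Lebesgue measure by a uniform grid of midpoints and takes $\psi$ to be the indicator of $[a,b)$; the grid size and the interval are illustrative assumptions, picked as dyadic numbers so the floating-point arithmetic is exact.

```python
def doubling(x):
    """The doubling map T(x) = 2x mod 1."""
    return (2 * x) % 1

N = 1024                                    # a power of two keeps the arithmetic exact
grid = [(k + 0.5) / N for k in range(N)]    # uniform sample of [0, 1)

a, b = 0.25, 0.75                           # test set A = [a, b)

def indicator(x):
    """psi = indicator function of [a, b)."""
    return 1 if a <= x < b else 0

mean_psi = sum(indicator(x) for x in grid) / N                 # approximates E(psi)
mean_psi_T = sum(indicator(doubling(x)) for x in grid) / N     # approximates E(psi o T)
```

Both averages agree because the preimage $T^{-1}([a,b))=[a/2,b/2)\cup[(a+1)/2,(b+1)/2)$ has the same total length $b-a$ as the interval itself.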
#### Definition of isometry
An operator $U$ on $\mathscr{H}$ is an isometry if $\langle U\psi,U\phi\rangle=\langle\psi,\phi\rangle$ for all $\psi,\phi\in\mathscr{H}$. The composition operator $\psi\mapsto U\psi=\psi\circ T$ on $\mathscr{H}=L^2(\Omega,\mathscr{F},P)$, where $T$ is a measure-preserving map, is an isometry of $\mathscr{H}$.
#### Definition of unitary
An isometry $U$ of $\mathscr{H}$ is unitary if it is also surjective. The composition operator $\psi\mapsto U\psi=\psi\circ T$ on $\mathscr{H}=L^2(\Omega,\mathscr{F},P)$, where $T$ is a measure-preserving map, is unitary if $T$ is invertible with measurable inverse.
## Section 2: Continuous time (classical) dynamical systems
### Spring-mass system
![Spring-mass system](https://notenextra.trance-0.com/Math401/Spring-mass_system.png)
The pure state of the system is given by the position and velocity of the mass. $(x,v)$ is a point in $\mathbb{R}^2$. $\mathbb{R}^2$ is the state space of the system. (or phase space)
The motion of the system in its state space is a closed curve.
$$
\Phi_t(x,v)=\left(\cos(\omega t)x+\frac{1}{\omega}\sin(\omega t)v, \cos(\omega t)v-\omega\sin(\omega t)x\right)
$$
Such system with closed curve is called **integrable system**. Where the doubling map produces orbits having distinct dynamical properties (**chaotic system**).
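As a sanity check, here is a minimal sketch of the spring-mass flow, assuming the convention $\dot{x}=v$, $\dot{v}=-\omega^2x$ (so the position component is $\cos(\omega t)x+\frac{1}{\omega}\sin(\omega t)v$). The value $\omega=2$ and the initial point are arbitrary illustrative choices.

```python
import math

OMEGA = 2.0  # assumed angular frequency, for illustration only


def flow(t, x, v, omega=OMEGA):
    """Flow of x'' = -omega^2 x acting on the state (x, v)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (c * x + s * v / omega, c * v - omega * s * x)


def energy(x, v, omega=OMEGA):
    # Conserved quantity (per unit mass); its level sets are the closed orbits.
    return 0.5 * v * v + 0.5 * omega ** 2 * x * x


x0, v0 = 1.0, 0.5
x1, v1 = flow(0.3, x0, v0)
period = 2 * math.pi / OMEGA
xp, vp = flow(period, x0, v0)          # one full period returns to the start
xa, va = flow(0.7, *flow(0.3, x0, v0))  # group property: flow_0.7 after flow_0.3
xb, vb = flow(1.0, x0, v0)
```

The energy is constant along the orbit, the orbit closes after one period $2\pi/\omega$, and $\Phi_{s}\circ\Phi_{t}=\Phi_{s+t}$, which is the one-parameter group structure mentioned in the note that follows.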
> Note: some sections are intentionally skipped here. They show that, in the setting of operators on Hilbert spaces, the evolution of a classical non-dissipative system (e.g. the linear spring-mass system) is implemented by a one-parameter group of unitary operators.
>
> The detailed construction is omitted here.
#### Definition of Hermitian operator
A linear operator $A$ on a Hilbert space $\mathscr{H}$ is said to be Hermitian if $\forall \psi,\phi\in$ **domain of $A$**, we have $\langle A\psi,\phi\rangle=\langle\psi,A\phi\rangle$.
It is skew-Hermitian if $\langle A\psi,\phi\rangle=-\langle\psi,A\phi\rangle$.
## Section 3: Hamiltonians and the Schrödinger equation (finite dimensional version)
The problem of solving the Schrödinger equation is, at its core, about studying the spectral theory of the Hamiltonian operator.
### Dynamics in 2-dimensional (_2 level_) systems (qubit)
In previous sections, we saw that any self-adjoint $2\times 2$ matrix has the form $x_0I+\vec{x}\cdot\vec{\sigma}$, where $\vec{\sigma}=(\sigma_1,\sigma_2,\sigma_3)$ are the Pauli matrices
and $(x_0,\vec{x})$ is a point in $\mathbb{R}^4$.
The general form (time-independent) of the Hamiltonian for a 2-level system is:
$$
H=\begin{pmatrix}
x_0+x_3 & x_1-ix_2 \\
x_1+ix_2 & x_0-x_3
\end{pmatrix}
$$
To parameterize the curves in Bloch space generated by the Hamiltonian, drop the term $x_0I$ (it only shifts every energy level by $x_0$) and write $\vec{x}=\omega\hbar\vec{s}$ with $\omega>0$, where $\omega\hbar$ carries the physical dimension of energy.
We have:
$$
H=\omega\hbar\begin{pmatrix}
s_3 & s_1-is_2 \\
s_1+is_2 & -s_3
\end{pmatrix}
$$
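The Pauli parameterization can be checked with a few lines of code. This is a minimal sketch, assuming the standard Pauli matrices and natural units; the helper names and the sample vector $\vec{s}=(1,2,2)$ are illustrative. It builds $x_0I+\vec{x}\cdot\vec{\sigma}$ and confirms that the result is Hermitian with eigenvalues $x_0\pm|\vec{x}|$.

```python
import math


def pauli_combination(x0, x1, x2, x3):
    """Return the 2x2 matrix x0*I + x1*sigma_1 + x2*sigma_2 + x3*sigma_3."""
    return [[x0 + x3, x1 - 1j * x2],
            [x1 + 1j * x2, x0 - x3]]


def is_hermitian(m, tol=1e-12):
    return all(abs(m[i][j] - m[j][i].conjugate()) < tol
               for i in range(2) for j in range(2))


def eigenvalues(m):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return (tr / 2 - disc, tr / 2 + disc)


H = pauli_combination(0.0, 1.0, 2.0, 2.0)  # traceless case, |s| = 3
low, high = eigenvalues(H)
print(is_hermitian(H), low, high)
```

For the traceless Hamiltonian the two energy levels are $\pm\omega\hbar|\vec{s}|$, so only the length of $\vec{s}$ sets the level spacing.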
[Continue on the orbits of states in the Bloch sphere] skip for now.
## Section 4: Transition probability, probability amplitudes and the Born rule
The modulus squared of a probability amplitude is the probability of the corresponding outcome.
### Basic definitions in transition probability
#### Definition of probability amplitude
For an $n$-dimensional Hilbert space $\mathscr{H}$, suppose the system is initially in a pure state given by the unit vector $|\psi_0\rangle\in\mathscr{H}$, and thus has the density operator $\rho_0=|\psi_0\rangle\langle\psi_0|$.
Then the state at time $t_1$ is given by $|\psi_1\rangle=A|\psi_0\rangle$, where $A\in U(n)$ is a unitary operator.
Then the density operator at time $t_1$ is given by $\rho_1=|\psi_1\rangle\langle\psi_1|=A|\psi_0\rangle\langle\psi_0|A^*=A\rho_0A^*$.
The entries of $A$ are $a_{ij}=\langle i|A|j\rangle$, where $(|i\rangle)$ is an orthonormal basis of $\mathscr{H}$.
The $a_{ij}$ are the probability amplitudes of the transition from state $|j\rangle$ to state $|i\rangle$.
#### Definition of transition probability
Given the above, the transition probability from state $|j\rangle$ to state $|i\rangle$ is given by:
$$
|a_{ij}|^2
$$
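To make the definition concrete, here is a minimal sketch that takes the Hadamard matrix as the unitary $A$ (an illustrative choice, not one made in the notes) and tabulates the transition probabilities $|a_{ij}|^2$.

```python
import math

# Hadamard matrix as an example unitary A (illustrative choice).
h = 1 / math.sqrt(2)
A = [[h, h],
     [h, -h]]

# Transition probabilities |a_ij|^2 = |<i|A|j>|^2.
P = [[abs(A[i][j]) ** 2 for j in range(2)] for i in range(2)]

# Unitarity forces every row and every column of P to sum to 1.
row_sums = [sum(P[i][j] for j in range(2)) for i in range(2)]
column_sums = [sum(P[i][j] for i in range(2)) for j in range(2)]
print(P, row_sums, column_sums)
```

Every entry of `P` is $1/2$ here: starting from either basis state, both outcomes are equally likely.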
#### Sum over paths
To each path of classical states, path $j\to i: i_0=j,i_1,i_2,\cdots,i_l=i$, we associate the probability amplitude of the path, given by:
$$
|\text{path}(j\to i)\rangle=\langle i_0|i_1\rangle\langle i_1|i_2\rangle\cdots\langle i_{l-1}|i_l\rangle
$$
The transition probability is then obtained by summing the amplitudes of all paths and taking the modulus squared:
$$
\operatorname{Prob}(i|j)=\left|\sum_{\text{all paths } j\to i \text{ with } l \text{ steps}}|\text{path}(j\to i)\rangle\right|^2
$$
### Measuring a qubit
#### Definition of qubit
A qubit is a 2-level quantum system.
One example of qubit is the photon polarization.
#### Measurement of a qubit
The measurement of a qubit is a map from the space of density operators to a point $p$ in the interval $[0,1]$.
This point determines a probability distribution $(p,1-p)$ on the two outcomes, which lives in our classical probability space.
![Measurement of a qubit](https://notenextra.trance-0.com/Math401/Measurement_of_a_qubit.png)
Here $p=\cos^2(\theta)\in[0,1]$ is the probability of finding the qubit in the state $|0\rangle$.
The north pole of the Bloch sphere gives probability $1$ of finding the state $|0\rangle$.
The south pole of the Bloch sphere gives probability $1$ of finding the state $|1\rangle$.
The equator of the Bloch sphere gives probability $1/2$ for each of the states $|0\rangle$ and $|1\rangle$.
### Projective measurement of an $N$-qubit system
For $N$ qubits, the pure quantum state $\rho=|\psi\rangle\langle\psi|$ is represented by the state vector $|\psi\rangle\in\mathscr{H}^{\otimes N}=\mathscr{H}\otimes\cdots\otimes\mathscr{H}$ (with $\mathscr{H}=\mathbb{C}^2$).
This produces as output the random variable $X\in \{0,1\}^N$. $X=(a_1,a_2,\cdots,a_N)$, where $a_i\in \{0,1\}$.
By the Born rule,
$$
\operatorname{Prob}(X=(a_1,a_2,\cdots,a_N))=\left|\langle a_1a_2\cdots a_N|\psi\rangle\right|^2
$$
where $\langle a_1a_2\cdots a_N|\psi\rangle=\langle a_1|\otimes\langle a_2|\otimes\cdots\otimes\langle a_N|\psi\rangle$.
The input state vector $|\psi\rangle$ is a unit vector in $\mathscr{H}^{\otimes N}$.
It can be written as a linear combination of the tensor-product basis vectors:
$$
|\psi\rangle=\sum_{a_1,a_2,\cdots,a_N} c_{a_1,a_2,\cdots,a_N}|a_1a_2\cdots a_N\rangle
$$
where $c_{a_1,a_2,\cdots,a_N}\in\mathbb{C}$.
The probability distribution of the post-measurement **classical random variable** $X$ can be represented as a point in the $2^N-1$ dimensional simplex of all probability distributions on the set $\{0,1\}^N$.
$$
\mathscr{P}(\{0,1\}^N)=\left\{(p_1,p_2,\cdots,p_{2^N})\in\mathbb{R}^{2^N}:p_i\geq 0,\sum_{i=1}^{2^N}p_i=1\right\}
$$
![Simplex of all probability distributions on the set $\{0,1\}^N$](https://notenextra.trance-0.com/Math401/Simplex_of_all_probability_distributions_on_the_set_01N.png)
Here we use the binary representation for the index $i$ in the diagram.
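The Born rule for an $N$-qubit projective measurement can be sketched as follows. The function name `born_distribution` and the choice of the GHZ state as input are assumptions made for illustration; amplitudes are indexed by the binary number $a_1a_2\cdots a_N$.

```python
import math
from itertools import product


def born_distribution(amplitudes, n_qubits):
    """Map amplitudes c_{a_1...a_N} to probabilities |c_{a_1...a_N}|^2,
    keyed by the bit string a_1...a_N."""
    bitstrings = ["".join(bits) for bits in product("01", repeat=n_qubits)]
    return {b: abs(c) ** 2 for b, c in zip(bitstrings, amplitudes)}


# GHZ state (|000> + |111>)/sqrt(2), chosen only as an illustration.
N = 3
psi = [0.0] * 2 ** N
psi[0] = psi[-1] = 1 / math.sqrt(2)

dist = born_distribution(psi, N)
print(dist)
```

The resulting dictionary is a point of the simplex $\mathscr{P}(\{0,1\}^N)$: its $2^N$ entries are non-negative and sum to 1.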
#### Pure versus mixed states
A pure state is a state that is represented by a unit vector in $\mathscr{H}^{\otimes N}$.
A mixed state is a state that is represented by a density operator on $\mathscr{H}^{\otimes N}$ (a convex combination of pure states):
if $\rho_j=|\psi_j\rangle\langle\psi_j|$, then $\rho=\sum_{j=1}^m p_j\rho_j$ is a mixed state, where $p_j\geq 0$ and $\sum_{j=1}^m p_j=1$.
#### Projective measurement of subsystem and partial trace
This section is related to quantum random walk and we will skip it for now.
## Section 5: Quantum random walk
This part is skipped. It is an interesting topic, but it is not the focus of my research for now.

@@ -0,0 +1,639 @@
# Math401 Topic 6: Postulates of quantum theory and measurement operations
## Section 1: Postulates of quantum theory
This part is a review of quantum theory, so I will keep the content brief.
If you are familiar with the linear algebra defined before, you can jump right into this section and save time by reading the compact notation directly.
### Pure states
#### Pure state and mixed state
A pure state is a state that is represented by a unit vector in $\mathscr{H}^{\otimes N}$.
A mixed state is a state that is represented by a density operator on $\mathscr{H}^{\otimes N}$ (a convex combination of pure states):
if $\rho_j=|\psi_j\rangle\langle\psi_j|$, then $\rho=\sum_{j=1}^m p_j\rho_j$ is a mixed state, where $p_j\geq 0$ and $\sum_{j=1}^m p_j=1$.
#### Coset space
Two non-zero vectors $u,v\in \mathscr{H}$ are said to represent the same state if $u=cv$ for some non-zero complex number $c$.
This defines an equivalence relation $u\sim v$ on $\mathscr{H}\setminus\{0\}$, and the set of states of a quantum system is the corresponding **coset space**.
The coset space is called the projective space of $\mathscr{H}$, denoted by $P(\mathscr{H})\colon=(\mathscr{H}\setminus\{0\})/\sim$.
In particular, any vector of the form $e^{i\theta}|u\rangle$ for some $u\in \mathscr{H}$ and $\theta\in \mathbb{R}$ represents the same state as $|u\rangle$.
Example: a qubit has the Hilbert space $\mathbb{C}^2$, and the coset space $P(\mathbb{C}^2)\cong S^2$ is the Bloch sphere.
### Composite systems
#### Tensor product
The tensor product of two Hilbert spaces $\mathscr{H}_1$ and $\mathscr{H}_2$ is the Hilbert space $\mathscr{H}_1\otimes\mathscr{H}_2$ with the inner product $\langle u_1\otimes u_2,v_1\otimes v_2\rangle=\langle u_1,v_1\rangle\langle u_2,v_2\rangle$.
The tensor product of two vectors $u_1\in \mathscr{H}_1$ and $u_2\in \mathscr{H}_2$ is the vector $u_1\otimes u_2\in \mathscr{H}_1\otimes\mathscr{H}_2$.
#### Multipartite systems
In a multipartite quantum system, each part is associated with a Hilbert space $\mathscr{H}_i$. The total system is associated with the Hilbert space $\mathscr{H}=\mathscr{H}_1\otimes\mathscr{H}_2\otimes\cdots\otimes\mathscr{H}_n$.
A product state of the total system has the form $u_1\otimes u_2\otimes\cdots\otimes u_n$ for some $u_i\in \mathscr{H}_i$; a general pure state is a linear combination of such product vectors.
#### Entanglement (talk later)
A state $|\psi\rangle$ is entangled if it cannot be expressed as a product state $v_1\otimes v_2$ for any single-qubit states $|v_1\rangle$ and $|v_2\rangle$. In other words, an entangled state is non-separable.
Example: the Bell state $|\Phi^+\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$ is entangled.
Assume it can be written as $|\psi\rangle=|\psi_1\rangle\otimes|\psi_2\rangle$ where $|\psi_1\rangle=a|0\rangle+b|1\rangle$ and $|\psi_2\rangle=c|0\rangle+d|1\rangle$. Then:
$$
|\psi\rangle=ac|00\rangle+ad|01\rangle+bc|10\rangle+bd|11\rangle
$$
Setting this equal to $|\Phi^+\rangle$ gives:
$$
ac|00\rangle+ad|01\rangle+bc|10\rangle+bd|11\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)
$$
This requires:
$$
ac=bd=\frac{1}{\sqrt{2}}
$$
$$
ad=bc=0
$$
But $ad=0$ forces $a=0$ or $d=0$, hence $ac=0$ or $bd=0$. This is a contradiction, so $|\Phi^+\rangle$ is entangled.
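The same argument gives a quick computational test for two qubits: a vector $c_{00}|00\rangle+c_{01}|01\rangle+c_{10}|10\rangle+c_{11}|11\rangle$ is a product state exactly when $c_{00}c_{11}-c_{01}c_{10}=0$ (the coefficient matrix has rank 1). The sketch below uses a hypothetical helper `is_product_state`; the tolerance is an arbitrary choice.

```python
import math


def is_product_state(c00, c01, c10, c11, tol=1e-12):
    """A non-zero two-qubit vector is a product state iff c00*c11 - c01*c10 = 0."""
    return abs(c00 * c11 - c01 * c10) < tol


r = 1 / math.sqrt(2)
bell = (r, 0.0, 0.0, r)        # (|00> + |11>)/sqrt(2)
plus_zero = (r, 0.0, r, 0.0)   # ((|0> + |1>)/sqrt(2)) tensor |0>

print(is_product_state(*bell), is_product_state(*plus_zero))
```

For the Bell state the determinant is $1/2\neq 0$, so it is entangled, while the second vector factors as a tensor product.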
### Mixed states and density operators
#### Density operator
A density operator is a [Hermitian](https://notenextra.trance-0.com/Math401/Math401_T5#definition-of-hermitian-operator), positive semi-definite operator with trace 1.
The density operator of a pure state $|\psi\rangle$ is $\rho=|\psi\rangle\langle\psi|$.
A mixed state is described by unit vectors $u_1,u_2,\cdots,u_n$ in $\mathscr{H}$ together with probabilities $p_1,p_2,\cdots,p_n$, $p_i\geq 0$, such that $\sum_{i=1}^n p_i=1$.
Its density operator is $\rho=\sum_{i=1}^n p_i|u_i\rangle\langle u_i|$.
#### Trace 1 proposition
Density operators on a finite-dimensional Hilbert space $\mathscr{H}$ are positive operators with trace equal to 1.
#### Pure state lemma
A state $\rho$ is pure if and only if $\operatorname{Tr}(\rho^2)=1$.
For any state $\rho$ that is not pure, $\operatorname{Tr}(\rho^2)<1$.
[Proof ignored here]
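The purity criterion is easy to evaluate on small examples. This is a minimal sketch with matrices stored as nested lists; the two test states $|+\rangle\langle +|$ and $I/2$ are illustrative choices.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def purity(rho):
    """Return Tr(rho^2)."""
    sq = matmul(rho, rho)
    return sum(sq[i][i] for i in range(len(rho))).real


pure = [[0.5, 0.5], [0.5, 0.5]]    # |+><+|
mixed = [[0.5, 0.0], [0.0, 0.5]]   # I/2, the maximally mixed qubit state

print(purity(pure), purity(mixed))
```

The pure state gives $\operatorname{Tr}(\rho^2)=1$, and the maximally mixed qubit state gives $1/2$, the smallest possible value in dimension 2.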
#### Unitary freedom in the ensemble for density operators theorem
Let $v_1,v_2,\cdots,v_l$ and $w_1,w_2,\cdots,w_l$ be two collections of vectors in the finite dimensional Hilbert space $\mathscr{H}$, the vectors being arbitrary (can be zero) except for the requirement that they define the same density operator $\rho$.
$$
\sum_{i=1}^l |v_i\rangle\langle v_i|=\sum_{i=1}^l |w_i\rangle\langle w_i|
$$
Then there exists a unitary matrix $U=(\mu_{ij})_{1\leq i,j\leq l}$ such that:
$$
v_i=\sum_{j=1}^l \mu_{ij}w_j
$$
The converse is also true:
If $\rho$ is a density operator on $\mathscr{H}$ given by $\sum_{i=1}^l |w_i\rangle\langle w_i|$, $U=(\mu_{ij})$ is a unitary matrix, and the vectors $v_i$ are given by $v_i=\sum_{j=1}^l \mu_{ij}w_j$, then $\sum_{i=1}^l |v_i\rangle\langle v_i|=\rho$ as well.
[Proof ignored here]
### Density operator of subsystems
#### Partial trace for density operators
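Let $\rho$ be a density operator on $\mathscr{H}_1\otimes\mathscr{H}_2$. The partial trace of $\rho$ over $\mathscr{H}_2$ is the density operator on $\mathscr{H}_1$ (the reduced density operator for the subsystem $\mathscr{H}_1$) given by:
$$
\rho_1\coloneqq\operatorname{Tr}_2(\rho)=\sum_{j}(I\otimes\langle w_j|)\,\rho\,(I\otimes|w_j\rangle)
$$
where $(|w_j\rangle)$ is any orthonormal basis of $\mathscr{H}_2$. When $\rho=|u\rangle\langle u|$ is pure with Schmidt decomposition $|u\rangle=\sum_{k=1}^r \lambda_k|v_k\rangle\otimes|w_k\rangle$ (see the Schmidt decomposition theorem below), this reduces to $\rho_1=\sum_{k=1}^r \lambda_k^2|v_k\rangle\langle v_k|$.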
<details>
<summary>Examples</summary>
Let $|\psi\rangle=\frac{1}{\sqrt{2}}(|01\rangle+|10\rangle)$ be a unit vector in $\mathscr{H}=\mathbb{C}^2\otimes \mathbb{C}^2$, with density operator $\rho=|\psi\rangle\langle\psi|$.
Expand $\rho$ in the basis of $\mathbb{C}^2\otimes\mathbb{C}^2$ as a linear combination of basis operators:
$$
\rho=\frac{1}{2}(|01\rangle\langle 01|+|01\rangle\langle 10|+|10\rangle\langle 01|+|10\rangle\langle 10|)
$$
Note $\operatorname{Tr}_2(|ab\rangle\langle cd|)=|a\rangle\langle c|\cdot \langle d|b\rangle$.
Then the reduced density operator of the first qubit is (using $\langle 0|0\rangle=\langle 1|1\rangle=1$ and $\langle 0|1\rangle=\langle 1|0\rangle=0$):
$$
\begin{aligned}
\rho_1&=\operatorname{Tr}_2(\rho)\\
&=\frac{1}{2}(\langle 1|1\rangle |0\rangle\langle 0|+\langle 0|1\rangle |0\rangle\langle 1|+\langle 1|0\rangle |1\rangle\langle 0|+\langle 0|0\rangle |1\rangle\langle 1|)\\
&=\frac{1}{2}(|0\rangle\langle 0|+|1\rangle\langle 1|)\\
&=\frac{1}{2}I
\end{aligned}
$$
which is a mixed state.
</details>
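The example can be reproduced numerically. The following is a minimal sketch for two qubits only, assuming the basis order $|00\rangle,|01\rangle,|10\rangle,|11\rangle$; the helper names are hypothetical.

```python
import math


def outer(u, v):
    """|u><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in u]


def partial_trace_second(rho):
    """Trace out the second qubit of a 4x4 two-qubit density matrix.
    Basis order is |00>, |01>, |10>, |11>, i.e. index = 2*a + b."""
    return [[sum(rho[2 * a + b][2 * c + b] for b in range(2))
             for c in range(2)]
            for a in range(2)]


r = 1 / math.sqrt(2)
psi = [0.0, r, r, 0.0]                      # (|01> + |10>)/sqrt(2)
rho1 = partial_trace_second(outer(psi, psi))
print(rho1)
```

The output is (up to rounding) the matrix $\frac{1}{2}I$, matching the hand computation.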
#### Schmidt Decomposition theorem
Let $|u\rangle\in \mathscr{H}_1\otimes\mathscr{H}_2$ be a unit vector (pure state). Then there exist orthonormal bases $|v_i\rangle$ of $\mathscr{H}_1$ and $|w_j\rangle$ of $\mathscr{H}_2$ and numbers $\{\lambda_k\},k\leq r$, where $r$ is the Schmidt rank of $|u\rangle$, such that:
$$
|u\rangle=\sum_{k=1}^r \lambda_k|v_k\rangle\otimes|w_k\rangle
$$
where the $\lambda_k$ are **non-negative real numbers** such that $\sum_{k=1}^r \lambda_k^2=1$.
[Proof ignored here]
**Remark**: a non-zero vector $u\in \mathscr{H}_1\otimes\mathscr{H}_2$ decomposes as a tensor product $u=u_1\otimes u_2$ if and only if the Schmidt rank of $u$ is 1. **A state** that cannot be decomposed as a tensor product is called **entangled**.
#### Reduced density operator
For a pure state $\rho=|u\rangle\langle u|$ on $\mathscr{H}_1\otimes\mathscr{H}_2$, with $|u\rangle$ written in its Schmidt decomposition as above, the reduced density operator of the subsystem $\mathscr{H}_1$ is:
$$
\rho_1=\operatorname{Tr}_2(\rho)=\sum_{k=1}^r \lambda_k^2|v_k\rangle\langle v_k|
$$
Here $\rho$ is the density operator on $\mathscr{H}_1\otimes\mathscr{H}_2$.
Example:
Let $|\psi\rangle=\frac{1}{\sqrt{2}}(|01\rangle+|10\rangle)\in \mathbb{C}^2\otimes\mathbb{C}^2$ and $\rho=|\psi\rangle\langle\psi|$.
Expand $\rho$ in the basis of $\mathbb{C}^2\otimes\mathbb{C}^2$:
$$
\rho=\frac{1}{2}(|01\rangle\langle 01|+|01\rangle\langle 10|+|10\rangle\langle 01|+|10\rangle\langle 10|)
$$
then the reduced density operator of the subsystem $\mathbb{C}^2$ in first qubit is:
$$
\begin{aligned}
\rho_1&=\operatorname{Tr}_2(\rho)\\
&=\frac{1}{2}(\langle 1|1\rangle|0\rangle\langle 0|+\langle 1|0\rangle|0\rangle\langle 1|+\langle 0|1\rangle|1\rangle\langle 0|+\langle 0|0\rangle|1\rangle\langle 1|)\\
&=\frac{1}{2}(|0\rangle\langle 0|+|1\rangle\langle 1|)\\
&=\frac{1}{2}I
\end{aligned}
$$
### State purification
Every mixed state can be derived as the reduction of a pure state on an enlarged Hilbert space.
#### State purification theorem
Let $\rho$ be a mixed state in a finite dimensional Hilbert space $\mathscr{H}$, then there exists a unit vector $|w\rangle\in \mathscr{H}\otimes\mathscr{H}$ such that:
$$
\rho=\operatorname{Tr}_2(|w\rangle\langle w|)
$$
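Hint of proof:
Let $u_1,u_2,\cdots,u_d$ be an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $\rho$, with eigenvalues $p_i\geq 0$, $\sum_{i=1}^d p_i=1$. Then:
$$
\rho=\sum_{i=1}^d p_i|u_i\rangle\langle u_i|
$$
Let $w=\sum_{i=1}^d \sqrt{p_i}u_i\otimes u_i$. Then $w$ is a unit vector and $\operatorname{Tr}_2(|w\rangle\langle w|)=\sum_{i=1}^d p_i|u_i\rangle\langle u_i|=\rho$.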
### Observables
The observables in quantum theory are self-adjoint operators on the Hilbert space $\mathscr{H}$, denoted by $A\in \mathscr{O}$.
In finite dimensional Hilbert space, $A$ can be written as $\sum_{\lambda\in \operatorname{sp}{(A)}}\lambda P_\lambda$, where $P_\lambda$ is the projection operator onto the eigenspace of $A$ corresponding to the eigenvalue $\lambda$. $P_\lambda=P_\lambda^2=P_\lambda^*$.
### Effects and Busch's theorem for effect operators
Below is a section on Topic 4, about Gleason's theorem and definition of states, and Born's rule for describing the states using density operators.
#### Definition of states (non-commutative (_quantum_) probability theory)
> Do a double check on this section, this notation is slightly different from the one in Topic 4.
A state on $(\mathscr{B}(\mathscr{H}),\mathscr{P})$ is a map $\mu:\mathscr{P}\to[0,1]$ such that:
1. $0\leq \mu(E)\leq 1$ for all $E\in \mathscr{P}(\mathscr{H})$.
2. $\mu(I_{\mathscr{H}})=1$.
3. If $E_1,E_2,\cdots,E_n$ are pairwise disjoint orthogonal projections, whose sum is also in $\mathscr{P}(\mathscr{H})$ then $\mu(E_1\lor E_2\lor\cdots\lor E_n)=\sum_{i=1}^n\mu(E_i)$.
Where projections are disjoint if $P_iP_j=P_jP_i=O$.
#### Definition of density operator (non-commutative (_quantum_) probability theory)
A density operator $\rho$ on the finite-dimensional Hilbert space $\mathscr{H}$ is:
1. self-adjoint ($\rho^*=\rho$, that is $\langle \rho x,y\rangle=\langle x,\rho y\rangle$ for all $x,y\in\mathscr{H}$)
2. positive semi-definite (all eigenvalues are non-negative)
3. $\operatorname{Tr}(\rho)=1$.
If $(|\psi_1\rangle,|\psi_2\rangle,\cdots,|\psi_n\rangle)$ is an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $\rho$, with eigenvalues $p_1,p_2,\cdots,p_n$, then $p_j\geq 0$ and $\sum_{j=1}^n p_j=1$.
We can write $\rho$ as
$$
\rho=\sum_{j=1}^n p_j|\psi_j\rangle\langle\psi_j|
$$
(under basis $|\psi_j\rangle$, it is a diagonal matrix with $p_j$ on the diagonal)
Every density operator on $\mathscr{H}$ can be decomposed in this form.
#### Theorem: Born's rule
Let $\rho$ be a density operator on $\mathscr{H}$. Then
$$
\mu(P)\coloneqq\operatorname{Tr}(\rho P)=\sum_{j=1}^n p_j\langle\psi_j|P|\psi_j\rangle
$$
defines a probability measure on the space $\mathscr{P}$.
[Proof ignored here]
#### Theorem: Gleason's theorem (very important)
Let $\mathscr{H}$ be a Hilbert space over $\mathbb{C}$ or $\mathbb{R}$ of dimension $n\geq 3$. Let $\mu$ be a state on the space $\mathscr{P}(\mathscr{H})$ of projections on $\mathscr{H}$. Then there exists a unique density operator $\rho$ such that
$$
\mu(P)=\operatorname{Tr}(\rho P)
$$
for all $P\in\mathscr{P}(\mathscr{H})$. $\mathscr{P}(\mathscr{H})$ is the space of all orthogonal projections on $\mathscr{H}$.
[Proof ignored here]
In many experimental procedures in quantum physics, **the outcome probabilities are expectations of effects rather than of projections** (POVMs).
#### Definition of effect
An effect is a positive (self-adjoint) operator $E$ on $\mathscr{H}$ such that $0\leq E\leq I$.
The set of effects on $\mathscr{H}$ is denoted by $\mathscr{E}(\mathscr{H})$.
An operator $E$ is said to be an **extreme point** of the convex set $\mathscr{E}(\mathscr{H})$ if it cannot be written as a proper convex combination of two other effects.
That is, if $E$ is an extreme point, then $E=\lambda E_1+(1-\lambda)E_2$ for some $0<\lambda<1$ and $E_1,E_2\in \mathscr{E}(\mathscr{H})$ implies $E=E_1=E_2$.
#### Proposition: Effect operator lemma
The set of orthogonal projections on $\mathscr{H}$, $\mathscr{P}(\mathscr{H})$, is the set of extreme points of $\mathscr{E}(\mathscr{H})$.
#### Theorem: Generalized measures on effects
Let $\mathscr{H}$ be a finite-dimensional Hilbert space. Then any generalized probability measure
$$
\mu:E\in \mathscr{E}(\mathscr{H})\to \mu(E)\in[0,1]
$$
with the properties (same as the definition of states):
1. $0\leq \mu(E)\leq 1$ for all $E\in \mathscr{E}(\mathscr{H})$.
2. $\mu(I_{\mathscr{H}})=1$.
3. If $E_1,E_2,\cdots,E_n$ are effects whose sum $E_1+E_2+\cdots+E_n$ is also in $\mathscr{E}(\mathscr{H})$, then $\mu(E_1+E_2+\cdots+E_n)=\sum_{i=1}^n\mu(E_i)$.
is of the form:
$\mu(E)=\operatorname{Tr}(\rho E)$
for some density operator $\rho$ on $\mathscr{H}$.
[Proof ignored here]
> If $\mu$ is a positive linear functional on the space of self-adjoint operators on the finite-dimensional Hilbert space $\mathscr{H}$, normalized so that $\mu(I)=1$,
>
> then there exists a density operator $\rho$ on $\mathscr{H}$ such that $\mu(E)=\operatorname{Tr}(\rho E)$.
### Measurements
A measurement (observation) of a system prepared in a given state produces an outcome $x$, where $x$ is an element of the set $X$ of all possible outcomes.
To each $x\in X$, we associate a measurement operator $M_x$ on $\mathscr{H}$.
Given the initial state (pure state, unit vector) $u$, the probability of measurement outcome $x$ is given by:
$$
p(x)=\|M_xu\|^2
$$
After the measurement, the state of the system is given by:
$$
v=\frac{M_xu}{\|M_xu\|}
$$
Note that to make sense of this definition, the collection of measurement operators $\{M_x\}$ must satisfy the **completeness** requirement:
$$
1=\sum_{x\in X} p(x)=\sum_{x\in X}\|M_xu\|^2=\sum_{x\in X}\langle M_xu,M_xu\rangle=\langle u,(\sum_{x\in X}M_x^*M_x)u\rangle
$$
So $\sum_{x\in X}M_x^*M_x=I$.
An example of measurement is the projective measurements (von Neumann measurements).
It is given by the set of orthogonal projections $M_x$ on $\mathscr{H}$ with the property:
1. $M_x=M_x^*$
2. $M_xM_y=\delta_{xy}M_x$ for all $x,y\in X$
3. $\sum_{x\in X}M_x=I$
#### Composition of measurements
Given two complete collections of measurement operators $\{M_x\}$ and $\{N_y\}$ on $\mathscr{H}_1$ and $\mathscr{H}_2$ respectively, the composition of the two measurements is given by the collection of measurement operators $\{M_x\otimes N_y\}$ on $\mathscr{H}_1\otimes\mathscr{H}_2$.
#### Proposition of indistinguishability
Suppose that we have two states given by unit vectors $u_1,u_2\in \mathscr{H}$. The two states are distinguishable (with certainty) if and only if they are orthogonal.
Ways to distinguish the two states:
1. set $X=\{0,1,2\}$ and $M_i=|u_i\rangle\langle u_i|$, $M_0=I-M_1-M_2$
2. then $\{M_0,M_1,M_2\}$ is a complete collection of measurement operators on $\mathscr{H}$.
3. suppose the prepared state is $u_1$, then $p(1)=\|M_1u_1\|^2=\|u_1\|^2=1$, $p(2)=\|M_2u_1\|^2=0$, $p(0)=\|M_0u_1\|^2=0$.
If they are not orthogonal, then there is no choice of measurement operators that distinguishes the two states with certainty.
[Proof ignored here]
_intuitively, if the two states are not orthogonal, then for any measurement there exists non-zero probability of getting the same outcome for both states._
#### Effects and POVM measurements
An effect on the finite-dimensional Hilbert space $\mathscr{H}$ is a positive operator $E$ on $\mathscr{H}$ such that $0\leq E\leq I$. A positive operator valued measure (POVM) consists of an index set $\mathscr{I}$ and a collection of effects $\{E_i,i\in \mathscr{I}\}$ satisfying the identity $\sum_{i\in \mathscr{I}}E_i=I$.
The probability of measurement outcome $i\in \mathscr{I}$ on a system prepared in the state described by the unit vector $v$ is given by $p(i)=\langle v,E_iv\rangle$.
For a mixed state $\rho$, the probability of measurement outcome $i\in \mathscr{I}$ is given by $p(i)=\operatorname{Tr}(\rho E_i)$.
Example, suppose we have a system prepared in the following two states:
$$
u_1=|0\rangle, u_2=\frac{1}{\sqrt{2}}(|0\rangle+|1\rangle)
$$
Since they are not orthogonal, there is no measurement that can distinguish the two states with certainty.
Consider the following POVM:
$$
E_1=\frac{\sqrt{2}}{1+\sqrt{2}}|1\rangle \langle 1|, E_2=\frac{\sqrt{2}}{1+\sqrt{2}}\frac{(|0\rangle-|1\rangle)(\langle 0|-\langle 1|)}{2},E_3=I-E_1-E_2
$$
Now suppose the system is in an unknown state $u\in\{u_1,u_2\}$. If the state is $u_1$, the probability of measurement outcome $1$ is:
$$
p(1)=\langle u_1,E_1u_1\rangle=0
$$
So if the measurement outcome is $1$, we can conclude that the state is $u_2$.
Similarly, if the state is $u_2$, the probability of measurement outcome $2$ is:
$$
p(2)=\langle u_2,E_2u_2\rangle=0
$$
So if the measurement outcome is $2$, we can conclude that the state is $u_1$.
If the measurement outcome is $3$, then we cannot conclude anything about the state.
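The outcome probabilities of this POVM can be verified directly. This is a minimal sketch using real $2\times 2$ matrices as nested lists; the helper `expectation` is a hypothetical name, and the effects are entered from the formulas above.

```python
import math


def expectation(E, v):
    """<v, E v> for a real 2x2 matrix E and a real vector v."""
    Ev = [sum(E[i][j] * v[j] for j in range(2)) for i in range(2)]
    return sum(v[i] * Ev[i] for i in range(2))


c = math.sqrt(2) / (1 + math.sqrt(2))
E1 = [[0.0, 0.0], [0.0, c]]                    # c |1><1|
E2 = [[c / 2, -c / 2], [-c / 2, c / 2]]        # c (|0>-|1>)(<0|-<1|)/2
identity = [[1.0, 0.0], [0.0, 1.0]]
E3 = [[identity[i][j] - E1[i][j] - E2[i][j] for j in range(2)]
      for i in range(2)]

u1 = [1.0, 0.0]
u2 = [1 / math.sqrt(2), 1 / math.sqrt(2)]

probs_u1 = [expectation(E, u1) for E in (E1, E2, E3)]
probs_u2 = [expectation(E, u2) for E in (E1, E2, E3)]
print(probs_u1, probs_u2)
```

For either input the inconclusive outcome $3$ occurs with probability $1/\sqrt{2}\approx 0.71$, and a conclusive outcome occurs with probability $1-1/\sqrt{2}\approx 0.29$. Outcome $1$ never occurs for $u_1$ and outcome $2$ never occurs for $u_2$, so a conclusive answer is never wrong.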
#### Proposition: Ancilla system
A general measurement of a system having Hilbert space $\mathscr{H}$ is equivalent to a projective measurement composed with a unitary transformation on the Hilbert space $\mathscr{H}\otimes\mathscr{A}$ of a composite system. The system described by $\mathscr{A}$ is called the ancilla system. This equivalent measurement is not unique.
[Further details ignored here]
### Quantum operations and CPTP maps
$L^1(\Omega,\mathscr{F},\mu)$ is the space of integrable functions on $\Omega$, that is, functions $f$ with $\int_{\Omega} |f(\omega)| d\mu(\omega)<\infty$ for a measure $\mu$ on $\Omega$.
By analogy, we define $\mathscr{L}_1(\mathscr{H})$, the space of trace class operators on $\mathscr{H}$, as the space of operators $A$ such that $\operatorname{Tr}(\sqrt{A^*A})<\infty$.
$L^2(\Omega,\mathscr{F},\mu)$ is the space of square integrable functions on $\Omega$, that is, functions $f$ with $\int_{\Omega} |f(\omega)|^2 d\mu(\omega)<\infty$ for a measure $\mu$ on $\Omega$.
By analogy, we define $\mathscr{L}_2(\mathscr{H})$, the space of Hilbert-Schmidt operators on $\mathscr{H}$, as the space of operators $A$ such that $\operatorname{Tr}(A^*A)<\infty$.
The space $\mathscr{L}_2(\mathscr{H})$ is a Hilbert space equipped with the inner product $\langle A,B\rangle=\operatorname{Tr}(B^*A)$.
It satisfies the Cauchy-Schwarz inequality:
$$
|\operatorname{Tr}(A^*B)|\leq \operatorname{Tr}(A^*A)^{1/2}\operatorname{Tr}(B^*B)^{1/2}
$$
The space of density operators $\mathscr{S}(\mathscr{H})$ is the subset of $\mathscr{L}_1(\mathscr{H})$ consisting of positive operators with trace $1$. It is convex: for $\rho_1,\rho_2\in \mathscr{S}(\mathscr{H})$ and $\lambda\in[0,1]$, $\lambda\rho_1+(1-\lambda)\rho_2\in \mathscr{S}(\mathscr{H})$.
#### Definition of CPTP map
A completely positive trace preserving (CPTP) map is a linear map $\mathscr{E}:\mathscr{L}_1(\mathscr{H})\to \mathscr{L}_1(\mathscr{H})$ such that:
1. $\mathscr{E}$ is trace preserving, that is $\operatorname{Tr}(\mathscr{E}(\rho))=\operatorname{Tr}(\rho)$ for all $\rho\in \mathscr{L}_1(\mathscr{H})$.
2. $\mathscr{E}$ is completely positive, that is $\mathscr{E}\otimes I_{\mathscr{K}}:\mathscr{L}_1(\mathscr{H}\otimes\mathscr{K})\to\mathscr{L}_1(\mathscr{H}\otimes\mathscr{K})$ is positive for every finite-dimensional or separable Hilbert space $\mathscr{K}$, where $I_{\mathscr{K}}$ is the identity map on $\mathscr{L}_1(\mathscr{K})$.
_Note that the condition for completely positive is stronger than the condition for positive. If we only require the map to be positive, then its extension $\mathscr{E}\otimes I_{\mathscr{K}}$ may send some entangled states to operators with negative eigenvalues._
Example:
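The transpose map $\mathscr{E}:\mathscr{L}_1(\mathscr{H})\to \mathscr{L}_1(\mathscr{H})$ with respect to the basis $(|i\rangle)$ is given by:
$$
\mathscr{E}:\sum_{i,j} \alpha_{ij}|i\rangle\langle j|\mapsto \sum_{i,j} \alpha_{ij}|j\rangle\langle i|
$$
On self-adjoint operators it coincides with entrywise complex conjugation $\alpha_{ij}\mapsto\overline{\alpha_{ij}}$. This map is positive, but it is not completely positive: applied to one qubit only, it sends the entangled state
$$
\rho=|\phi\rangle\langle\phi|
$$
to an operator with a negative eigenvalue,
where $|\phi\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$.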
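The failure of complete positivity can be seen with the transpose map, which agrees with entrywise complex conjugation on self-adjoint matrices. The sketch below applies the transpose to the second qubit only and evaluates the result on one test vector; the helper names and the choice of test vector $(|01\rangle-|10\rangle)/\sqrt{2}$ are assumptions made for illustration.

```python
def partial_transpose_second(rho):
    """Apply the transpose to the second qubit only: swap b and d in
    rho[(a, b), (c, d)], with basis index 2*a + b."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    out[2 * a + d][2 * c + b] = rho[2 * a + b][2 * c + d]
    return out


def quadratic_form(m, v):
    """<v, m v> for a real matrix m and a real vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(4) for j in range(4))


# rho = |phi><phi| with |phi> = (|00> + |11>)/sqrt(2).
rho = [[0.5, 0.0, 0.0, 0.5],
       [0.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 0.0, 0.0],
       [0.5, 0.0, 0.0, 0.5]]
rho_pt = partial_transpose_second(rho)

# Test vector (|01> - |10>)/sqrt(2); a negative value shows rho_pt is not positive.
s = 2 ** -0.5
value = quadratic_form(rho_pt, [0.0, s, -s, 0.0])
print(value)
```

The value is $-1/2$, so the partially transposed operator has a negative eigenvalue and is not a density operator, even though the transpose on a single system maps density operators to density operators.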
#### Definition of quantum channel
Let $\mathscr{H}$ and $\mathscr{K}$ be Hilbert spaces, $U$ be a unitary operator on $\mathscr{H}\otimes\mathscr{K}$, and $\omega$ be a density operator on $\mathscr{K}$. The CPTP map
$$
\mathscr{E}:T\in \mathscr{L}_1(\mathscr{H})\to \operatorname{Tr}_\mathscr{K}(U (T\otimes \omega)U^*)
$$
is a quantum channel.
We skipped a few exercises here and jumped right to the definition.
In short, the quantum channel describes the following process:
Initialization: The ancilla $\mathscr{K}$ is prepared in a fixed state $\omega$ (density operator).
Coupling: The input state $T$ (on $\mathscr{H}$) is combined with $\omega$ to form $T\otimes\omega$ on $\mathscr{H}\otimes\mathscr{K}$.
Unitary evolution: The joint system evolves under $U$ (unitary on $\mathscr{H}\otimes\mathscr{K}$).
Discarding ancilla: The ancilla $\mathscr{K}$ is traced out, leaving a state on $\mathscr{H}$.
This is a Stinespring dilation, representing any CPTP map.
#### Proposition: Stinespring dilation theorem (to be checked)
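Any CPTP map $\mathscr{E}:\mathscr{L}_1(\mathscr{H})\to \mathscr{L}_1(\mathscr{H})$ can be represented, for a suitable ancilla Hilbert space $\mathscr{K}$, density operator $\omega$ on $\mathscr{K}$, and unitary operator $U$ on $\mathscr{H}\otimes\mathscr{K}$, as: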
$$
\mathscr{E}(T)=\operatorname{Tr}_\mathscr{K}(U (T\otimes \omega)U^*)
$$
### Conditional operations
#### Definition of controlled-unitary operations
A controlled-unitary operation is
$$
U\coloneqq\sum_{a=1}^{n_1}|a\rangle\langle a|\otimes U_a
$$
where $(|a\rangle)_{a=1}^{n_1}$ is an orthonormal basis of the control space $\mathscr{K}$ (so $n_1=\dim\mathscr{K}$) and each $U_a$ is a unitary operator on the target space $\mathscr{H}$.
#### Principle of deferred measurement
All measurements that may occur in the process of executing a quantum computation may be relegated to the end of the quantum circuit, prior to which all operations are unitary.
## Section 2: Quantum entanglement
### Bell states and the EPR phenomenon
#### Definition of Bell states
The Bell states are the following four states:
$$
|\Phi^+\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle), |\Phi^-\rangle=\frac{1}{\sqrt{2}}(|00\rangle-|11\rangle)
$$
$$
|\Psi^+\rangle=\frac{1}{\sqrt{2}}(|01\rangle+|10\rangle), |\Psi^-\rangle=\frac{1}{\sqrt{2}}(|01\rangle-|10\rangle)
$$
These form an orthonormal basis of the two-qubit Hilbert space.
[The section discussing the EPR phenomenon is ignored here, the key to remember is that there exists no classical (local) explanation for the correlation between the two qubits.]
### Von Neumann entropy and maximally entangled states
#### Definition of EPR state
A vector $|\psi\rangle$ on tensor product space $\mathscr{H}_1\otimes\mathscr{H}_2$ is called an EPR state if it is of the form:
$$
|\psi\rangle=\frac{1}{\sqrt{n}}\sum_{i=1}^n |i\rangle_1|i\rangle_2
$$
where $(|i\rangle_1)$ and $(|i\rangle_2)$ are orthonormal bases of $\mathscr{H}_1$ and $\mathscr{H}_2$ respectively, both of dimension $n$.
This describes a maximally entangled state.
#### Weyl operators
Let $\mathscr{H}$ be an $n$-dimensional Hilbert space with orthonormal basis $(|i\rangle)_{i=0}^{n-1}$.
The shift operator $X$ is defined as:
$$
X|i\rangle=|i+1 \bmod n\rangle
$$
Note that $X$ permutes the basis elements cyclically. Let $\omega=e^{2\pi i/n}$, then $1,\omega,\omega^2,\cdots,\omega^{n-1}$ are the $n$-th roots of unity.
The phase operator $Z$ is defined as:
$$
Z|i\rangle=\omega^i|i\rangle
$$
The Weyl operators are the following operators:
$$
W_{ab}=X^aZ^b
$$
where $a,b\in\{0,1,\cdots,n-1\}$.
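The shift and phase operators satisfy the commutation relation $ZX=\omega XZ$, which can be checked numerically. This is a minimal sketch, assuming the basis is labeled $0,\dots,n-1$ and the shift acts modulo $n$; the dimension $n=3$ is an arbitrary choice.

```python
import cmath

n = 3                                   # dimension, chosen for illustration
omega = cmath.exp(2j * cmath.pi / n)

# X|i> = |i+1 mod n>, Z|i> = omega^i |i>, as matrices in the basis |0>,...,|n-1>.
X = [[1 if r == (c + 1) % n else 0 for c in range(n)] for r in range(n)]
Z = [[omega ** r if r == c else 0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


ZX = matmul(Z, X)
XZ = matmul(X, Z)
omega_XZ = [[omega * e for e in row] for row in XZ]
X_cubed = matmul(X, matmul(X, X))
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
print(close(ZX, omega_XZ), close(X_cubed, identity))
```

Because of this relation, reordering a product of Weyl operators only changes it by a power of $\omega$.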
#### Definition of von Neumann entropy
The von Neumann entropy of a density operator $\rho$ is defined as:
$$
S(\rho)=-\operatorname{Tr}(\rho\log\rho)=-\sum_{i}\mu_i\log\mu_i
$$
where $\mu_i$ are the eigenvalues of $\rho$.
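Since the entropy depends only on the eigenvalues, it can be computed from the spectrum alone. The sketch below assumes base-2 logarithms (entropy in bits) and the convention $0\log 0=0$; the notes do not fix the base, so this is an assumption.

```python
import math


def von_neumann_entropy(eigenvalues, base=2.0):
    """S = -sum mu_i log mu_i, with the convention 0 log 0 = 0."""
    return -sum(mu * math.log(mu, base) for mu in eigenvalues if mu > 0)


s_pure = von_neumann_entropy([1.0, 0.0])          # pure qubit state
s_mixed = von_neumann_entropy([0.5, 0.5])         # maximally mixed qubit, I/2
s_qutrit = von_neumann_entropy([1 / 3] * 3)       # maximally mixed in dimension 3
print(s_pure, s_mixed, s_qutrit)
```

A pure state has zero entropy, and the maximally mixed state in dimension $n$ has entropy $\log_2 n$. The latter is also the reduced state of the EPR state above, which is why that state is called maximally entangled.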
## Section 3: Information transmission by quantum systems
### Transmission of classical information
#### Transmission over information channels
Let the measurement be defined by the POVM $\{E_y\}$. The conditional probability of obtaining signal $y$ at the output, given that the input is $x$, is:
$$
p_E(y|x)=\operatorname{Tr}(\rho_x E_y)
$$
where $\rho_x$ is the density operator of the input state, $E_y$ is the measurement operator for the output signal $y$.
#### Holevo bound
The maximal amount of classical information that can be transmitted by a quantum system is limited by the Holevo bound: a quantum system with $d$ levels can transmit at most $\log_2(d)$ bits of classical information.
> The fact that Hilbert space contains infinitely many different state vectors does not aid us in transmitting an unlimited amount of information. The more states are used for transmission, the closer they are to each other and hence they become less and less distinguishable.
### Making use of entanglement and local operations
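No information can be gained by measuring only one qubit of a maximally entangled pair: its reduced state is the maximally mixed state $\frac{1}{2}I$, so every outcome is equally likely.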
### Superdense coding [very important]
It is a procedure defined as follows:
Suppose $A$ and $B$ share a Bell state $|\Phi^+\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$, where $A$ holds the first part and $B$ holds the second part.
$A$ wishes to send 2 classical bits to $B$.
$A$ applies one of the four Pauli unitaries (for example $I$, $X$, $Z$, or $ZX$) to her qubit, chosen according to the 2 bits, and then sends her qubit to $B$.
This operation maps the shared entangled state to one of the four orthogonal Bell states.
$B$ performs a measurement in the Bell basis on the two qubits he now holds: the one he received and the one he already had.
$B$ decodes the result and obtains the 2 classical bits sent by $A$.
![Superdense coding](https://notenextra.trance-0.com/Math401/Superdense_coding.png)
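The protocol can be simulated on state vectors. In the sketch below, the assignment of bit pairs to Pauli operators (first bit applies $X$, second bit then applies $Z$) is an assumed convention, and `encode`, `decode`, and `apply_on_first` are hypothetical helper names.

```python
import math

R = 1 / math.sqrt(2)
# Amplitudes in the basis |00>, |01>, |10>, |11>; the first qubit belongs to A.
BELL = {
    (0, 0): [R, 0.0, 0.0, R],     # |Phi+>
    (0, 1): [R, 0.0, 0.0, -R],    # |Phi->
    (1, 0): [0.0, R, R, 0.0],     # |Psi+>
    (1, 1): [0.0, R, -R, 0.0],    # |Psi->
}
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def apply_on_first(gate, state):
    """Apply a 2x2 gate to the first qubit of a two-qubit state."""
    out = [0.0] * 4
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                out[2 * a2 + b] += gate[a2][a] * state[2 * a + b]
    return out


def encode(bits):
    """A's step: start from |Phi+>, apply X if the first bit is 1,
    then Z if the second bit is 1 (assumed convention)."""
    x_bit, z_bit = bits
    state = BELL[(0, 0)]
    if x_bit:
        state = apply_on_first(X, state)
    if z_bit:
        state = apply_on_first(Z, state)
    return state


def decode(state):
    """B's Bell-basis measurement: return the label of the Bell state
    with the largest overlap probability."""
    probs = {label: abs(sum(b * s for b, s in zip(vec, state))) ** 2
             for label, vec in BELL.items()}
    return max(probs, key=probs.get)


decoded = {bits: decode(encode(bits)) for bits in BELL}
print(decoded)
```

Each of the four messages is mapped to a different Bell state, and because the Bell states are orthogonal, $B$ recovers both bits with certainty from a single transmitted qubit.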
## Section 4: Quantum automorphisms and dynamics
Section ignored.

@@ -0,0 +1 @@
# Math401 Topic 7: Basic of quantum circuits

@@ -0,0 +1,16 @@
export default {
Math401_T1: "Math 401, Topic 1: Probability under language of measure theory",
Math401_T2: "Math 401, Topic 2: Finite-dimensional Hilbert spaces",
Math401_T3: "Math 401, Topic 3: Separable Hilbert spaces",
Math401_T4: "Math 401, Topic 4: The quantum version of probabilistic concepts",
Math401_T5: "Math 401, Topic 5: Introducing dynamics: classical and non-commutative",
Math401_T6: "Math 401, Topic 6: Postulates of quantum theory and measurement operations",
Math401_T7: "Math 401, Topic 7: Basic of quantum circuits",
"---":{
type: 'separator'
},
Math401_P1: "Math 401, Paper 1: Concentration of measure effects in quantum information (Patrick Hayden)",
Math401_P1_1: "Math 401, Paper 1, Side note 1: Quantum information theory and Measure concentration",
Math401_P1_2: "Math 401, Paper 1, Side note 2: Page's lemma",
Math401_P1_3: "Math 401, Paper 1, Side note 3: Levy's concentration theorem",
}