optimize entropy

This commit is contained in:
Trance-0
2026-02-01 13:41:12 -06:00
parent c18f798c16
commit 20f486cccb
16 changed files with 696 additions and 712 deletions


@@ -27,3 +27,12 @@ In this report, we will show the process of my exploration of the concentration
## About bibliography for the report
Since we are mostly referencing books: to the future self who wants to split out the bibliography, don't do so unless it exceeds 100 entries.
> [!TIP]
>
> To compile the bibliography for each chapter, run
>
> ```bash
> biber chapters/chap1
> pdflatex chapters/chap1.tex
> ```

BIN
chapters/chap0.pdf Normal file

Binary file not shown.


@@ -13,7 +13,9 @@
\addcontentsline{toc}{chapter}{Chapter 0: Brief definitions and basic concepts}
\markboth{Chapter 0: Brief definitions and basic concepts}{}
This section serves as a reference for definitions and theorems that we will use later. It can be safely ignored if you are already familiar with them.
As the future version of me might have forgotten everything we did over the summer, as I have now, I will review again from the simplest definitions to recall the necessary information: why we are here and how we will proceed.
This section serves as a reference for definitions, notations, and theorems that we will use later. It can be safely ignored if you are already familiar with them.
But for the future self who might have no idea what I'm talking about, we will provide detailed definitions to help you understand the concepts.
@@ -21,44 +23,38 @@ But for the future self who might have no idea what I'm talking about, we will p
The main vector space we are interested in is $\mathbb{C}^n$; therefore, all the linear operators we defined are from $\mathbb{C}^n$ to $\mathbb{C}^n$.
\begin{defn}
\label{defn:braket}
We denote a vector in a vector space as $\ket{\psi}=(z_1,\ldots,z_n)$ (possibly infinite dimensional, with $z_i\in\mathbb{C}$).
A natural inner product space defined on $\mathbb{C}^n$ is given by the Hermitian inner product:
\end{defn}
$$
\langle\psi|\varphi\rangle=\sum_{i=1}^n z_i^*w_i, \qquad \ket{\varphi}=(w_1,\ldots,w_n)
$$
This satisfies the following properties:
Here $\psi$ is just a label for the vector, and you don't need to worry about it too much. This is also called the ket, where the counterpart $\bra{\psi}$ is called the bra, used to denote the vector dual to $\psi$; such an element is a linear functional if you really want to know what that is.
\begin{enumerate}
\item $\bra{\psi}\sum_i \lambda_i\ket{\varphi}=\sum_i \lambda_i \langle\psi|\varphi\rangle$ (linear on the second argument. Note that in physics \cite{Nielsen_Chuang_2010} we use linear on the second argument and conjugate linear on the first argument. But in math, we use linear on the first argument and conjugate linear on the second argument \cite{Axler_2024}. As promised in the beginning, we will use the physics convention in this report.)
\item $\langle\varphi|\psi\rangle=(\langle\psi|\varphi\rangle)^*$
\item $\langle\psi|\psi\rangle\geq 0$ with equality if and only if $\ket{\psi}=0$
\end{enumerate}
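These three axioms can be sanity-checked numerically; a minimal sketch in Python/NumPy (the vectors below are arbitrary illustrations, not from the text; `np.vdot` conjugates its first argument, matching the physics convention used here):

```python
import numpy as np

psi = np.array([1 + 2j, 3 - 1j])
phi = np.array([0.5j, 2 + 1j])

# 1) linearity in the second argument
lam = 2 - 3j
assert np.isclose(np.vdot(psi, lam * phi), lam * np.vdot(psi, phi))

# 2) conjugate symmetry: <phi|psi> = <psi|phi>^*
assert np.isclose(np.vdot(phi, psi), np.conj(np.vdot(psi, phi)))

# 3) positivity: <psi|psi> is real and non-negative
assert np.isclose(np.vdot(psi, psi).imag, 0)
assert np.vdot(psi, psi).real > 0
print("inner product axioms verified")
```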
Here $\psi$ is just a label for the vector, and you don't need to worry about it too much. This is also called the ket, where the counterpart:
A few additional notations will be introduced; in this document, we follow the notation used in the mathematics literature \cite{axler2023linear}:
\begin{itemize}
\item $\bra{\psi}$ is called the bra, used to denote the vector dual to $\ket{\psi}$; such an element is a linear functional, if you really want to know what that is.
\item $\langle\psi|\varphi\rangle$ is the inner product between two vectors, and $\bra{\psi} A\ket{\varphi}$ is the inner product between $\ket{\psi}$ and $A\ket{\varphi}$, or equivalently between $A^\dagger\ket{\psi}$ and $\ket{\varphi}$.
\item Given a complex matrix $A\in\mathbb{C}^{n\times n}$,
\begin{enumerate}
\item $A^*$ is the complex conjugate of $A$.
i.e.,
\item $\overline{A}$ is the complex conjugate of $A$.
\begin{examples}
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i \\
4+i & 5+i & 6+i \\
7+i & 8+i & 9+i\end{bmatrix},
A^*=\begin{bmatrix}
\overline{A}=\begin{bmatrix}
1-i & 2-i & 3-i \\
4-i & 5-i & 6-i \\
7-i & 8-i & 9-i
\end{bmatrix}
$$
\end{examples}
\item $A^\top$ denotes the transpose of $A$.
i.e.,
\begin{examples}
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i \\
@@ -71,22 +67,24 @@ Here $\psi$ is just a label for the vector, and you don't need to worry about it
3+i & 6+i & 9+i
\end{bmatrix}
$$
\item $A^\dagger=(A^*)^\top$ denotes the complex conjugate transpose, referred to as the adjoint, or Hermitian conjugate of $A$.
i.e.,
\end{examples}
\item $A^*=\overline{(A^\top)}$ denotes the complex conjugate transpose, referred to as the adjoint, or Hermitian conjugate of $A$.
\begin{examples}
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i \\
4+i & 5+i & 6+i \\
7+i & 8+i & 9+i
\end{bmatrix},
A^\dagger=\begin{bmatrix}
A^*=\begin{bmatrix}
1-i & 4-i & 7-i \\
2-i & 5-i & 8-i \\
3-i & 6-i & 9-i
\end{bmatrix}
$$
\item $A$ is unitary if $A^\dagger A=AA^\dagger=I$.
\item $A$ is hermitian (self-adjoint in mathematics literature) if $A^\dagger=A$.
\end{examples}
\item $A$ is unitary if $A^* A=AA^*=I$.
\item $A$ is self-adjoint (hermitian in physics literature) if $A^*=A$.
\end{enumerate}
\end{itemize}
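These notational conventions can be checked numerically; a quick NumPy sketch (the $3\times 3$ matrix mirrors the example above, the unitary and self-adjoint matrices are arbitrary illustrations):

```python
import numpy as np

# the 3x3 example with entries k + i
A = np.arange(1, 10).reshape(3, 3) + 1j

assert np.isclose(np.conj(A)[0, 0], 1 - 1j)      # complex conjugate
assert np.isclose(A.T[0, 1], 4 + 1j)             # transpose swaps indices
assert np.allclose(A.conj().T, np.conj(A.T))     # adjoint = conj of transpose

# a unitary matrix: U^* U = U U^* = I (a real rotation)
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(U.conj().T @ U, np.eye(2))
assert np.allclose(U @ U.conj().T, np.eye(2))

# a self-adjoint (hermitian) matrix: H^* = H
H = np.array([[2, 1 - 1j], [1 + 1j, 3]])
assert np.allclose(H, H.conj().T)
print("notation checks pass")
```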
@@ -116,6 +114,7 @@ $$
And we wish to build a way to associate the basis of $V$ and $W$ with the basis of $V\otimes W$. That makes the tensor product a vector space with dimension $\dim V\times \dim W$.
\begin{defn}
\label{defn:linear_functional}
Definition of linear functional
A linear functional is a linear map from $V$ to $\mathbb{F}$.
@@ -133,7 +132,7 @@ A generalized linear map is a function $f: V\to W$ satisfying the condition.
\begin{defn}
\label{defn:bilinear_functional}
A bilinear functional is a bilinear function $\beta:V\times W\to \mathbb{F}$ satisfying the condition that $\ket{v}\to \beta(\ket{v},\ket{w})$ is a linear functional for all $\ket{w}\in W$ and $\ket{w}\to \beta(\ket{v},\ket{w})$ is a linear functional for all $\ket{v}\in V$.
\end{defn}
@@ -142,7 +141,7 @@ The vector space of all bilinear functionals is denoted by $\mathcal{B}(V, W)$.
\begin{defn}
\label{defn:tensor_product}
Let $V, W$ be two vector spaces.
Let $V'$ and $W'$ be the dual spaces of $V$ and $W$, respectively, that is $V'=\{\psi:V\to \mathbb{F}\}$ and $W'=\{\phi:W\to \mathbb{F}\}$, $\psi, \phi$ are linear functionals.
@@ -150,7 +149,7 @@ Let $V'$ and $W'$ be the dual spaces of $V$ and $W$, respectively, that is $V'=\
The tensor product of vectors $v\in V$ and $w\in W$ is the bilinear functional defined for all $(\psi,\phi)\in V'\times W'$ by the notation
$$
(v\otimes w)(\psi,\phi)\coloneqq\psi(v)\phi(w)
(v\otimes w)(\psi,\phi)=\psi(v)\phi(w)
$$
The tensor product of two vector spaces $V$ and $W$ is the vector space $\mathcal{B}(V',W')$
@@ -166,21 +165,25 @@ Here $\delta_{ij}=\begin{cases}
0 & \text{otherwise}
\end{cases}$ is the Kronecker delta.
\end{defn}
$$
V\otimes W=\left\{\sum_{i=1}^n \sum_{j=1}^m a_{ij} \phi_i(v)\psi_j(w): \phi_i\in V', \psi_j\in W'\right\}
$$
\end{defn}
Note that $\sum_{i=1}^n \sum_{j=1}^m a_{ij} \phi_i(v)\psi_j(w)$ is a bilinear functional that maps $V'\times W'$ to $\mathbb{F}$.
This enables basis-free construction of vector spaces with proper multiplication and scalar multiplication.
This vector space is equipped with the unique inner product $\langle v\otimes w, u\otimes x\rangle_{V\otimes W}$ defined by
\begin{defn}
\label{defn:inner_product_on_tensor_product}
The vector space defined by the tensor product is equipped with the unique inner product $\langle v\otimes w, u\otimes x\rangle_{V\otimes W}: V\otimes W\times V\otimes W\to \mathbb{F}$ defined by
$$
\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle_V\langle w,x\rangle_W
$$
\end{defn}
In practice, we ignore the subscript of the vector space and just write $\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle\langle w,x\rangle$.
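Concretely, on $\mathbb{C}^n$ the tensor product of vectors can be realized by the Kronecker product, and the defining identity $\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle\langle w,x\rangle$ can be verified numerically (a sketch; the vectors are arbitrary illustrations):

```python
import numpy as np

v = np.array([1 + 1j, 2.0])
u = np.array([0.5, 1 - 2j])
w = np.array([1j, 1.0, 3.0])
x = np.array([2.0, 1 + 1j, -1j])

# realize v (x) w as the Kronecker product in C^{nm}
lhs = np.vdot(np.kron(v, w), np.kron(u, x))   # <v(x)w, u(x)x>
rhs = np.vdot(v, u) * np.vdot(w, x)           # <v,u><w,x>
assert np.isclose(lhs, rhs)
print("tensor inner product identity holds")
```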
@@ -203,17 +206,10 @@ First, we define the Hilbert space in case one did not make the step from the li
That is, a vector space equipped with an inner product that is complete (every Cauchy sequence converges to a limit).
\begin{examples}
To introduce an example of Hilbert space we use when studying quantum mechanics, we need to introduce a common inner product used in $\mathbb{C}^n$.
\begin{defn}
\label{defn:Hermitian_inner_product}
Hermitian inner product:
On $\mathbb{C}^n$, the Hermitian inner product is defined by
$$
\langle u,v\rangle=\sum_{i=1}^n \overline{u_i}v_i
$$
\end{defn}
\begin{prop}
\label{prop:Hermitian_inner_product_with_complex_vectorspace}
@@ -253,6 +249,8 @@ To introduce an example of Hilbert space we use when studying quantum mechanics,
Therefore, since the Hermitian inner product fulfills the inner product axioms and $\C^n$ is complete, the complex vector space $\C^n$ with the Hermitian inner product is a Hilbert space.
\end{proof}
\end{examples}
Another classical example of a Hilbert space is $L^2(\Omega, \mathscr{F}, P)$, where $(\Omega, \mathscr{F}, P)$ is a measure space ($\Omega$ is a set, $\mathscr{F}$ is a $\sigma$-algebra on $\Omega$, and $P$ is a measure on $\mathscr{F}$). The $L^2$ space is the space of all functions on $\Omega$ that are
\begin{enumerate}
@@ -268,6 +266,9 @@ Another classical example of Hilbert space is $L^2(\Omega, \mathscr{F}, P)$, whe
\item \textbf{complex-valued}: functions are complex-valued measurable. $f=u+v i$ is complex-valued if $u$ and $v$ are real-valued measurable.
\end{enumerate}
\begin{examples}
\begin{prop}
\label{prop:L2_space_is_a_Hilbert_space}
$L^2(\Omega, \mathscr{F}, P)$ is a Hilbert space.
@@ -305,8 +306,9 @@ Another classical example of Hilbert space is $L^2(\Omega, \mathscr{F}, P)$, whe
\end{itemize}
\end{proof}
\end{examples}
Let $\mathscr{H}$ be a Hilbert space. $\mathscr{H}$ consists of complex-valued functions on a finite set $\Omega=\{1,2,\cdots,n\}$, and the functions $(e_1,e_2,\cdots,e_n)$ form an orthonormal basis of $\mathscr{H}$. (We use Dirac notation $|k\rangle$ to denote the basis vector $e_k$~\cite{parthasarathy1992quantum}.)
Let $\mathscr{H}$ be a Hilbert space. $\mathscr{H}$ consists of complex-valued functions on a finite set $\Omega=\{1,2,\ldots,n\}$, and the functions $(e_1,e_2,\ldots,e_n)$ form an orthonormal basis of $\mathscr{H}$. (We use Dirac notation $|k\rangle$ to denote the basis vector $e_k$~\cite{parthasarathy1992quantum}.)
As an analog to the classical probability space $(\Omega,\mathscr{F},\mu)$, which consists of a sample space $\Omega$ and a probability measure $\mu$ on the state space $\mathscr{F}$, the non-commutative probability space $(\mathscr{H},\mathscr{P},\rho)$ consists of a Hilbert space $\mathscr{H}$ and a state $\rho$ on the space of all orthogonal projections $\mathscr{P}$.
@@ -429,7 +431,6 @@ When operators commute, we recover classical probability measures.
\end{prop}
\begin{proof}
Ways to distinguish the two states:
\begin{enumerate}
\item Set $X=\{0,1,2\}$ and $M_i=|u_i\rangle\langle u_i|$, $M_0=I-M_1-M_2$
@@ -441,7 +442,7 @@ When operators commute, we recover classical probability measures.
\end{proof}
\textit{Intuitively, if the two states are not orthogonal, then for any measurement (projection) there exists non-zero probability of getting the same outcome for both states.}
Intuitively, if the two states are not orthogonal, then for any measurement (projection) there exists non-zero probability of getting the same outcome for both states.
Here is Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory} summarizing the analog of classical probability theory and non-commutative (\textit{quantum}) probability theory~\cite{Feres}:
@@ -450,7 +451,7 @@ Here is Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutativ
\renewcommand{\arraystretch}{1.5}
\caption{Analog of classical probability theory and non-commutative (\textit{quantum}) probability theory}
\label{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory}
{\tiny
{\small
\begin{tabular}{|p{0.5\linewidth}|p{0.5\linewidth}|}
\hline
\textbf{Classical probability} & \textbf{Non-commutative probability} \\

Binary file not shown.


@@ -28,7 +28,6 @@
\chapter{Concentration of Measure And Quantum Entanglement}
As the future version of me might have forgotten everything we did over the summer, as I have now, I will review again from the simplest definitions to recall the necessary information: why we are here and how we will proceed.
First, we will build the mathematical model describing the behavior of quantum systems and explain why it makes sense to physicists and is meaningful to the general public.
@@ -44,7 +43,7 @@ The light intensity decreases with $\alpha$ (the angle between the two filters).
However, for a system of 3 polarizing filters $F_1,F_2,F_3$, having directions $\alpha_1,\alpha_2,\alpha_3$, if we put them on the optical bench in pairs, then we will have three random variables $P_1,P_2,P_3$.
\begin{figure}[H]
\begin{figure}[h]
\centering
\includegraphics[width=0.7\textwidth]{Filter_figure.png}
\caption{The light polarization experiment, image from \cite{kummer1998elements}}
@@ -122,481 +121,60 @@ Probability of a photon passing through the filter $P_\alpha$ is given by $\lang
Since the probability of a photon passing through the three filters is not commutative, it is impossible to discuss $\operatorname{Prob}(P_1=1,P_3=0)$ in the classical setting.
The main vector space we are interested in is $\mathbb{C}^n$; therefore, all the linear operators we defined are from $\mathbb{C}^n$ to $\mathbb{C}^n$.
We denote a vector in a vector space as $\ket{\psi}=(z_1,\ldots,z_n)$ (possibly infinite dimensional, with $z_i\in\mathbb{C}$).
A natural inner product space defined on $\mathbb{C}^n$ is given by the Hermitian inner product
$$
\langle\psi|\varphi\rangle=\sum_{i=1}^n z_i^*w_i, \qquad \ket{\varphi}=(w_1,\ldots,w_n)
$$
We now show how the experimentally observed probability
$$
\frac{1}{2}\sin^2(\alpha_i-\alpha_j)
$$
arises from the operator model.
Assume the incoming light is \emph{unpolarized}. It is therefore described by
the density matrix
$$
\rho=\frac{1}{2} I .
$$
Let $P_{\alpha_i}$ and $P_{\alpha_j}$ be the orthogonal projections corresponding
to the two polarization filters with angles $\alpha_i$ and $\alpha_j$.
This satisfies the following properties:
\begin{enumerate}
\item $\bra{\psi}\sum_i \lambda_i\ket{\varphi}=\sum_i \lambda_i \langle\psi|\varphi\rangle$ (linear on the second argument. Note that in physics \cite{Nielsen_Chuang_2010} we use linear on the second argument and conjugate linear on the first argument. But in math, we use linear on the first argument and conjugate linear on the second argument \cite{Axler_2024}. As promised in the beginning, we will use the physics convention in this report.)
\item $\langle\varphi|\psi\rangle=(\langle\psi|\varphi\rangle)^*$
\item $\langle\psi|\psi\rangle\geq 0$ with equality if and only if $\ket{\psi}=0$
\end{enumerate}
Here $\psi$ is just a label for the vector, and you don't need to worry about it too much. This is also called the ket, where the counterpart:
\begin{itemize}
\item $\bra{\psi}$ is called the bra, used to denote the vector dual to $\ket{\psi}$; such an element is a linear functional, if you really want to know what that is.
\item $\langle\psi|\varphi\rangle$ is the inner product between two vectors, and $\bra{\psi} A\ket{\varphi}$ is the inner product between $\ket{\psi}$ and $A\ket{\varphi}$, or equivalently between $A^\dagger\ket{\psi}$ and $\ket{\varphi}$.
\item Given a complex matrix $A\in\mathbb{C}^{n\times n}$,
\begin{enumerate}
\item $A^*$ is the complex conjugate of $A$.
i.e.,
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i\\
4+i & 5+i & 6+i\\
7+i & 8+i & 9+i\end{bmatrix},
A^*=\begin{bmatrix}
1-i & 2-i & 3-i\\
4-i & 5-i & 6-i\\
7-i & 8-i & 9-i
\end{bmatrix}
$$
\item $A^\top$ denotes the transpose of $A$.
i.e.,
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i\\
4+i & 5+i & 6+i\\
7+i & 8+i & 9+i
\end{bmatrix},
A^\top=\begin{bmatrix}
1+i & 4+i & 7+i\\
2+i & 5+i & 8+i\\
3+i & 6+i & 9+i
\end{bmatrix}
$$
\item $A^\dagger=(A^*)^\top$ denotes the complex conjugate transpose, referred to as the adjoint, or Hermitian conjugate of $A$.
i.e.,
$$
A=\begin{bmatrix}
1+i & 2+i & 3+i\\
4+i & 5+i & 6+i\\
7+i & 8+i & 9+i
\end{bmatrix},
A^\dagger=\begin{bmatrix}
1-i & 4-i & 7-i\\
2-i & 5-i & 8-i\\
3-i & 6-i & 9-i
\end{bmatrix}
$$
\item $A$ is unitary if $A^\dagger A=AA^\dagger=I$.
\item $A$ is hermitian (self-adjoint in mathematics literature) if $A^\dagger=A$.
\end{enumerate}
\end{itemize}
\subsubsection{Motivation of Tensor product}
Recall that the traditional product of two vector spaces $V$ and $W$, written $V\times W$, is the set of all ordered pairs $(\ket{v},\ket{w})$ where $\ket{v}\in V$ and $\ket{w}\in W$.
This space has dimension $\dim V+\dim W$.
We want to define a vector space with the notation of multiplication of two vectors from different vector spaces.
That is
$$
(\ket{v_1}+\ket{v_2})\otimes \ket{w}=(\ket{v_1}\otimes \ket{w})+(\ket{v_2}\otimes \ket{w})
$$
$$
\ket{v}\otimes (\ket{w_1}+\ket{w_2})=(\ket{v}\otimes \ket{w_1})+(\ket{v}\otimes \ket{w_2})
$$
and enables scalar multiplication by
$$
\lambda (\ket{v}\otimes \ket{w})=(\lambda \ket{v})\otimes \ket{w}=\ket{v}\otimes (\lambda \ket{w})
$$
And we wish to build a way to associate the basis of $V$ and $W$ with the basis of $V\otimes W$. That makes the tensor product a vector space with dimension $\dim V\times \dim W$.
The probability that a photon passes the first filter $P_{\alpha_i}$ is given by the Born rule:
$$
\operatorname{Prob}(P_i=1)
=\operatorname{tr}(\rho P_{\alpha_i})
=\frac{1}{2} \operatorname{tr}(P_{\alpha_i})
=\frac{1}{2}
$$
If the photon passes the first filter, the post-measurement state is given by the L\"uders rule:
$$
\rho \longmapsto
\rho_i
=\frac{P_{\alpha_i}\rho P_{\alpha_i}}{\operatorname{tr}(\rho P_{\alpha_i})}
= P_{\alpha_i}.
$$
\begin{defn}
Definition of linear functional
A linear functional is a linear map from $V$ to $\mathbb{F}$.
\end{defn}
Note the difference between a linear functional and a linear map.
A general linear map is a function $f: V\to W$ satisfying the conditions:
\begin{itemize}
\item $f(\ket{u}+\ket{v})=f(\ket{u})+f(\ket{v})$
\item $f(\lambda \ket{v})=\lambda f(\ket{v})$
\end{itemize}
\begin{defn}
A bilinear functional is a bilinear function $\beta:V\times W\to \mathbb{F}$ satisfying the condition that $\ket{v}\to \beta(\ket{v},\ket{w})$ is a linear functional for all $\ket{w}\in W$ and $\ket{w}\to \beta(\ket{v},\ket{w})$ is a linear functional for all $\ket{v}\in V$.
\end{defn}
The vector space of all bilinear functionals is denoted by $\mathcal{B}(V, W)$.
\begin{defn}
Let $V, W$ be two vector spaces.
Let $V'$ and $W'$ be the dual spaces of $V$ and $W$, respectively, that is $V'=\{\psi:V\to \mathbb{F}\}$ and $W'=\{\phi:W\to \mathbb{F}\}$, $\psi, \phi$ are linear functionals.
The tensor product of vectors $v\in V$ and $w\in W$ is the bilinear functional defined for all $(\psi,\phi)\in V'\times W'$ by the notation
$$
(v\otimes w)(\psi,\phi)\coloneqq\psi(v)\phi(w)
$$
The probability that the photon then passes the second filter is
$$
\operatorname{Prob}(P_j=1 \mid P_i=1)
=\operatorname{tr}(P_{\alpha_i} P_{\alpha_j})
=\cos^2(\alpha_i-\alpha_j).
$$
The tensor product of two vector spaces $V$ and $W$ is the vector space $\mathcal{B}(V',W')$
Notice that the basis of such vector space is the linear combination of the basis of $V'$ and $W'$, that is, if $\{e_i\}$ is the basis of $V'$ and $\{f_j\}$ is the basis of $W'$, then $\{e_i\otimes f_j\}$ is the basis of $\mathcal{B}(V', W')$.
That is, every element of $\mathcal{B}(V', W')$ can be written as a linear combination of the basis.
Since $\{e_i\}$ and $\{f_j\}$ are bases of $V'$ and $W'$, respectively, then we can always find a set of linear functionals $\{\phi_i\}$ and $\{\psi_j\}$ such that $\phi_i(e_j)=\delta_{ij}$ and $\psi_j(f_i)=\delta_{ij}$.
Here $\delta_{ij}=\begin{cases}
1 & \text{if } i=j \\
0 & \text{otherwise}
\end{cases}$ is the Kronecker delta.
\end{defn}
Hence, the probability that the photon passes $P_{\alpha_i}$ and is then blocked by $P_{\alpha_j}$ is
$$
\begin{aligned}
\operatorname{Prob}(P_i=1, P_j=0)
&= \operatorname{Prob}(P_i=1)
- \operatorname{Prob}(P_i=1, P_j=1) \\
&= \frac12 - \frac12 \cos^2(\alpha_i-\alpha_j) \\
&= \frac12 \sin^2(\alpha_i-\alpha_j).
\end{aligned}
$$
$$
V\otimes W=\left\{\sum_{i=1}^n \sum_{j=1}^m a_{ij} \phi_i(v)\psi_j(w): \phi_i\in V', \psi_j\in W'\right\}
$$
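The whole derivation can be sanity-checked numerically using $2\times 2$ projections $P_\alpha=\ket{\alpha}\bra{\alpha}$ with $\ket{\alpha}=(\cos\alpha,\sin\alpha)$ (a NumPy sketch, not part of the LaTeX source; the angles are arbitrary):

```python
import numpy as np

def proj(alpha):
    """Rank-1 projection onto the polarization direction alpha."""
    k = np.array([np.cos(alpha), np.sin(alpha)])
    return np.outer(k, k)

ai, aj = 0.4, 1.1
rho = np.eye(2) / 2            # unpolarized light
Pi, Pj = proj(ai), proj(aj)

p_first = np.trace(rho @ Pi)                  # Born rule: 1/2
p_joint_pass = p_first * np.trace(Pi @ Pj)    # passes both filters
p_pass_block = p_first - p_joint_pass         # passes i, blocked by j

assert np.isclose(p_first, 0.5)
assert np.isclose(np.trace(Pi @ Pj), np.cos(ai - aj) ** 2)
assert np.isclose(p_pass_block, 0.5 * np.sin(ai - aj) ** 2)
print("matches (1/2) sin^2(alpha_i - alpha_j)")
```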
Note that $\sum_{i=1}^n \sum_{j=1}^m a_{ij} \phi_i(v)\psi_j(w)$ is a bilinear functional that maps $V'\times W'$ to $\mathbb{F}$.
This enables basis-free construction of vector spaces with proper multiplication and scalar multiplication.
This vector space is equipped with the unique inner product $\langle v\otimes w, u\otimes x\rangle_{V\otimes W}$ defined by
$$
\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle_V\langle w,x\rangle_W
$$
In practice, we ignore the subscript of the vector space and just write $\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle\langle w,x\rangle$.
This introduces a new model in mathematics explaining quantum mechanics: the non-commutative probability theory.
\section{Non-commutative probability theory}
The non-commutative probability theory is a branch of generalized probability theory that studies the probability of events in non-commutative algebras.
There are several main components of the generalized probability theory; let's see how we can formulate them, comparing with the classical probability theory.
First, we define the Hilbert space in case one did not make the step from the linear algebra courses like me.
\begin{defn}
\label{defn:Hilbert_space}
Hilbert space:
A Hilbert space is a complete inner product space.
\end{defn}
That is, a vector space equipped with an inner product that is complete (every Cauchy sequence converges to a limit).
To introduce an example of Hilbert space we use when studying quantum mechanics, we need to introduce a common inner product used in $\mathbb{C}^n$.
\begin{defn}
\label{defn:Hermitian_inner_product}
Hermitian inner product:
On $\mathbb{C}^n$, the Hermitian inner product is defined by
$$
\langle u,v\rangle=\sum_{i=1}^n \overline{u_i}v_i
$$
\end{defn}
\begin{prop}
\label{prop:Hermitian_inner_product_with_complex_vectorspace}
The Hermitian inner product on the complex vector space $\C^n$ makes it a Hilbert space.
\end{prop}
\begin{proof}
We first verify that the Hermitian inner product
$$
\langle u,v\rangle = \sum_{i=1}^n \overline{u_i} v_i
$$
on $\C^n$ satisfies the axioms of an inner product:
\begin{enumerate}
\item \textbf{Conjugate symmetry:} For all $u,v\in\C^n$,
$$
\langle u,v\rangle =\sum_{i=1}^n \overline{u_i} v_i=\overline{\sum_{i=1}^n \overline{v_i} u_i}=\overline{\langle v,u\rangle}.
$$
\item \textbf{Linearity:} For any $u,v,w\in\C^n$ and scalars $a,b\in\C$, we have
$$
\langle u, av + bw\rangle = \sum_{i=1}^n \overline{u_i} (av_i + bw_i)=a\langle u,v\rangle + b\langle u,w\rangle.
$$
\item \textbf{Positive definiteness:} For every $u=(u_1,u_2,\cdots,u_n)\in\C^n$, let $u_j=a_j+b_ji$, where $a_j,b_j\in\mathbb{R}$.
$$
\langle u,u\rangle = \sum_{j=1}^n \overline{u_j} u_j=\sum_{j=1}^n (a_j^2+b_j^2)\geq 0,
$$
with equality if and only if $u=0$.
Therefore, the Hermitian inner product is an inner product.
\end{enumerate}
Next, we show that $\C^n$ is complete with respect to the norm induced by this inner product:
$$
\|u\| = \sqrt{\langle u,u\rangle}.
$$
Since $\C^n$ is finite-dimensional, every Cauchy sequence (with respect to any norm) converges in $\C^n$. This is a standard result in finite-dimensional normed spaces, which implies that $\C^n$ is indeed complete.
Therefore, since the Hermitian inner product fulfills the inner product axioms and $\C^n$ is complete, the complex vector space $\C^n$ with the Hermitian inner product is a Hilbert space.
\end{proof}
Another classical example of a Hilbert space is $L^2(\Omega, \mathscr{F}, P)$, where $(\Omega, \mathscr{F}, P)$ is a measure space ($\Omega$ is a set, $\mathscr{F}$ is a $\sigma$-algebra on $\Omega$, and $P$ is a measure on $\mathscr{F}$). The $L^2$ space is the space of all functions on $\Omega$ that are
\begin{enumerate}
\item \textbf{square integrable}: square integrable functions are the functions $f:\Omega\to \mathbb{C}$ such that
$$
\int_\Omega |f(\omega)|^2 dP(\omega)<\infty
$$
with inner product defined by
$$
\langle f,g\rangle=\int_\Omega \overline{f(\omega)}g(\omega)dP(\omega)
$$
\item \textbf{complex-valued}: functions are complex-valued measurable. $f=u+v i$ is complex-valued if $u$ and $v$ are real-valued measurable.
\end{enumerate}
\begin{prop}
\label{prop:L2_space_is_a_Hilbert_space}
$L^2(\Omega, \mathscr{F}, P)$ is a Hilbert space.
\end{prop}
\begin{proof}
We check the two conditions of the Hilbert space:
\begin{itemize}
\item Completeness:
Let $(f_n)$ be a Cauchy sequence in $L^2(\Omega, \mathscr{F}, P)$. Then for any $\epsilon>0$, there exists an $N$ such that for all $m,n\geq N$, we have
$$
\int_\Omega |f_m(\omega)-f_n(\omega)|^2 dP(\omega)<\epsilon^2
$$
This means that $(f_n)$ is a Cauchy sequence in the norm of $L^2(\Omega, \mathscr{F}, P)$; by the Riesz--Fischer theorem, it converges in this norm to some $f\in L^2(\Omega, \mathscr{F}, P)$, so the space is complete.
\item Inner product:
The inner product is defined by
$$
\langle f,g\rangle=\int_\Omega \overline{f(\omega)}g(\omega)dP(\omega)
$$
This is a well-defined inner product on $L^2(\Omega, \mathscr{F}, P)$. We can check the properties of the inner product:
\begin{itemize}
\item Linearity:
$$
\langle af+bg,h\rangle=a\langle f,h\rangle+b\langle g,h\rangle
$$
\item Conjugate symmetry:
$$
\langle f,g\rangle=\overline{\langle g,f\rangle}
$$
\item Positive definiteness:
$$
\langle f,f\rangle\geq 0
$$
\end{itemize}
\end{itemize}
\end{proof}
Let $\mathscr{H}$ be a Hilbert space. $\mathscr{H}$ consists of complex-valued functions on a finite set $\Omega=\{1,2,\cdots,n\}$, and the functions $(e_1,e_2,\cdots,e_n)$ form an orthonormal basis of $\mathscr{H}$. (We use Dirac notation $|k\rangle$ to denote the basis vector $e_k$~\cite{parthasarathy1992quantum}.)
As an analog to the classical probability space $(\Omega,\mathscr{F},\mu)$, which consists of a sample space $\Omega$ and a probability measure $\mu$ on the state space $\mathscr{F}$, the non-commutative probability space $(\mathscr{H},\mathscr{P},\rho)$ consists of a Hilbert space $\mathscr{H}$ and a state $\rho$ on the space of all orthogonal projections $\mathscr{P}$.
The detailed definition of the non-commutative probability space is given below:
\begin{defn}
\label{defn:non-commutative_probability_space}
Non-commutative probability space:
A non-commutative probability space is a pair $(\mathscr{B}(\mathscr{H}),\mathscr{P})$, where $\mathscr{B}(\mathscr{H})$ is the set of all \textbf{bounded} linear operators on $\mathscr{H}$.
A linear operator $A$ on $\mathscr{H}$ is \textbf{bounded} if there exists $M>0$ such that $\|Au\|\leq M$ for all $u$ with $\|u\|\leq 1$.
$\mathscr{P}$ is the set of all orthogonal projections on $\mathscr{B}(\mathscr{H})$.
The set $\mathscr{P}=\{P\in\mathscr{B}(\mathscr{H}):P^*=P=P^2\}$ is the set of all orthogonal projections on $\mathscr{B}(\mathscr{H})$.
\end{defn}
Recall that in classical probability theory we call the initial probability distribution over the possible outcomes our \textit{state}; similarly, we need to define the \textit{state} in non-commutative probability theory.
\begin{defn}
\label{defn:state}
Non-commutative probability state:
A state on $(\mathscr{B}(\mathscr{H}),\mathscr{P})$ is a map $\rho:\mathscr{P}\to[0,1]$ (commonly represented by a density operator) such that:
\begin{itemize}
\item $\rho(O)=0$, where $O$ is the zero projection, and $\rho(I)=1$, where $I$ is the identity projection.
\item If $P_1,P_2,\ldots,P_n$ are pairwise disjoint orthogonal projections, then $\rho(P_1 + P_2 + \cdots + P_n) = \sum_{i=1}^n \rho(P_i)$.
\end{itemize}
\end{defn}
An example of a density operator can be given as follows:
If $(|\psi_1\rangle,|\psi_2\rangle,\cdots,|\psi_n\rangle)$ is an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of $\rho$, for the eigenvalues $p_1,p_2,\cdots,p_n$, then $p_j\geq 0$ and $\sum_{j=1}^n p_j=1$.
We can write $\rho$ as
\[
\rho=\sum_{j=1}^n p_j|\psi_j\rangle\langle\psi_j|
\]
(Under basis $|\psi_j\rangle$, it is a diagonal matrix with $p_j$ on the diagonal.)
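Such a $\rho$ can be constructed and checked numerically; a sketch (the eigenbasis and weights below are arbitrary illustrations):

```python
import numpy as np

# an orthonormal eigenbasis (columns of a unitary) and weights p_j
psi = np.linalg.qr(np.array([[1, 1j], [1j, 2]]))[0]
p = np.array([0.7, 0.3])                 # p_j >= 0, sum to 1

# rho = sum_j p_j |psi_j><psi_j|
rho = sum(p[j] * np.outer(psi[:, j], psi[:, j].conj()) for j in range(2))

assert np.isclose(np.trace(rho), 1)                  # rho(I) = 1
assert np.allclose(rho, rho.conj().T)                # self-adjoint
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)     # positive semidefinite
assert np.allclose(np.sort(np.linalg.eigvalsh(rho)), [0.3, 0.7])
print("valid density operator")
```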
% Then we need to introduce a theorem that ensures that every state on the space of all orthogonal projections on $\mathscr{H}$ can be represented by a density operator.
% \begin{theorem}
% \label{theorem:Gleason's_theorem}
% Gleason's theorem (Theorem 1.1.15 in~\cite{parthasarathy2005mathematical})
% Let $\mathscr{H}$ be a Hilbert space over $\mathbb{C}$ or $\mathbb{R}$ of dimension $n\geq 3$. Let $\mu$ be a state on the space $\mathscr{P}$ of projections on $\mathscr{H}$. Then there exists a unique density operator $\rho$ such that
% \[
% \mu(P)=\operatorname{Tr}(\rho P)
% \]
% for all $P\in\mathscr{P}$. $\mathscr{P}$ is the space of all orthogonal projections on $\mathscr{H}$.
% \end{theorem}
% This proof came from~\cite{parthasarathy2005mathematical}.
% \begin{proof}
% % TODO: FILL IN THE PROOF
% \end{proof}
% This theorem is a very important theorem in non-commutative probability theory; it states that any state on the space of all orthogonal projections on $\mathscr{H}$ can be represented by a density operator.
The counterpart of the random variable in non-commutative probability theory is called an observable, which is a Hermitian operator on $\mathscr{H}$ (for all $\psi,\phi$ in the domain of $A$, we have $\langle A\psi,\phi\rangle=\langle\psi,A\phi\rangle$; this kind of operator ensures that the outcomes we interpret as probabilities are real numbers).
\begin{defn}
\label{defn:observable}
Observable:
Let $\mathscr{B}(\mathbb{R})$ be the set of all Borel sets on $\mathbb{R}$.
A random variable on the Hilbert space $\mathscr{H}$ is a projection-valued map (measure) $P:\mathscr{B}(\mathbb{R})\to\mathscr{P}$.
With the following properties:
\begin{itemize}
\item $P(\emptyset)=O$ (the zero projection)
\item $P(\mathbb{R})=I$ (the identity projection)
\item For any sequence $A_1,A_2,\cdots,A_n\in \mathscr{B}(\mathbb{R})$, the following holds:
\begin{itemize}
\item $P(\bigcup_{i=1}^n A_i)=\bigvee_{i=1}^n P(A_i)$
\item $P(\bigcap_{i=1}^n A_i)=\bigwedge_{i=1}^n P(A_i)$
\item $P(A^c)=I-P(A)$
\item If $A_j$ are mutually disjoint (that is $P(A_i)P(A_j)=P(A_j)P(A_i)=O$ for $i\neq j$), then $P(\bigcup_{j=1}^n A_j)=\sum_{j=1}^n P(A_j)$
\end{itemize}
\end{itemize}
\end{defn}
\begin{defn}
\label{defn:probability_of_random_variable}
Probability of a random variable:
For a system prepared in state $\rho$, the probability that the random variable given by the projection-valued measure $P$ is in the Borel set $A$ is $\operatorname{Tr}(\rho P(A))$.
\end{defn}
When operators commute, we recover classical probability measures.
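A sketch of this recovery: when $\rho$ and all projections are diagonal in the same basis, $\operatorname{Tr}(\rho P(A))$ reduces to an ordinary probability measure (the distribution below is an arbitrary illustration):

```python
import numpy as np

p = np.array([0.2, 0.5, 0.3])      # a classical distribution on {0,1,2}
rho = np.diag(p)                    # diagonal density operator

def P(A, n=3):
    """Diagonal projection onto the coordinates in the event A."""
    d = np.zeros(n)
    d[list(A)] = 1
    return np.diag(d)

# Tr(rho P(A)) equals the classical probability of A
assert np.isclose(np.trace(rho @ P({0, 2})), 0.2 + 0.3)
assert np.isclose(np.trace(rho @ P({0, 1, 2})), 1.0)
# additivity on disjoint events
assert np.isclose(np.trace(rho @ P({0})) + np.trace(rho @ P({1})),
                  np.trace(rho @ P({0, 1})))
print("classical measure recovered")
```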
\begin{defn}
\label{defn:measurement}
Definition of measurement:
A measurement (observation) of a system prepared in a given state produces an outcome $x$, $x$ is a physical event that is a subset of the set of all possible outcomes. For each $x$, we associate a measurement operator $M_x$ on $\mathscr{H}$.
Given the initial state (pure state, unit vector) $u$, the probability of measurement outcome $x$ is given by:
\[
p(x)=\|M_xu\|^2
\]
Note that to make sense of this definition, the collection of measurement operators $\{M_x\}$ must satisfy the completeness requirement:
\[
1=\sum_{x\in X} p(x)=\sum_{x\in X}\|M_xu\|^2=\sum_{x\in X}\langle M_xu,M_xu\rangle=\langle u,(\sum_{x\in X}M_x^*M_x)u\rangle
\]
So $\sum_{x\in X}M_x^*M_x=I$.
\end{defn}
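A minimal sketch of this bookkeeping: projective measurement operators built from an orthonormal basis satisfy the completeness requirement, and the probabilities $p(x)=\|M_xu\|^2$ sum to 1 (the basis and state below are arbitrary illustrations):

```python
import numpy as np

# measurement operators M_x = |e_x><e_x| from an orthonormal basis of C^2
e = np.eye(2)
M = [np.outer(e[x], e[x].conj()) for x in range(2)]

# completeness: sum_x M_x^* M_x = I
assert np.allclose(sum(m.conj().T @ m for m in M), np.eye(2))

# Born rule p(x) = ||M_x u||^2 for a unit state u
u = np.array([0.6, 0.8])
probs = [np.linalg.norm(m @ u) ** 2 for m in M]
assert np.isclose(sum(probs), 1.0)
assert np.allclose(probs, [0.36, 0.64])
print("completeness and normalization hold")
```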
\begin{prop}
\label{prop:indistinguishability}
Proposition of indistinguishability:
Suppose that we have two states $u_1,u_2\in \mathscr{H}$; the two states are distinguishable if and only if they are orthogonal.
\end{prop}
\begin{proof}
Suppose first that $u_1\perp u_2$. The following measurement distinguishes the two states:
\begin{enumerate}
\item Set $X=\{0,1,2\}$, $M_1=|u_1\rangle\langle u_1|$, $M_2=|u_2\rangle\langle u_2|$, and $M_0=I-M_1-M_2$.
\item Since $u_1\perp u_2$, $\{M_0,M_1,M_2\}$ is a complete collection of measurement operators on $\mathscr{H}$.
\item If the prepared state is $u_1$, then $p(1)=\|M_1u_1\|^2=\|u_1\|^2=1$, $p(2)=\|M_2u_1\|^2=0$, and $p(0)=\|M_0u_1\|^2=0$; symmetrically, $u_2$ yields outcome $2$ with certainty.
\end{enumerate}
Conversely, if the states are not orthogonal, no choice of measurement operators can perfectly distinguish them.
\end{proof}
\textit{Intuitively, if the two states are not orthogonal, then for any measurement (projection) there exists non-zero probability of getting the same outcome for both states.}
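A concrete two-state illustration (my own, not from the cited sources):
\begin{examples}
Take $u_1=\vert 0\rangle$ and $u_2=\vert +\rangle=\frac{1}{\sqrt{2}}(\vert 0\rangle+\vert 1\rangle)$, so $\langle u_1,u_2\rangle=\frac{1}{\sqrt{2}}\neq 0$. For the computational-basis measurement $M_0=\vert 0\rangle\langle 0\vert$, $M_1=\vert 1\rangle\langle 1\vert$, the state $u_1$ gives outcome $0$ with probability $1$, while $u_2$ also gives outcome $0$ with probability $\vert\langle 0\vert+\rangle\vert^2=\frac{1}{2}$; observing outcome $0$ therefore cannot certify which of the two states was prepared.
\end{examples}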
Here is Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory} summarizing the analog of classical probability theory and non-commutative (\textit{quantum}) probability theory~\cite{Feres}:
\begin{table}
\centering
\renewcommand{\arraystretch}{1.5}
\caption{Analog of classical probability theory and non-commutative (\textit{quantum}) probability theory}
\label{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory}
{\tiny
\begin{tabular}{|p{0.5\linewidth}|p{0.5\linewidth}|}
\hline
\textbf{Classical probability} & \textbf{Non-commutative probability} \\
\hline
Sample space $\Omega$, cardinality $\vert\Omega\vert=n$, example: $\Omega=\{0,1\}$ & Complex Hilbert space $\mathscr{H}$, dimension $\dim\mathscr{H}=n$, example: $\mathscr{H}=\mathbb{C}^2$ \\
\hline
Commutative algebra of $\mathbb{C}$-valued functions & Algebra of bounded operators $\mathscr{B}(\mathscr{H})$ \\
\hline
$f\mapsto \bar{f}$ complex conjugation & $P\mapsto P^*$ adjoint \\
\hline
Events: indicator functions of sets & Projections: space of orthogonal projections $\mathscr{P}\subseteq\mathscr{B}(\mathscr{H})$ \\
\hline
functions $f$ such that $f^2=f=\overline{f}$ & orthogonal projections $P$ such that $P^*=P=P^2$ \\
\hline
$\mathbb{R}$-valued functions $f=\overline{f}$ & self-adjoint operators $A=A^*$ \\
\hline
$\mathbb{I}_{f^{-1}(\{\lambda\})}$ is the indicator function of the set $f^{-1}(\{\lambda\})$ & $P(\lambda)$ is the orthogonal projection to eigenspace \\
\hline
$f=\sum_{\lambda\in \operatorname{Range}(f)}\lambda \mathbb{I}_{f^{-1}(\{\lambda\})}$ & $A=\sum_{\lambda\in \operatorname{sp}(A)}\lambda P(\lambda)$ \\
\hline
Probability measure $\mu$ on $\Omega$ & Density operator $\rho$ on $\mathscr{H}$ \\
\hline
Delta measure $\delta_\omega$ & Pure state $\rho=\vert\psi\rangle\langle\psi\vert$ \\
\hline
$\mu$ is non-negative measure and $\sum_{i=1}^n\mu(\{i\})=1$ & $\rho$ is positive semi-definite and $\operatorname{Tr}(\rho)=1$ \\
\hline
Expected value of random variable $f$ is $\mathbb{E}_{\mu}(f)=\sum_{i=1}^n f(i)\mu(\{i\})$ & Expected value of operator $A$ is $\mathbb{E}_\rho(A)=\operatorname{Tr}(\rho A)$ \\
\hline
Variance of random variable $f$ is $\operatorname{Var}_\mu(f)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))^2\mu(\{i\})$ & Variance of operator $A$ is $\operatorname{Var}_\rho(A)=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2$ \\
\hline
Covariance of random variables $f$ and $g$ is $\operatorname{Cov}_\mu(f,g)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))(g(i)-\mathbb{E}_\mu(g))\mu(\{i\})$ & Covariance of operators $A$ and $B$ is $\operatorname{Cov}_\rho(A,B)=\operatorname{Tr}(\rho A\circ B)-\operatorname{Tr}(\rho A)\operatorname{Tr}(\rho B)$ \\
\hline
Composite system is given by Cartesian product of the sample spaces $\Omega_1\times\Omega_2$ & Composite system is given by tensor product of the Hilbert spaces $\mathscr{H}_1\otimes\mathscr{H}_2$ \\
\hline
Product measure $\mu_1\times\mu_2$ on $\Omega_1\times\Omega_2$ & Product state $\rho_1\otimes\rho_2$ on $\mathscr{H}_1\otimes\mathscr{H}_2$ \\
\hline
Marginal distribution $\pi_*\mu$ & Partial trace $\operatorname{Tr}_2(\rho)$ \\
\hline
\end{tabular}
}
\vspace{0.5cm}
\end{table}
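One row of the dictionary worked out explicitly (a standard computation, added for illustration):
\begin{examples}
For the Bell state $\vert\psi\rangle=\frac{1}{\sqrt{2}}(\vert 00\rangle+\vert 11\rangle)$ on $\mathscr{H}_1\otimes\mathscr{H}_2=\mathbb{C}^2\otimes\mathbb{C}^2$, the partial trace of $\rho=\vert\psi\rangle\langle\psi\vert$ over the second factor is $\operatorname{Tr}_2(\rho)=\frac{1}{2}I$. This is the quantum analog of the uniform marginal $(\frac{1}{2},\frac{1}{2})$ on $\{0,1\}$, even though $\rho$ itself is a pure state.
\end{examples}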
This agrees with the experimentally observed transmission probabilities, but it should be emphasized that this quantity corresponds to a \emph{sequential measurement} rather than a joint probability in the classical sense.
\section{Concentration of measure phenomenon}
@@ -612,6 +190,8 @@ Here is Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutativ
That is, the function $f$ may change the distance between any pair of points of $X$ by at most a factor of $L$.
This is a stronger condition than continuity: every Lipschitz function is continuous, but not every continuous function is Lipschitz.
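Two textbook instances (added for reference):
\begin{examples}
The function $f(x)=\vert x\vert$ is $1$-Lipschitz on $\mathbb{R}$, since the reverse triangle inequality gives $\big\vert \vert x\vert-\vert y\vert \big\vert\le\vert x-y\vert$. In contrast, $f(x)=\sqrt{x}$ is continuous on $[0,1]$ but not Lipschitz: $\frac{\vert\sqrt{x}-\sqrt{0}\vert}{\vert x-0\vert}=\frac{1}{\sqrt{x}}\to\infty$ as $x\to 0^+$, so no finite constant $L$ works near $0$.
\end{examples}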
\begin{lemma}
\label{lemma:isoperimetric_inequality_on_sphere}
Isoperimetric inequality on the sphere:
@@ -681,7 +261,7 @@ If $X\sim \operatorname{Unif}(S^n(\sqrt{n}))$, then for any fixed unit vector $x
\begin{figure}[h]
\centering
\includegraphics[width=0.8\textwidth]{../images/maxwell.png}
\caption{Maxwell-Boltzmann distribution law, image from \cite{romanvershyni}}
\label{fig:Maxwell-Boltzmann_distribution_law}
\end{figure}
@@ -968,16 +548,13 @@ Experimentally, we can have the following result:
As the dimension of the Hilbert space increases, the chance of getting an almost maximally entangled state increases (see Figure~\ref{fig:entropy_vs_dA}).
\begin{figure}[h]
\centering
\includegraphics[width=0.8\textwidth]{entropy_vs_dA.png}
\caption{Entropy vs $d_A$}
\label{fig:entropy_vs_dA}
\end{figure}
In Hayden's work, the result is also extended to the multiparty case~\cite{Hayden}; this line of work is still an active area of research, and I will present those results in the final report if time permits.
% When compiled standalone, print this chapter's references at the end.
\ifSubfilesClassLoaded{
\printbibliography[title={References for Chapter 1}]
}{}


@@ -0,0 +1,48 @@
"""
plot the probability of the entropy of the reduced density matrix of the pure state being greater than log2(d_A) - alpha - beta
for different alpha values
IGNORE THE CONSTANT C
NOTE there is bug in the program, You should fix it if you want to use the visualization, it relates to the alpha range and you should not plot the prob of 0
"""
import numpy as np
import matplotlib.pyplot as plt
from quantum_states import sample_and_calculate
from tqdm import tqdm
# Set dimensions
db = 16
da_values = [8, 16, 32]
alpha_range = np.linspace(0, 2, 100) # Range of alpha values to plot
n_samples = 100000
plt.figure(figsize=(10, 6))
for da in tqdm(da_values, desc="Processing d_A values"):
# Calculate beta according to the formula
beta = da / (np.log(2) * db)
# Calculate probability for each alpha
predicted_probabilities = []
actual_probabilities = []
for alpha in tqdm(alpha_range, desc=f"Calculating probabilities for d_A={da}", leave=False):
# Calculate probability according to the formula
# Ignoring constant C as requested
prob = np.exp(-(da * db - 1) * alpha**2 / (np.log2(da))**2)
predicted_probabilities.append(prob)
# Calculate actual probability
entropies = sample_and_calculate(da, db, n_samples=n_samples)
actual_probabilities.append(np.sum(entropies > np.log2(da) - alpha - beta) / n_samples)
# plt.plot(alpha_range, predicted_probabilities, label=f'$d_A={da}$', linestyle='--')
plt.plot(alpha_range, actual_probabilities, label=f'$d_A={da}$', linestyle='-')
plt.xlabel(r'$\alpha$')
plt.ylabel('Probability')
plt.title(r'$\operatorname{Pr}[H(\psi_A) <\log_2(d_A)-\alpha-\beta]$ vs $\alpha$ for different $d_A$')
plt.legend()
plt.grid(True)
plt.yscale('log') # Use log scale for better visualization
plt.show()


@@ -0,0 +1,52 @@
"""
plot the probability of the entropy of the reduced density matrix of the pure state being greater than log2(d_A) - alpha - beta
for different d_A values, with fixed alpha and d_B Note, d_B>d_A
"""
import numpy as np
import matplotlib.pyplot as plt
from quantum_states import sample_and_calculate
from tqdm import tqdm
# Set dimensions
db = 32
alpha = 0
da_range = np.arange(2, 10, 1) # Range of d_A values to plot
n_samples = 1000000
plt.figure(figsize=(10, 6))
predicted_probabilities = []
actual_probabilities = []
for da in tqdm(da_range, desc="Processing d_A values"):
# Calculate beta according to the formula
beta = da / (np.log(2) * db)
# Calculate probability according to the formula
# Ignoring constant C as requested
prob = np.exp(-((da * db - 1) * alpha**2 / (np.log2(da)**2)))
predicted_probabilities.append(prob)
# Calculate actual probability
entropies = sample_and_calculate(da, db, n_samples=n_samples)
count = np.sum(entropies < np.log2(da) - alpha - beta)
# early stop if count is 0
if count != 0:
actual_probabilities.append(count / n_samples)
else:
actual_probabilities.extend([np.nan] * (len(da_range) - len(actual_probabilities)))
break
# debug
print(f'da={da}, theoretical_prob={prob}, threshold={np.log2(da) - alpha - beta}, actual_prob={actual_probabilities[-1]}, entropy_heads={entropies[:10]}')
# plt.plot(da_range, predicted_probabilities, label=f'$d_A={da}$', linestyle='--')
plt.plot(da_range, actual_probabilities, label=f'$d_A={da}$', linestyle='-')
plt.xlabel(r'$d_A$')
plt.ylabel('Probability')
plt.title(r'$\operatorname{Pr}[H(\psi_A) < \log_2(d_A)-\alpha-\beta]$ vs $d_A$ for fixed $\alpha=$'+str(alpha)+r' and $d_B=$' +str(db)+ r' with $n=$' +str(n_samples))
# plt.legend()
plt.grid(True)
plt.yscale('log') # Use log scale for better visualization
plt.show()


@@ -0,0 +1,55 @@
import numpy as np
import matplotlib.pyplot as plt
from quantum_states import sample_and_calculate
from tqdm import tqdm
# Set dimensions, keep db\geq da\geq 3
db = 64
da_values = [4, 8, 16, 32]
da_colors = ['b', 'g', 'r', 'c']
n_samples = 100000
plt.figure(figsize=(10, 6))
# Define range of deviations to test (in bits)
deviations = np.linspace(0, 1, 50) # Test deviations from 0 to 1 bits
for i, da in enumerate(tqdm(da_values, desc="Processing d_A values")):
# Calculate maximal entropy
max_entropy = np.log2(min(da, db))
# Sample random states and calculate their entropies
entropies = sample_and_calculate(da, db, n_samples=n_samples)
# Calculate probabilities for each deviation
probabilities = []
theoretical_probs = []
for dev in deviations:
# Count states that deviate by more than dev bits from max entropy
count = np.sum(max_entropy - entropies > dev)
# Omit the case where count is 0
if count != 0:
prob = count / len(entropies)
probabilities.append(prob)
else:
probabilities.append(np.nan)
# Calculate theoretical probability using concentration inequality
# note max_entropy - dev = max_entropy - beta - alpha, so alpha = dev - beta
beta = da / (np.log(2)*db)
alpha = dev - beta
theoretical_prob = np.exp(-(da * db - 1) * alpha**2 / (np.log2(da))**2)
# # debug
# print(f"dev: {dev}, beta: {beta}, alpha: {alpha}, theoretical_prob: {theoretical_prob}")
theoretical_probs.append(theoretical_prob)
plt.plot(deviations, probabilities, '-', label=f'$d_A={da}$ (simulated)', color=da_colors[i])
plt.plot(deviations, theoretical_probs, '--', label=f'$d_A={da}$ (theoretical)', color=da_colors[i])
plt.xlabel('Deviation from maximal entropy (bits)')
plt.ylabel('Probability')
plt.title(f'Probability of deviation from maximal entropy simulation with sample size {n_samples} for $d_B={db}$ ignoring the constant $C$')
plt.legend()
plt.grid(True)
plt.yscale('log') # Use log scale for better visualization
plt.show()


@@ -0,0 +1,33 @@
import numpy as np
import matplotlib.pyplot as plt
from quantum_states import sample_and_calculate
from tqdm import tqdm
# Define range of dimensions to test
fixed_dim = 64
dimensions = np.arange(2, 64, 2) # Test dimensions from 2 to 62 in steps of 2
expected_entropies = []
theoretical_entropies = []
predicted_entropies = []
# Calculate entropies for each dimension
for dim in tqdm(dimensions, desc="Calculating entropies"):
    # Keep subsystem B fixed at fixed_dim and vary the dimension of subsystem A
entropies = sample_and_calculate(dim, fixed_dim, n_samples=1000)
expected_entropies.append(np.mean(entropies))
theoretical_entropies.append(np.log2(min(dim, fixed_dim)))
beta = min(dim, fixed_dim)/(2*np.log(2)*max(dim, fixed_dim))
predicted_entropies.append(np.log2(min(dim, fixed_dim)) - beta)
# Create the plot
plt.figure(figsize=(10, 6))
plt.plot(dimensions, expected_entropies, 'b-', label='Expected Entropy')
plt.plot(dimensions, theoretical_entropies, 'r--', label='Theoretical Entropy')
plt.plot(dimensions, predicted_entropies, 'g--', label='Predicted Entropy')
plt.xlabel('Dimension of Subsystem A')
plt.ylabel('von Neumann Entropy (bits)')
plt.title(f'von Neumann Entropy vs. System Dimension, with Dimension of Subsystem B = {fixed_dim}')
plt.legend()
plt.grid(True)
plt.show()


@@ -0,0 +1,51 @@
import numpy as np
import matplotlib.pyplot as plt
from quantum_states import sample_and_calculate
from tqdm import tqdm
from mpl_toolkits.mplot3d import Axes3D
# Define range of dimensions to test
dimensionsA = np.arange(2, 64, 2) # Test dimensions from 2 to 62 in steps of 2
dimensionsB = np.arange(2, 64, 2) # Test dimensions from 2 to 62 in steps of 2
# Create meshgrid for 3D plot
X, Y = np.meshgrid(dimensionsA, dimensionsB)
Z = np.zeros_like(X, dtype=float)
# Calculate entropies for each dimension combination
total_iterations = len(dimensionsA) * len(dimensionsB)
pbar = tqdm(total=total_iterations, desc="Calculating entropies")
for i, dim_a in enumerate(dimensionsA):
for j, dim_b in enumerate(dimensionsB):
entropies = sample_and_calculate(dim_a, dim_b, n_samples=100)
Z[j,i] = np.mean(entropies)
pbar.update(1)
pbar.close()
# Create the 3D plot
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(111, projection='3d')
# Plot the surface
surf = ax.plot_surface(X, Y, Z, cmap='viridis')
# Add labels and title with larger font sizes
ax.set_xlabel('Dimension of Subsystem A', fontsize=12, labelpad=10)
ax.set_ylabel('Dimension of Subsystem B', fontsize=12, labelpad=10)
ax.set_zlabel('von Neumann Entropy (bits)', fontsize=12, labelpad=10)
ax.set_title('von Neumann Entropy vs. System Dimensions', fontsize=14, pad=20)
# Add colorbar
cbar = fig.colorbar(surf, ax=ax, label='Entropy')
cbar.ax.set_ylabel('Entropy', fontsize=12)
# Add tick labels with larger font size
ax.tick_params(axis='x', labelsize=10)
ax.tick_params(axis='y', labelsize=10)
ax.tick_params(axis='z', labelsize=10)
# Rotate the plot for better visibility
ax.view_init(elev=30, azim=45)
plt.show()

codes/quantum_states.py

@@ -0,0 +1,96 @@
import numpy as np
from tqdm import tqdm
def random_pure_state(dim_a, dim_b):
"""
Generate a random pure state for a bipartite system.
The random pure state is uniformly distributed by the Haar (Fubini-Study) measure on the unit sphere $S^{dim_a * dim_b - 1}$. (Invariant under the unitary group $U(dim_a) \times U(dim_b)$)
Args:
dim_a (int): Dimension of subsystem A
dim_b (int): Dimension of subsystem B
Returns:
numpy.ndarray: Random pure state vector of shape (dim_a * dim_b,)
"""
# Total dimension of the composite system
dim_total = dim_a * dim_b
# Generate non-zero random complex vector
while True:
state = np.random.normal(size=(dim_total,)) + 1j * np.random.normal(size=(dim_total,))
if np.linalg.norm(state) > 0:
break
# Normalize the state
state = state / np.linalg.norm(state)
return state
def von_neumann_entropy_bipartite_pure_state(state, dim_a, dim_b):
"""
Calculate the von Neumann entropy of the reduced density matrix.
Args:
state (numpy.ndarray): Pure state vector
dim_a (int): Dimension of subsystem A
dim_b (int): Dimension of subsystem B
Returns:
float: Von Neumann entropy
"""
# Reshape state vector to matrix form
state_matrix = state.reshape(dim_a, dim_b)
# Calculate reduced density matrix of subsystem A
rho_a = np.dot(state_matrix, state_matrix.conj().T)
# Calculate eigenvalues
eigenvals = np.linalg.eigvalsh(rho_a)
# Remove very small eigenvalues (numerical errors)
eigenvals = eigenvals[eigenvals > 1e-15]
# Calculate von Neumann entropy
entropy = -np.sum(eigenvals * np.log2(eigenvals))
return np.real(entropy)
def sample_and_calculate(dim_a, dim_b, n_samples=1000):
"""
    Sample random pure states and calculate their von Neumann entropy.
Args:
dim_a (int): Dimension of subsystem A
dim_b (int): Dimension of subsystem B
n_samples (int): Number of samples to generate
Returns:
numpy.ndarray: Array of entropy values
"""
entropies = np.zeros(n_samples)
for i in tqdm(range(n_samples), desc=f"Sampling states (d_A={dim_a}, d_B={dim_b})", leave=False):
state = random_pure_state(dim_a, dim_b)
entropies[i] = von_neumann_entropy_bipartite_pure_state(state, dim_a, dim_b)
return entropies
# Example usage:
if __name__ == "__main__":
    # Example: bipartite system with dim_a = 50, dim_b = 100
    dim_a, dim_b = 50, 100
# Generate single random state and calculate entropy
state = random_pure_state(dim_a, dim_b)
entropy = von_neumann_entropy_bipartite_pure_state(state, dim_a, dim_b)
print(f"Single state entropy: {entropy}")
# Sample multiple states
entropies = sample_and_calculate(dim_a, dim_b, n_samples=1000)
print(f"Expected entropy: {np.mean(entropies)}")
print(f"Theoretical entropy: {np.log2(max(dim_a, dim_b))}")
print(f"Standard deviation: {np.std(entropies)}")

codes/test.py

@@ -0,0 +1,32 @@
# unit test for the functions in quantum_states.py
import unittest
import numpy as np
from quantum_states import random_pure_state, von_neumann_entropy_bipartite_pure_state
class LearningCase(unittest.TestCase):
def test_random_pure_state_shape_and_norm(self):
dim_a = 2
dim_b = 2
state = random_pure_state(dim_a, dim_b)
self.assertEqual(state.shape, (dim_a * dim_b,))
self.assertAlmostEqual(np.linalg.norm(state), 1)
def test_partial_trace_entropy(self):
dim_a = 2
dim_b = 2
state = random_pure_state(dim_a, dim_b)
self.assertAlmostEqual(von_neumann_entropy_bipartite_pure_state(state, dim_a, dim_b), von_neumann_entropy_bipartite_pure_state(state, dim_b, dim_a))
    def test_sample_uniformly(self):
        # For Haar-uniform states, each component satisfies E[|psi_k|^2] = 1/dim
        dim_a, dim_b = 2, 2
        dim = dim_a * dim_b
        samples = np.array([np.abs(random_pure_state(dim_a, dim_b)) ** 2 for _ in range(20000)])
        np.testing.assert_allclose(samples.mean(axis=0), np.full(dim, 1 / dim), atol=0.02)
def main():
unittest.main()
if __name__ == "__main__":
main()


@@ -156,3 +156,13 @@
primaryClass={quant-ph},
url={https://arxiv.org/abs/1410.7188},
}
@book{axler2023linear,
title={Linear Algebra Done Right},
author={Axler, S.},
isbn={9783031410260},
series={Undergraduate Texts in Mathematics},
url={https://books.google.com/books?id=OdnfEAAAQBAJ},
year={2023},
publisher={Springer International Publishing}
}

main.pdf


@@ -34,6 +34,26 @@
giveninits=true
]{biblatex}
% --- Beamer-like blocks (printer-friendly) ---
\usepackage[most]{tcolorbox}
\usepackage{xcolor}
% A dedicated "Examples" block (optional convenience wrapper)
\newtcolorbox{examples}{%
enhanced,
breakable,
colback=white,
colframe=black!90,
coltitle=white, % title text color
colbacktitle=black!90, % dark title bar
boxrule=0.6pt,
arc=1.5mm,
left=1.2mm,right=1.2mm,top=1.0mm,bottom=1.0mm,
fonttitle=\bfseries,
title=Examples
}
% In the assembled book, we load *all* chapter bib files here,
% and print one combined bibliography at the end.
@@ -74,8 +94,8 @@
\mainmatter
% Each chapter is in its own file and included as a subfile.
% \subfile{preface}
% \subfile{chapters/chap0}
\subfile{chapters/chap1}
\subfile{chapters/chap2}
\subfile{chapters/chap3}