
% chapters/chap0.tex
\documentclass[../main.tex]{subfiles}

% If this chapter is compiled *by itself*, we must load only its own .bib
% and print its bibliography at the end of the chapter.
\ifSubfilesClassLoaded{
  \addbibresource{\subfix{../main.bib}}
}

\begin{document}

\chapter*{Chapter 0: Brief definitions and basic concepts}
\addcontentsline{toc}{chapter}{Chapter 0: Brief definitions and basic concepts}
\markboth{Chapter 0: Brief definitions and basic concepts}{}

Since a future version of me might forget everything we covered over the summer, as I already have, this chapter reviews the material from the basic definitions onward, recalling what is needed to explain why we are here and how we are going to proceed.

This chapter serves as a reference for the definitions, notation, and theorems that we will use later. It can be safely skipped if you are already familiar with them.

For the future self who might have no idea what I am talking about, we provide detailed definitions so that the concepts can be understood.

\section{Complex vector spaces}

The main vector space we are interested in is $\mathbb{C}^n$; therefore, all the linear operators we define are from $\mathbb{C}^n$ to $\mathbb{C}^n$.

\begin{defn}
    \label{defn:braket}

    We denote a vector in a vector space by $\ket{\psi}=(z_1,\ldots,z_n)$, where $z_i\in\mathbb{C}$ (the space might also be infinite-dimensional).

\end{defn}


Here $\psi$ is just a label for the vector, and you don't need to worry about it too much. The vector $\ket{\psi}$ is called a ket; its counterpart $\bra{\psi}$ is called a bra and denotes the vector dual to $\ket{\psi}$, that is, the linear functional $\ket{\varphi}\mapsto\langle\psi|\varphi\rangle$, if you really want to know what it is.

A few additional pieces of notation will be introduced. In this document, we follow the notation used in the mathematics literature~\cite{axler2023linear}.

\begin{itemize}
    \item $\langle\psi|\varphi\rangle$ is the inner product of two vectors, and $\bra{\psi} A\ket{\varphi}$ is the inner product of $\ket{\psi}$ and $A\ket{\varphi}$, or equivalently of $A^*\ket{\psi}$ and $\ket{\varphi}$, where $A^*$ is the adjoint defined below (written $A^\dagger$ in the physics literature).
    \item Given a complex matrix $A\in\mathbb{C}^{n\times n}$,
          \begin{enumerate}
              \item  $\overline{A}$ is the complex conjugate of $A$.
                    \begin{examples}
                        $$
                            A=\begin{bmatrix}
                                1+i & 2+i & 3+i \\
                                4+i & 5+i & 6+i \\
                                7+i & 8+i & 9+i\end{bmatrix},
                            \overline{A}=\begin{bmatrix}
                                1-i & 2-i & 3-i \\
                                4-i & 5-i & 6-i \\
                                7-i & 8-i & 9-i
                            \end{bmatrix}
                        $$
                    \end{examples}
              \item  $A^\top$ denotes the transpose of $A$.
                    \begin{examples}
                        $$
                            A=\begin{bmatrix}
                                1+i & 2+i & 3+i \\
                                4+i & 5+i & 6+i \\
                                7+i & 8+i & 9+i
                            \end{bmatrix},
                            A^\top=\begin{bmatrix}
                                1+i & 4+i & 7+i \\
                                2+i & 5+i & 8+i \\
                                3+i & 6+i & 9+i
                            \end{bmatrix}
                        $$
                    \end{examples}
              \item $A^*=\overline{(A^\top)}$ denotes the complex conjugate transpose, referred to as the adjoint, or Hermitian conjugate of $A$.
                    \begin{examples}
                        $$
                            A=\begin{bmatrix}
                                1+i & 2+i & 3+i \\
                                4+i & 5+i & 6+i \\
                                7+i & 8+i & 9+i
                            \end{bmatrix},
                            A^*=\begin{bmatrix}
                                1-i & 4-i & 7-i \\
                                2-i & 5-i & 8-i \\
                                3-i & 6-i & 9-i
                            \end{bmatrix}
                        $$
                    \end{examples}
              \item  $A$ is unitary if $A^* A=AA^*=I$.
              \item  $A$ is self-adjoint (Hermitian in the physics literature) if $A^*=A$.
          \end{enumerate}
\end{itemize}
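
As a quick check of the last two definitions, here is a small worked example. The matrix $Y$ below is chosen only for illustration and is not used elsewhere in this section.

\begin{examples}
    Let
    $$
        Y=\begin{bmatrix}
            0 & -i \\
            i & 0
        \end{bmatrix},
        \qquad
        \overline{Y}=\begin{bmatrix}
            0  & i \\
            -i & 0
        \end{bmatrix},
        \qquad
        Y^*=\overline{(Y^\top)}=\begin{bmatrix}
            0 & -i \\
            i & 0
        \end{bmatrix}=Y.
    $$
    So $Y$ is self-adjoint. Moreover,
    $$
        Y^*Y=YY^*=Y^2=\begin{bmatrix}
            (-i)(i) & 0       \\
            0       & (i)(-i)
        \end{bmatrix}=\begin{bmatrix}
            1 & 0 \\
            0 & 1
        \end{bmatrix}=I,
    $$
    so $Y$ is also unitary.
\end{examples}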

\subsubsection{Motivation of Tensor product}

Recall the traditional notion of the product space of two vector spaces $V$ and $W$: $V\times W$ is the set of all ordered pairs $(\ket{v},\ket{w})$ where $\ket{v}\in V$ and $\ket{w}\in W$.

This space has dimension $\dim V+\dim W$.

We want to define a vector space that carries a notion of multiplication of two vectors from different vector spaces.

That is, we want a product $\otimes$ that is additive in each argument,

$$
    (\ket{v_1}+\ket{v_2})\otimes \ket{w}=(\ket{v_1}\otimes \ket{w})+(\ket{v_2}\otimes \ket{w})
$$
$$
    \ket{v}\otimes (\ket{w_1}+\ket{w_2})=(\ket{v}\otimes \ket{w_1})+(\ket{v}\otimes \ket{w_2})
$$

and that is compatible with scalar multiplication:

$$
    \lambda (\ket{v}\otimes \ket{w})=(\lambda \ket{v})\otimes \ket{w}=\ket{v}\otimes (\lambda \ket{w})
$$

We also wish to associate bases of $V$ and $W$ with a basis of $V\otimes W$: the products of basis vectors should form a basis. This makes the tensor product a vector space of dimension $\dim V\cdot \dim W$.
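
To see why the ordinary product space $V\times W$ does not already do this job, the following sketch compares the two constructions. The spaces $V=\mathbb{C}^2$ and $W=\mathbb{C}^3$ are chosen only for illustration.

\begin{examples}
    For $V=\mathbb{C}^2$ and $W=\mathbb{C}^3$ we have
    $$
        \dim(V\times W)=2+3=5,
        \qquad
        \dim(V\otimes W)=2\cdot 3=6.
    $$
    In $V\times W$, a scalar acts on both components at once, $\lambda(\ket{v},\ket{w})=(\lambda\ket{v},\lambda\ket{w})$, so
    $$
        (2\ket{v},\ket{w})\neq(\ket{v},2\ket{w})
    $$
    whenever $\ket{v}\neq 0$ or $\ket{w}\neq 0$. In $V\otimes W$, by contrast, the scalar can be moved freely between the factors:
    $$
        (2\ket{v})\otimes\ket{w}=2(\ket{v}\otimes\ket{w})=\ket{v}\otimes(2\ket{w}).
    $$
\end{examples}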

\begin{defn}
    \label{defn:linear_functional}
    Definition of linear functional

    A linear functional is a linear map from a vector space $V$ to its scalar field $\mathbb{F}$ (for us, $\mathbb{F}=\mathbb{C}$).

\end{defn}

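Note the difference between a linear functional and a general linear map.

A general linear map is a function $f: V\to W$ between vector spaces satisfying the following conditions for all $\ket{u},\ket{v}\in V$ and $\lambda\in\mathbb{F}$; a linear functional is the special case $W=\mathbb{F}$.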

\begin{itemize}
    \item $f(\ket{u}+\ket{v})=f(\ket{u})+f(\ket{v})$
    \item $f(\lambda \ket{v})=\lambda f(\ket{v})$
\end{itemize}


\begin{defn}
    \label{defn:bilinear_functional}
    A bilinear functional is a function $\beta:V\times W\to \mathbb{F}$ such that $\ket{v}\mapsto \beta(\ket{v},\ket{w})$ is a linear functional for every $\ket{w}\in W$ and $\ket{w}\mapsto \beta(\ket{v},\ket{w})$ is a linear functional for every $\ket{v}\in V$.

\end{defn}

The vector space of all bilinear functionals is denoted by $\mathcal{B}(V, W)$.


\begin{defn}
    \label{defn:tensor_product}
    Let $V, W$ be two vector spaces.

    Let $V'$ and $W'$ be the dual spaces of $V$ and $W$, respectively, that is, $V'=\{\psi:V\to \mathbb{F}\}$ and $W'=\{\phi:W\to \mathbb{F}\}$, where $\psi$ and $\phi$ range over linear functionals.

    The tensor product of vectors $v\in V$ and $w\in W$ is the bilinear functional $v\otimes w$ on $V'\times W'$ defined, for all $(\psi,\phi)\in V'\times W'$, by

    $$
        (v\otimes w)(\psi,\phi)=\psi(v)\phi(w).
    $$

    The tensor product of two vector spaces $V$ and $W$ is the vector space $V\otimes W=\mathcal{B}(V',W')$.

    Notice that, for finite-dimensional $V$ and $W$, a basis of this vector space is built from bases of $V$ and $W$: if $\{e_i\}_{i=1}^n$ is a basis of $V$ and $\{f_j\}_{j=1}^m$ is a basis of $W$, then $\{e_i\otimes f_j\}$ is a basis of $\mathcal{B}(V', W')$.

    That is, every element of $\mathcal{B}(V', W')$ can be written uniquely as a linear combination of the $e_i\otimes f_j$.

    Indeed, since $\{e_i\}$ and $\{f_j\}$ are bases of $V$ and $W$, respectively, we can always find dual bases of linear functionals $\{\phi_i\}\subset V'$ and $\{\psi_j\}\subset W'$ such that $\phi_i(e_j)=\delta_{ij}$ and $\psi_j(f_i)=\delta_{ij}$.

    Here $\delta_{ij}=\begin{cases}
            1 & \text{if } i=j   \\
            0 & \text{otherwise}
        \end{cases}$ is the Kronecker delta. A bilinear functional $\beta\in\mathcal{B}(V',W')$ then has coefficients $a_{ij}=\beta(\phi_i,\psi_j)$, so that

    $$
        V\otimes W=\left\{\sum_{i=1}^n \sum_{j=1}^m a_{ij}\, e_i\otimes f_j: a_{ij}\in \mathbb{F}\right\}.
    $$

\end{defn}

Note that $\sum_{i=1}^n \sum_{j=1}^m a_{ij}\, e_i\otimes f_j$ is a bilinear functional that maps $(\psi,\phi)\in V'\times W'$ to $\sum_{i=1}^n \sum_{j=1}^m a_{ij}\psi(e_i)\phi(f_j)\in\mathbb{F}$.

This gives a basis-free construction of a vector space with the desired product $\otimes$ and scalar multiplication.

\begin{examples}[Examples of tensor product for vectors]

    Let $V = \mathbb{C}^2, W = \mathbb{C}^3$, choose bases $\{\ket{0}, \ket{1}\} \subset V, \{\ket{0}, \ket{1}, \ket{2}\} \subset W$.

    $$
        v=\begin{pmatrix}
            v_1 \\
            v_2
        \end{pmatrix}=v_1\ket{0}+v_2\ket{1}\in V,w=\begin{pmatrix}
            w_1 \\
            w_2 \\
            w_3
        \end{pmatrix}=w_1\ket{0}+w_2\ket{1}+w_3\ket{2}\in W
    $$

    Then the tensor product $v\otimes w$ is given by

    $$
        v\otimes w=\begin{pmatrix}
            v_1 w_1 & v_1 w_2 & v_1 w_3 \\
            v_2 w_1 & v_2 w_2 & v_2 w_3
        \end{pmatrix}\in \mathbb{C}^{2\times 3}\cong\mathbb{C}^6
    $$
\end{examples}
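
The entries of this array can be recovered directly from Definition~\ref{defn:tensor_product}. The following check assumes that $\phi_i$ and $\psi_j$ denote the coordinate functionals dual to the chosen bases, so that $\phi_i(v)=v_i$ and $\psi_j(w)=w_j$; the numerical vectors are an arbitrary illustration.

\begin{examples}
    Evaluating $v\otimes w$ on the pair $(\phi_i,\psi_j)$ gives
    $$
        (v\otimes w)(\phi_i,\psi_j)=\phi_i(v)\psi_j(w)=v_i w_j,
        \qquad i\in\{1,2\},\ j\in\{1,2,3\},
    $$
    which is exactly the $(i,j)$ entry of the array above. For instance, with
    $$
        v=\begin{pmatrix}
            1 \\
            i
        \end{pmatrix},
        \qquad
        w=\begin{pmatrix}
            2 \\
            0 \\
            -1
        \end{pmatrix},
        \qquad
        v\otimes w=\begin{pmatrix}
            2  & 0 & -1 \\
            2i & 0 & -i
        \end{pmatrix},
    $$
    we get $(v\otimes w)(\phi_2,\psi_3)=\phi_2(v)\psi_3(w)=i\cdot(-1)=-i$, the $(2,3)$ entry.
\end{examples}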

\begin{examples}[Examples of tensor product for vector spaces]

    Let $V = \mathbb{C}^2, W = \mathbb{C}^3$, choose bases $\{\ket{0}, \ket{1}\} \subset V, \{\ket{0}, \ket{1}, \ket{2}\} \subset W.$

    Then a basis of the tensor product is
    $$
        \{
        \ket{00}, \ket{01}, \ket{02},
        \ket{10}, \ket{11}, \ket{12}
        \},
    $$
    where $\ket{ij} := \ket{i}\otimes\ket{j}$.

    An example element of $V \otimes W$ is
    $$
        \ket{\psi}
        =
        2\,\ket{0}\otimes\ket{1}
        +
        (1+i)\,\ket{1}\otimes\ket{0}
        -
        i\,\ket{1}\otimes\ket{2}.
    $$

    With respect to the ordered basis
    $$
        (\ket{00}, \ket{01}, \ket{02}, \ket{10}, \ket{11}, \ket{12}),
    $$
    this tensor corresponds to the coordinate vector
    $$
        \ket{\psi}
        \;\longleftrightarrow\;
        \begin{pmatrix}
            0   \\
            2   \\
            0   \\
            1+i \\
            0   \\
            -i
        \end{pmatrix}
        \in \mathbb{C}^6.
    $$

    Using the canonical identification
    $$
        \mathbb{C}^2 \otimes \mathbb{C}^3 \cong \mathbb{C}^{2\times 3},
    $$
    where
    $$
        \ket{i}\otimes\ket{j} \longmapsto E_{ij},
    $$
    the same tensor is represented by the matrix
    $$
        \ket{\psi}
        \;\longleftrightarrow\;
        \begin{pmatrix}
            0   & 2 & 0  \\
            1+i & 0 & -i
        \end{pmatrix}.
    $$

\end{examples}

\begin{defn}
    \label{defn:inner_product_on_tensor_product}

    The vector space defined by the tensor product is equipped with the unique inner product $\langle \cdot,\cdot\rangle_{V\otimes W}: (V\otimes W)\times (V\otimes W)\to \mathbb{F}$ satisfying

    $$
        \langle v\otimes w, u\otimes x\rangle_{V\otimes W}=\langle v,u\rangle_V\langle w,x\rangle_W
    $$
    for all $v,u\in V$ and $w,x\in W$.
\end{defn}

In practice, we ignore the subscript of the vector space and just write $\langle v\otimes w, u\otimes x\rangle=\langle v,u\rangle\langle w,x\rangle$.
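
As a small numerical check of this formula, the following sketch assumes the Hermitian inner product $\langle u,v\rangle=\sum_i\overline{u_i}v_i$ on $\mathbb{C}^n$ and the ordered basis $(\ket{00},\ket{01},\ket{02},\ket{10},\ket{11},\ket{12})$ of $\mathbb{C}^2\otimes\mathbb{C}^3$; the vectors are chosen only for illustration.

\begin{examples}
    Let
    $$
        v=\begin{pmatrix}1\\i\end{pmatrix},\quad
        u=\begin{pmatrix}1\\1\end{pmatrix}\in\mathbb{C}^2,
        \qquad
        w=\begin{pmatrix}1\\0\\0\end{pmatrix},\quad
        x=\begin{pmatrix}2\\0\\1\end{pmatrix}\in\mathbb{C}^3.
    $$
    Then $\langle v,u\rangle=\overline{1}\cdot 1+\overline{i}\cdot 1=1-i$ and $\langle w,x\rangle=2$, so
    $$
        \langle v\otimes w,u\otimes x\rangle=\langle v,u\rangle\langle w,x\rangle=2-2i.
    $$
    In coordinates, $v\otimes w\leftrightarrow(1,0,0,i,0,0)$ and $u\otimes x\leftrightarrow(2,0,1,2,0,1)$, and the Hermitian inner product on $\mathbb{C}^6$ gives the same value:
    $$
        \overline{1}\cdot 2+\overline{i}\cdot 2=2-2i.
    $$
\end{examples}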

\subsubsection{Partial trace}

\begin{defn}

    \label{defn:trace}

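    Let $T$ be a linear operator on $\mathscr{H}$, let $(e_1,e_2,\cdots,e_n)$ be a basis of $\mathscr{H}$, and let $(\epsilon_1,\epsilon_2,\cdots,\epsilon_n)$ be the corresponding dual basis of the dual space $\mathscr{H}^*$, so that $\epsilon_i(e_j)=\delta_{ij}$. Then the trace of $T$ is defined by

    $$
        \operatorname{Tr}(T)=\sum_{i=1}^n \epsilon_i(T(e_i)).
    $$

    If the basis is orthonormal, then $\epsilon_i=\langle e_i,\cdot\rangle$ and $\operatorname{Tr}(T)=\sum_{i=1}^n \langle e_i,T(e_i)\rangle$.

\end{defn}

This is equal to the sum of the diagonal entries of the matrix of $T$ in the basis $(e_1,\ldots,e_n)$.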

\begin{defn}
    \label{defn:partial_trace}

    Let $T$ be a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.

    Such an operator $T$ can be written as a finite sum

    $$
        T=\sum_{i=1}^n a_i A_i\otimes B_i
    $$

    where each $A_i$ is a linear operator on $\mathscr{A}$, each $B_i$ is a linear operator on $\mathscr{B}$, and $a_i\in\mathbb{C}$.

    The $\mathscr{B}$-partial trace $\operatorname{Tr}_{\mathscr{B}}:\mathcal{L}(\mathscr{A}\otimes \mathscr{B})\to \mathcal{L}(\mathscr{A})$ sends $T$ to the linear operator on $\mathscr{A}$ defined by

    $$
        \operatorname{Tr}_{\mathscr{B}}(T)=\sum_{i=1}^n a_i \operatorname{Tr}(B_i) A_i
    $$

\end{defn}
Alternatively, for each $v\in\mathscr{B}$ we can define the map $L_v: \mathscr{A}\to \mathscr{A}\otimes \mathscr{B}$ by

$$
    L_v(u)=u\otimes v
$$

Its adjoint $L_v^*:\mathscr{A}\otimes \mathscr{B}\to\mathscr{A}$ satisfies $\langle u,L_v^*(u'\otimes v')\rangle=\langle L_v(u),u'\otimes v'\rangle=\langle u\otimes v,u'\otimes v'\rangle=\langle u,u'\rangle \langle v,v'\rangle$.

Therefore, $L_v^*\sum_{j} u_j\otimes v_j=\sum_{j} \langle v,v_j\rangle u_j$.

Let $\{v_j\}$ be an orthonormal basis of $\mathscr{B}$. Then the partial trace of $T$ can also be defined by

$$
    \operatorname{Tr}_{\mathscr{B}}(T)=\sum_{j} L^*_{v_j}TL_{v_j}
$$
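
As a sanity check that the two descriptions of the partial trace agree, the following sketch works out a standard example. The vector $\ket{\Phi}$ and the choice $\mathscr{A}=\mathscr{B}=\mathbb{C}^2$ with orthonormal basis $\{\ket{0},\ket{1}\}$ are assumptions made only for this illustration.

\begin{examples}
    Let
    $$
        \ket{\Phi}=\frac{1}{\sqrt{2}}\left(\ket{0}\otimes\ket{0}+\ket{1}\otimes\ket{1}\right),
        \qquad
        T=\ket{\Phi}\bra{\Phi}
        =\frac{1}{2}\sum_{i,j=0}^{1}\ket{i}\bra{j}\otimes\ket{i}\bra{j}.
    $$
    Using the first definition and $\operatorname{Tr}(\ket{i}\bra{j})=\delta_{ij}$,
    $$
        \operatorname{Tr}_{\mathscr{B}}(T)
        =\frac{1}{2}\sum_{i,j=0}^{1}\operatorname{Tr}(\ket{i}\bra{j})\,\ket{i}\bra{j}
        =\frac{1}{2}\left(\ket{0}\bra{0}+\ket{1}\bra{1}\right)
        =\frac{1}{2}I.
    $$
    Using the second definition with $v_j=\ket{j}$, the formula for $L_v^*$ gives $L^*_{v_j}\ket{\Phi}=\frac{1}{\sqrt{2}}\ket{j}$, hence
    $$
        \sum_{j=0}^{1}L^*_{v_j}\ket{\Phi}\bra{\Phi}L_{v_j}
        =\sum_{j=0}^{1}\frac{1}{2}\ket{j}\bra{j}
        =\frac{1}{2}I,
    $$
    in agreement with the first computation.
\end{examples}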


\begin{defn}
    \label{defn:partial_trace_with_respect_to_state}
    Let $T$ be a linear operator on $\mathscr{H}=\mathscr{A}\otimes \mathscr{B}$, where $\mathscr{A}$ and $\mathscr{B}$ are finite-dimensional Hilbert spaces.

    Let $\rho$ be a state on $\mathscr{B}$ with orthonormal eigenbasis $\{v_j\}$ and corresponding eigenvalues $\{\lambda_j\}$.

    The partial trace of $T$ with respect to $\rho$ is the linear operator on $\mathscr{A}$ defined by

    $$
        \operatorname{Tr}_{\mathscr{B},\rho}(T)=\sum_{j} \lambda_j L^*_{v_j}TL_{v_j}
    $$
\end{defn}
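
To see what this definition does on a simple tensor, the following short derivation may help. It uses only the formula for $L_v^*$ above; the operators $A$ on $\mathscr{A}$ and $B$ on $\mathscr{B}$ are generic placeholders introduced for this illustration.

\begin{examples}
    For $T=A\otimes B$ and any $u\in\mathscr{A}$,
    $$
        L^*_{v_j}(A\otimes B)L_{v_j}u
        =L^*_{v_j}\left(Au\otimes Bv_j\right)
        =\langle v_j,Bv_j\rangle\,Au.
    $$
    Since $\rho=\sum_j\lambda_j\ket{v_j}\bra{v_j}$, we have $\operatorname{Tr}(\rho B)=\sum_j\lambda_j\langle v_j,Bv_j\rangle$, and therefore
    $$
        \operatorname{Tr}_{\mathscr{B},\rho}(A\otimes B)
        =\Bigl(\sum_j\lambda_j\langle v_j,Bv_j\rangle\Bigr)A
        =\operatorname{Tr}(\rho B)\,A.
    $$
    Formally replacing every $\lambda_j$ by $1$ recovers the ordinary partial trace $\operatorname{Tr}_{\mathscr{B}}(A\otimes B)=\operatorname{Tr}(B)\,A$.
\end{examples}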


These constructions lead to a mathematical model of quantum mechanics: non-commutative probability theory.

\section{Non-commutative probability theory}

The constructions above explain why tensor products and traces appear before probability is mentioned again: they are the algebraic devices that let composite quantum systems behave like probabilistic systems with marginals and expectations. This section packages these operations into the operator-theoretic language of states, observables, and expectation values, which is the setting used later for random quantum states and entropy.

Non-commutative probability theory is a branch of generalized probability theory that studies the probability of events in non-commutative algebras.

There are several main components of the generalized probability theory; let's see how we can formulate them by comparison with classical probability theory.

First, we define a Hilbert space, in case the reader, like me, did not take that step in linear algebra courses.

\begin{defn}
    \label{defn:Hilbert_space}
    Hilbert space:

    A Hilbert space is a complete inner product space.
\end{defn}

That is, a Hilbert space is a vector space equipped with an inner product such that the metric induced by the norm $\|x\|=\sqrt{\langle x,x\rangle}$ makes it a complete metric space. Recall that complete means that every Cauchy sequence converges to a limit in the space, where a sequence $(x_n)$ is Cauchy if for any $\epsilon>0$ there exists an $N$ such that $\|x_m-x_n\|<\epsilon$ for all $m,n\geq N$.

As a side note for later use, we also define a Borel measure on a space. Here we use the following definition, specialized to the spaces (manifolds) we are interested in.

\begin{defn}
    \label{defn:Borel_measure}
    Borel measure:

    Let $X$ be a topological space and let $\mathscr{B}(X)$ be the Borel $\sigma$-algebra of $X$, that is, the smallest collection of subsets of $X$ that contains every open set and satisfies the following properties:

    \begin{enumerate}
        \item $X \in \mathscr{B}(X)$.
        \item Closed under complements: if $A\in \mathscr{B}(X)$, then $A^c\in \mathscr{B}(X)$.
        \item Closed under countable unions: if $E_1,E_2,\cdots\in\mathscr{B}(X)$, then $\bigcup_{i=1}^\infty E_i\in\mathscr{B}(X)$.
    \end{enumerate}

    A Borel measure on $X$ is a map $\mu:\mathscr{B}(X)\to [0,\infty]$ such that $\mu(\emptyset)=0$ and $\mu$ is countably additive: if $E_1,E_2,\cdots$ are pairwise disjoint Borel sets, then $\mu(\bigcup_{i=1}^\infty E_i)=\sum_{i=1}^\infty \mu(E_i)$.
\end{defn}
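
A simple instance of this definition, included here only as an illustration, is the Dirac measure; the point $x_0$ is an arbitrary choice.

\begin{examples}
    Fix $x_0\in X$ and define, for every Borel set $E\in\mathscr{B}(X)$,
    $$
        \delta_{x_0}(E)=\begin{cases}
            1 & \text{if } x_0\in E \\
            0 & \text{otherwise.}
        \end{cases}
    $$
    Then $\delta_{x_0}(\emptyset)=0$. If $E_1,E_2,\cdots$ are pairwise disjoint Borel sets, then $x_0$ lies in at most one of them, so
    $$
        \delta_{x_0}\Bigl(\bigcup_{i=1}^\infty E_i\Bigr)=\sum_{i=1}^\infty \delta_{x_0}(E_i),
    $$
    where both sides equal $1$ if $x_0$ belongs to the union and $0$ otherwise. Hence $\delta_{x_0}$ is a Borel measure, and since $\delta_{x_0}(X)=1$ it is a probability measure.
\end{examples}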

In later sections, we will use the Lebesgue measure and the Haar measure in various circumstances; their detailed definitions may be given there.

\begin{examples}

    To introduce an example of Hilbert space we use when studying quantum mechanics, we need to introduce a common inner product used in $\mathbb{C}^n$.


    \begin{prop}
        \label{prop:Hermitian_inner_product_with_complex_vectorspace}
        The Hermitian inner product on the complex vector space $\C^n$ makes it a Hilbert space.
    \end{prop}

    \begin{proof}
        We first verify that the Hermitian inner product
        $$
            \langle u,v\rangle = \sum_{i=1}^n \overline{u_i} v_i
        $$
        on $\C^n$ satisfies the axioms of an inner product:
        \begin{enumerate}
            \item \textbf{Conjugate symmetry:} For all $u,v\in\C^n$,
                  $$
                      \langle u,v\rangle =\sum_{i=1}^n \overline{u_i} v_i=\overline{\sum_{i=1}^n \overline{v_i} u_i}=\overline{\langle v,u\rangle}.
                  $$
            \item \textbf{Linearity:} For any $u,v,w\in\C^n$ and scalars $a,b\in\C$, we have
                  $$
                      \langle u, av + bw\rangle = \sum_{i=1}^n \overline{u_i} (av_i + bw_i)=a\langle u,v\rangle + b\langle u,w\rangle.
                  $$
            \item \textbf{Positive definiteness:} For every $u=(u_1,u_2,\cdots,u_n)\in\C^n$, write $u_j=a_j+b_ji$, where $a_j,b_j\in\mathbb{R}$. Then
                  $$
                      \langle u,u\rangle = \sum_{j=1}^n \overline{u_j} u_j=\sum_{j=1}^n (a_j^2+b_j^2)\geq 0,
                  $$
                  with equality if and only if $u=0$.

                  Therefore, the Hermitian inner product is an inner product.
        \end{enumerate}

        Next, we show that $\C^n$ is complete with respect to the norm induced by this inner product:
        $$
            \|u\| = \sqrt{\langle u,u\rangle}.
        $$
        Since $\C^n$ is finite-dimensional, every Cauchy sequence (with respect to any norm) converges in $\C^n$. This is a standard result in finite-dimensional normed spaces, which implies that $\C^n$ is indeed complete.

        Therefore, since the Hermitian inner product fulfills the inner product axioms and $\C^n$ is complete, the complex vector space $\C^n$ with the Hermitian inner product is a Hilbert space.
    \end{proof}

\end{examples}

Another classical example of a Hilbert space is $L^2(\Omega, \mathscr{F}, P)$, where $(\Omega, \mathscr{F}, P)$ is a measure space ($\Omega$ is a set, $\mathscr{F}$ is a $\sigma$-algebra on $\Omega$, and $P$ is a measure on $\mathscr{F}$). The $L^2$ space is the space of all functions on $\Omega$ (identified when they agree almost everywhere) that are

\begin{enumerate}
    \item \textbf{square integrable}: square integrable functions are the functions $f:\Omega\to \mathbb{C}$ such that
          $$
              \int_\Omega |f(\omega)|^2 dP(\omega)<\infty
          $$
          with inner product defined by
          $$
              \langle f,g\rangle=\int_\Omega \overline{f(\omega)}g(\omega)dP(\omega)
          $$

    \item \textbf{complex-valued measurable}: a function $f=u+v i$ is complex-valued measurable if $u$ and $v$ are real-valued measurable functions.
\end{enumerate}

\begin{examples}


    \begin{prop}
        \label{prop:L2_space_is_a_Hilbert_space}
        $L^2(\Omega, \mathscr{F}, P)$ is a Hilbert space.
    \end{prop}

    \begin{proof}
        We check the two conditions of the Hilbert space:
        \begin{itemize}
            \item Completeness:
                  Let $(f_n)$ be a Cauchy sequence in $L^2(\Omega, \mathscr{F}, P)$. Then for any $\epsilon>0$, there exists an $N$ such that for all $m,n\geq N$, we have
                  $$
                      \int_\Omega |f_m(\omega)-f_n(\omega)|^2 dP(\omega)<\epsilon^2
                  $$
                  that is, $\|f_m-f_n\|<\epsilon$ in the norm of $L^2(\Omega, \mathscr{F}, P)$. By the Riesz--Fischer theorem, there exists $f\in L^2(\Omega, \mathscr{F}, P)$ such that $\|f_n-f\|\to 0$, so every Cauchy sequence converges.
            \item Inner product:
                  The inner product is defined by
                  $$
                      \langle f,g\rangle=\int_\Omega \overline{f(\omega)}g(\omega)dP(\omega)
                  $$
                  This is a well-defined inner product on $L^2(\Omega, \mathscr{F}, P)$. We can check the properties of the inner product:
                  \begin{itemize}
                      \item Linearity (in the second argument):
                            $$
                                \langle h,af+bg\rangle=a\langle h,f\rangle+b\langle h,g\rangle
                            $$
                      \item Conjugate symmetry:
                            $$
                                \langle f,g\rangle=\overline{\langle g,f\rangle}
                            $$
                      \item Positive definiteness:
                            $$
                                \langle f,f\rangle=\int_\Omega |f(\omega)|^2 dP(\omega)\geq 0
                            $$
                            with equality if and only if $f=0$ almost everywhere.
                  \end{itemize}
        \end{itemize}
    \end{proof}

\end{examples}

Let $\mathscr{H}$ be the Hilbert space of complex-valued functions on a finite set $\Omega=\{1,2,\ldots,n\}$, and let the functions $(e_1,e_2,\ldots,e_n)$ form an orthonormal basis of $\mathscr{H}$. (We use Dirac notation $|k\rangle$ to denote the basis vector $e_k$~\cite{parthasarathy1992quantum}.)

As an analog of the classical probability space $(\Omega,\mathscr{F},\mu)$, which consists of a sample space $\Omega$, a $\sigma$-algebra of events $\mathscr{F}$, and a probability measure $\mu$ on $\mathscr{F}$, the non-commutative probability space $(\mathscr{H},\mathscr{P},\rho)$ consists of a Hilbert space $\mathscr{H}$, the space $\mathscr{P}$ of all orthogonal projections on $\mathscr{H}$, and a state $\rho$ on $\mathscr{P}$.

The detailed definition of the non-commutative probability space is given below:

\begin{defn}
    \label{defn:non-commutative_probability_space}
    Non-commutative probability space:

    A non-commutative probability space is a pair $(\mathscr{B}(\mathscr{H}),\mathscr{P})$, where $\mathscr{B}(\mathscr{H})$ is the set of all \textbf{bounded} linear operators on $\mathscr{H}$ and $\mathscr{P}$ is the set of all orthogonal projections in $\mathscr{B}(\mathscr{H})$.

    A linear operator $A$ on $\mathscr{H}$ is \textbf{bounded} if there exists $M>0$ such that $\|Au\|\leq M$ for all $u$ with $\|u\|\leq 1$.

    Explicitly, $\mathscr{P}=\{P\in\mathscr{B}(\mathscr{H}):P^*=P=P^2\}$.
\end{defn}

Recall that in classical probability theory, we call the initial probability distribution over the possible outcomes the \textit{state}. Similarly, we need to define the \textit{state} in non-commutative probability theory.

\begin{defn}
    \label{defn:state}
    Non-commutative probability state:

    Given a non-commutative probability space $(\mathscr{B}(\mathscr{H}),\mathscr{P})$, a state is a unit vector $\ket{\psi}$ in the Hilbert space $\mathscr{H}$, that is, $\bra{\psi}\ket{\psi}=1$.

    Every state uniquely defines a map $\rho:\mathscr{P}\to[0,1]$, $\rho(P)=\bra{\psi}P\ket{\psi}=\operatorname{Tr}(\ket{\psi}\bra{\psi}P)$ (the operator $\ket{\psi}\bra{\psi}$ is commonly called the density operator of the state, and is also denoted by $\rho$) such that:
    \begin{itemize}
        \item $\rho(O)=0$, where $O$ is the zero projection, and $\rho(I)=1$, where $I$ is the identity projection.
        \item If $P_1,P_2,\ldots,P_n$ are pairwise orthogonal projections (that is, $P_iP_j=O$ for $i\neq j$), then $\rho(P_1 + P_2 + \cdots + P_n) = \sum_{i=1}^n \rho(P_i)$.
    \end{itemize}
\end{defn}

Note that pure states are the density operators that can be represented by a unit vector $\ket{\psi}$ in the Hilbert space $\mathscr{H}$, that is, $\rho=\ket{\psi}\bra{\psi}$, whereas mixed states are the density operators that cannot be represented by a single unit vector in $\mathscr{H}$.

If $(|\psi_1\rangle,|\psi_2\rangle,\cdots,|\psi_n\rangle)$ is an orthonormal basis of $\mathscr{H}$ consisting of eigenvectors of a density operator $\rho$, with eigenvalues $p_1,p_2,\cdots,p_n$, then $p_j\geq 0$ and $\sum_{j=1}^n p_j=1$.

We can write $\rho$ as
$$
    \rho=\sum_{j=1}^n p_j|\psi_j\rangle\langle\psi_j|
$$
(Under basis $|\psi_j\rangle$, it is a diagonal matrix with $p_j$ on the diagonal.)
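
For a concrete instance of this decomposition, consider the following example on $\mathbb{C}^2$. The weights $\tfrac{3}{4}$ and $\tfrac{1}{4}$ are an arbitrary choice, and the purity test $\operatorname{Tr}(\rho^2)=1$ is a standard criterion that is not otherwise used in this section.

\begin{examples}
    Let $\ket{0}=\begin{pmatrix}1\\0\end{pmatrix}$, $\ket{1}=\begin{pmatrix}0\\1\end{pmatrix}$, and
    $$
        \rho=\frac{3}{4}\ket{0}\bra{0}+\frac{1}{4}\ket{1}\bra{1}
        =\begin{pmatrix}
            \frac{3}{4} & 0           \\
            0           & \frac{1}{4}
        \end{pmatrix}.
    $$
    The eigenvalues $p_1=\frac{3}{4}$ and $p_2=\frac{1}{4}$ are non-negative and sum to $1$. Since
    $$
        \operatorname{Tr}(\rho^2)=\frac{9}{16}+\frac{1}{16}=\frac{5}{8}<1,
    $$
    $\rho$ is a mixed state, whereas the pure state $\ket{0}\bra{0}$ satisfies $\operatorname{Tr}\bigl((\ket{0}\bra{0})^2\bigr)=1$. For the projection $P=\ket{0}\bra{0}$ we get $\operatorname{Tr}(\rho P)=\frac{3}{4}$.
\end{examples}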

% Then we need to introduce a theorem that ensures that every state on the space of all orthogonal projections on $\mathscr{H}$ can be represented by a density operator.

% \begin{theorem}
% 	\label{theorem:Gleason's_theorem}
% 	Gleason's theorem (Theorem 1.1.15 in~\cite{parthasarathy2005mathematical})

%     Let $\mathscr{H}$ be a Hilbert space over $\mathbb{C}$ or $\mathbb{R}$ of dimension $n\geq 3$. Let $\mu$ be a state on the space $\mathscr{P}$ of projections on $\mathscr{H}$. Then there exists a unique density operator $\rho$ such that
%     $$
%     \mu(P)=\operatorname{Tr}(\rho P)
%     $$
%     for all $P\in\mathscr{P}$. $\mathscr{P}$ is the space of all orthogonal projections on $\mathscr{H}$.
% \end{theorem}

% This proof came from~\cite{parthasarathy2005mathematical}.

% \begin{proof}
% % TODO: FILL IN THE PROOF
% \end{proof}

% This theorem is a very important theorem in non-commutative probability theory; it states that any state on the space of all orthogonal projections on $\mathscr{H}$ can be represented by a density operator.

The counterpart of the random variable in non-commutative probability theory is called an observable, which is a Hermitian operator $A$ on $\mathscr{H}$ (for all $\psi,\phi$ in the domain of $A$, we have $\langle A\psi,\phi\rangle=\langle\psi,A\phi\rangle$). This condition ensures that the eigenvalues of $A$, which we interpret as measurement outcomes, are real numbers.

\begin{defn}
    \label{defn:observable}
    Observable:

    Let $\mathcal{B}(\mathbb{R})$ be the set of all Borel sets of $\mathbb{R}$.

    A (real-valued) observable (random variable) on the Hilbert space $\mathscr{H}$, denoted by $A$, is a projection-valued map (measure) $P_A:\mathcal{B}(\mathbb{R})\to\mathscr{P}(\mathscr{H})$ satisfying the following properties:
    \begin{itemize}
        \item $P_A(\emptyset)=O$ (the zero projection)
        \item $P_A(\mathbb{R})=I$ (the identity projection)
        \item For any sequence $E_1,E_2,\cdots,E_n\in \mathcal{B}(\mathbb{R})$, the following holds:
              \begin{itemize}
                  \item $P_A(\bigcup_{i=1}^n E_i)=\bigvee_{i=1}^n P_A(E_i)$
                  \item $P_A(\bigcap_{i=1}^n E_i)=\bigwedge_{i=1}^n P_A(E_i)$
                  \item $P_A(E^c)=I-P_A(E),\forall E\in\mathcal{B}(\mathbb{R})$
              \end{itemize}
    \end{itemize}

    Here $\bigvee$ and $\bigwedge$ denote the join and meet of projections, that is, the orthogonal projections onto the closed linear span and onto the intersection of the ranges, respectively.
\end{defn}

If $A$ is an observable determined by the map $P_A:\mathcal{B}(\mathbb{R})\to\mathcal{P}(\mathscr{H})$, then $P_A$ is a spectral measure (a completely additive orthogonal-projection-valued measure on $\mathcal{B}(\mathbb{R})$). Conversely, every spectral measure can be represented by an observable~\cite{parthasarathy2005mathematical}.

\begin{prop}
    If the Borel sets $E_1,\ldots,E_n$ are mutually disjoint, then the projections $P_A(E_j)$ are mutually orthogonal (that is, $P_A(E_i)P_A(E_j)=P_A(E_j)P_A(E_i)=O$ for $i\neq j$), and $P_A(\bigcup_{j=1}^n E_j)=\sum_{j=1}^n P_A(E_j)$.
\end{prop}

\begin{defn}
    \label{defn:probability_of_random_variable}
    Probability of a random variable:

    Let $A$ be a real-valued observable on a Hilbert space $\mathscr{H}$ and let $\rho$ be a state. The probability of observing an outcome in $E\in \mathcal{B}(\mathbb{R})$ is given by:

    $$
        \mu(E)=\operatorname{Tr}(\rho P_A(E))
    $$
\end{defn}

Restriction of a quantum state to a commutative subalgebra defines an ordinary probability measure.

\begin{examples}
    Let
    $$
        Z=\begin{pmatrix}
            1 & 0  \\
            0 & -1
        \end{pmatrix}.
    $$

    The eigenvalues of $Z$ are $+1$ and $-1$, with corresponding normalized eigenvectors

    $$
        \ket{0}=\begin{pmatrix}1\\0\end{pmatrix},
        \qquad
        \ket{1}=\begin{pmatrix}0\\1\end{pmatrix}.
    $$

    The spectral projections are
    $$
        P_Z(\{1\}) = \ket{0}\bra{0}
        =
        \begin{pmatrix}
            1 & 0 \\
            0 & 0
        \end{pmatrix},
        \qquad
        P_Z(\{-1\}) =  \ket{1}\bra{1}
        =
        \begin{pmatrix}
            0 & 0 \\
            0 & 1
        \end{pmatrix}.
    $$

    The associated projection-valued measure $P_Z$ satisfies
    $$
        P_Z(\{1,-1\}) = I,
        \qquad
        P_Z(\emptyset)=0.
    $$

    %==============================
    % 4. Example: X measurement and its PVM
    %==============================

    Let
    $$
        X=\begin{pmatrix}
            0 & 1 \\
            1 & 0
        \end{pmatrix}.
    $$

    The normalized eigenvectors of $X$ are
    $$
        \ket{+}=\frac{1}{\sqrt{2}}\left(\ket{0}+\ket{1}\right),
        \qquad
        \ket{-}=\frac{1}{\sqrt{2}}\left(\ket{0}-\ket{1}\right),
    $$
    with eigenvalues $+1$ and $-1$, respectively.

    The corresponding spectral projections are
    $$
        P_X(\{1\}) = \ket{+}\bra{+}
        =
        \frac{1}{2}
        \begin{pmatrix}
            1 & 1 \\
            1 & 1
        \end{pmatrix},
    $$
    $$
        P_X(\{-1\}) = \ket{-}\bra{-}
        =
        \frac{1}{2}
        \begin{pmatrix}
            1  & -1 \\
            -1 & 1
        \end{pmatrix}.
    $$

    %==============================
    % 5. Noncommutativity of the projections
    %==============================

    Compute
    $$
        P_Z(\{1\})P_X(\{1\})
        =
        \begin{pmatrix}
            1 & 0 \\
            0 & 0
        \end{pmatrix}
        \cdot
        \frac{1}{2}
        \begin{pmatrix}
            1 & 1 \\
            1 & 1
        \end{pmatrix}
        =
        \frac{1}{2}
        \begin{pmatrix}
            1 & 1 \\
            0 & 0
        \end{pmatrix}.
    $$

    On the other hand,
    $$
        P_X(\{1\})P_Z(\{1\})
        =
        \frac{1}{2}
        \begin{pmatrix}
            1 & 1 \\
            1 & 1
        \end{pmatrix}
        \cdot
        \begin{pmatrix}
            1 & 0 \\
            0 & 0
        \end{pmatrix}
        =
        \frac{1}{2}
        \begin{pmatrix}
            1 & 0 \\
            1 & 0
        \end{pmatrix}.
    $$

    Since
    $$
        P_Z(\{1\})P_X(\{1\}) \neq P_X(\{1\})P_Z(\{1\}),
    $$
    the projections do not commute.

    Let $\rho$ be a density operator on $\mathbb C^2$, i.e.
    $$
        \rho \ge 0,
        \qquad
        \operatorname{Tr}(\rho)=1.
    $$

    For a pure state $\ket{\psi}$, one has
    $$
        \rho = \ket{\psi}\bra{\psi}.
    $$

    The probability that a measurement associated with a PVM $P$ yields an outcome in a Borel set $A\in \mathcal{B}(\mathbb{R})$ is
    $$
        \mathbb P(A) = \operatorname{Tr}(\rho\, P(A)).
    $$

    For example, let
    $$
        \rho = \ket{0}\bra{0}
        =
        \begin{pmatrix}
            1 & 0 \\
            0 & 0
        \end{pmatrix}.
    $$

    Then
    $$
        \operatorname{Tr}\bigl(\rho\, P_Z(\{1\})\bigr) = 1,
        \qquad
        \operatorname{Tr}\bigl(\rho\, P_X(\{1\})\bigr) = \frac{1}{2}.
    $$

\end{examples}
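
In the same setting, one can check that each observable yields an ordinary probability distribution on its outcomes. The state $\rho=\ket{+}\bra{+}$ below is an additional choice made only for illustration; it reverses the roles of $Z$ and $X$ compared with the state $\ket{0}\bra{0}$ used above.

\begin{examples}
    Let
    $$
        \rho=\ket{+}\bra{+}=\frac{1}{2}
        \begin{pmatrix}
            1 & 1 \\
            1 & 1
        \end{pmatrix}.
    $$
    For the observable $Z$,
    $$
        \operatorname{Tr}\bigl(\rho\, P_Z(\{1\})\bigr)
        =\operatorname{Tr}\,\frac{1}{2}
        \begin{pmatrix}
            1 & 0 \\
            1 & 0
        \end{pmatrix}
        =\frac{1}{2},
        \qquad
        \operatorname{Tr}\bigl(\rho\, P_Z(\{-1\})\bigr)
        =\operatorname{Tr}\,\frac{1}{2}
        \begin{pmatrix}
            0 & 1 \\
            0 & 1
        \end{pmatrix}
        =\frac{1}{2},
    $$
    and the two probabilities sum to $1$. For the observable $X$, since $P_X(\{1\})=\rho$ and $\rho^2=\rho$,
    $$
        \operatorname{Tr}\bigl(\rho\, P_X(\{1\})\bigr)=\operatorname{Tr}(\rho)=1,
        \qquad
        \operatorname{Tr}\bigl(\rho\, P_X(\{-1\})\bigr)=0.
    $$
\end{examples}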

\begin{defn}
    \label{defn:measurement}
    Definition of measurement:

    A measurement (observation) of a system prepared in a given state produces an outcome $x$, a physical event belonging to the set $X$ of all possible outcomes. With each $x\in X$ we associate a measurement operator $M_x$ on $\mathscr{H}$.

    Given the initial state (a pure state, that is, a unit vector) $u$, the probability of the measurement outcome $x$ is given by:
    $$
        p(x)=\|M_xu\|^2
    $$

    Note that for this definition to make sense, the collection of measurement operators $\{M_x\}$ must satisfy the completeness requirement: for every unit vector $u$,
    $$
        1=\sum_{x\in X} p(x)=\sum_{x\in X}\|M_xu\|^2=\sum_{x\in X}\langle M_xu,M_xu\rangle=\langle u,(\sum_{x\in X}M_x^*M_x)u\rangle
    $$
    Since this holds for every unit vector $u$, we get $\sum_{x\in X}M_x^*M_x=I$.

\end{defn}
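
The simplest instance of this definition is a projective measurement in the computational basis of $\mathbb{C}^2$. The following sketch assumes $X=\{0,1\}$ and the operators $M_0,M_1$ below, which are chosen only for illustration.

\begin{examples}
    Let $M_0=\ket{0}\bra{0}$ and $M_1=\ket{1}\bra{1}$. Since each $M_x$ is an orthogonal projection, $M_x^*M_x=M_x$, and
    $$
        M_0^*M_0+M_1^*M_1=\ket{0}\bra{0}+\ket{1}\bra{1}=I,
    $$
    so the completeness requirement holds. For a unit vector $u=\alpha\ket{0}+\beta\ket{1}$ with $|\alpha|^2+|\beta|^2=1$,
    $$
        p(0)=\|M_0u\|^2=|\alpha|^2,
        \qquad
        p(1)=\|M_1u\|^2=|\beta|^2,
        \qquad
        p(0)+p(1)=1.
    $$
    In particular, for $u=\ket{+}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})$ both outcomes have probability $\frac{1}{2}$.
\end{examples}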


Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory} summarizes the analogy between classical probability theory and non-commutative (\textit{quantum}) probability theory~\cite{Feres}:

\begin{table}[H]
    \centering
    \renewcommand{\arraystretch}{1.5}
    \caption{Analogy between classical probability theory and non-commutative (\textit{quantum}) probability theory}
    \label{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory}
    {\small
        \begin{tabular}{|p{0.5\linewidth}|p{0.5\linewidth}|}
            \hline
            \textbf{Classical probability}                                                                                                                      & \textbf{Non-commutative probability}                                                                                                                      \\
            \hline
            Sample space $\Omega$, cardinality $\vert\Omega\vert=n$, example: $\Omega=\{0,1\}$                                                                  & Complex Hilbert space $\mathscr{H}$, dimension $\dim\mathscr{H}=n$, example: $\mathscr{H}=\mathbb{C}^2$                                                   \\
            \hline
            Commutative algebra of $\mathbb{C}$-valued functions on $\Omega$                                                                                    & Algebra of bounded operators $\mathcal{B}(\mathscr{H})$                                                                                                   \\
            \hline
            $f\mapsto \bar{f}$ complex conjugation                                                                                                              & $P\mapsto P^*$ adjoint                                                                                                                                    \\
            \hline
            Events: indicator functions of sets                                                                                                                 & Projections: space of orthogonal projections $\mathscr{P}\subseteq\mathcal{B}(\mathscr{H})$                                                               \\
            \hline
            functions $f$ such that $f^2=f=\overline{f}$                                                                                                        & orthogonal projections $P$ such that $P^*=P=P^2$                                                                                                          \\
            \hline
            $\mathbb{R}$-valued functions $f=\overline{f}$                                                                                                      & self-adjoint operators $A=A^*$                                                                                                                            \\
            \hline
            $\mathbb{I}_{f^{-1}(\{\lambda\})}$ is the indicator function of the set $f^{-1}(\{\lambda\})$                                                       & $P(\lambda)$ is the orthogonal projection to eigenspace                                                                                                   \\
            \hline
            $f=\sum_{\lambda\in \operatorname{Range}(f)}\lambda \mathbb{I}_{f^{-1}(\{\lambda\})}$                                                               & $A=\sum_{\lambda\in \operatorname{sp}(A)}\lambda P(\lambda)$                                                                                              \\
            \hline
            Probability measure $\mu$ on $\Omega$                                                                                                               & Density operator $\rho$ on $\mathscr{H}$                                                                                                                  \\
            \hline
            Delta measure $\delta_\omega$                                                                                                                       & Pure state $\rho=\vert\psi\rangle\langle\psi\vert$                                                                                                        \\
            \hline
            $\mu$ is non-negative measure and $\sum_{i=1}^n\mu(\{i\})=1$                                                                                        & $\rho$ is positive semi-definite and $\operatorname{Tr}(\rho)=1$                                                                                          \\
            \hline
            Expected value of random variable $f$ is $\mathbb{E}_{\mu}(f)=\sum_{i=1}^n f(i)\mu(\{i\})$                                                          & Expected value of operator $A$ is $\mathbb{E}_\rho(A)=\operatorname{Tr}(\rho A)$                                                                          \\
            \hline
            Variance of random variable $f$ is $\operatorname{Var}_\mu(f)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))^2\mu(\{i\})$                                    & Variance of operator $A$ is $\operatorname{Var}_\rho(A)=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2$                                          \\
            \hline
            Covariance of random variables $f$ and $g$ is $\operatorname{Cov}_\mu(f,g)=\sum_{i=1}^n (f(i)-\mathbb{E}_\mu(f))(g(i)-\mathbb{E}_\mu(g))\mu(\{i\})$ & Covariance of operators $A$ and $B$ is $\operatorname{Cov}_\rho(A,B)=\operatorname{Tr}(\rho A\circ B)-\operatorname{Tr}(\rho A)\operatorname{Tr}(\rho B)$ \\
            \hline
            Composite system is given by Cartesian product of the sample spaces $\Omega_1\times\Omega_2$                                                        & Composite system is given by tensor product of the Hilbert spaces $\mathscr{H}_1\otimes\mathscr{H}_2$                                                     \\
            \hline
            Product measure $\mu_1\times\mu_2$ on $\Omega_1\times\Omega_2$                                                                                      & Tensor product state $\rho_1\otimes\rho_2$ on $\mathscr{H}_1\otimes\mathscr{H}_2$                                                                         \\
            \hline
            Marginal distribution $\pi_*\mu$ (pushforward of $\mu$ under the projection $\pi:\Omega_1\times\Omega_2\to\Omega_1$)                                 & Partial trace $\operatorname{Tr}_2(\rho)$                                                                                                                 \\
            \hline
        \end{tabular}
    }
    \vspace{0.5cm}
\end{table}
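
To see how the expectation and variance rows of Table~\ref{tab:analog_of_classical_probability_theory_and_non_commutative_probability_theory} reduce to the classical formulas, consider the following sketch. The diagonal state and observable are chosen here only for illustration.

\begin{examples}
    Let $\mathscr{H}=\mathbb{C}^2$, $0\leq p\leq 1$, and
    $$
        \rho=\begin{pmatrix}p&0\\0&1-p\end{pmatrix},\qquad A=\begin{pmatrix}1&0\\0&-1\end{pmatrix}.
    $$
    Then $A^2=I$, so
    $$
        \mathbb{E}_\rho(A)=\operatorname{Tr}(\rho A)=p-(1-p)=2p-1,
        \qquad
        \operatorname{Var}_\rho(A)=\operatorname{Tr}(\rho A^2)-\operatorname{Tr}(\rho A)^2=1-(2p-1)^2=4p(1-p).
    $$
    These are exactly the mean and variance of a classical random variable $f$ on $\Omega=\{0,1\}$ with $f(0)=1$, $f(1)=-1$, and $\mu(\{0\})=p$.
\end{examples}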

\section{Manifolds}

Up to this point the emphasis has been algebraic and probabilistic. The concentration results used later, however, live naturally on curved spaces equipped with metrics and measures. For that reason the discussion now shifts from operator theory to manifold theory, starting with topological manifolds and then adding smooth and Riemannian structure until we can describe complex projective space as a genuine geometric state space.

This section introduces the basic definitions and theorems of manifold theory that are relevant to our study. We assume no prior knowledge of manifold theory, only a basic understanding of topology, and we give a brief definition and explanation of each term, moving from topological manifolds to Riemannian manifolds and related theorems.

\subsection{Manifolds}

\begin{defn}
    \label{defn:m-manifold}

    An $m$-manifold is a topological space $X$ that is

    \begin{enumerate}
        \item Hausdorff: any two distinct points in $X$ can be separated by two disjoint open sets.
        \item Second countable: $X$ has a countable basis.
        \item Locally Euclidean: every point $p\in X$ has an open neighborhood $U$ that is homeomorphic to an open subset of $\mathbb{R}^m$.
    \end{enumerate}
\end{defn}


\begin{examples}
    \label{example:second_countable_space}
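    Let $X=\mathbb{R}$ and $\mathcal{B}=\{(a,b)\mid a,b\in \mathbb{Q},a<b\}$ (the collection of all open intervals with rational endpoints).

    Since the rational numbers are countable, $\mathcal{B}$ is countable. Moreover, $\mathcal{B}$ is a basis for the standard topology, because every open interval is a union of open intervals with rational endpoints.

    So $\mathbb{R}$ is second countable.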

    Likewise, $\mathbb{R}^n$ is also second countable.
\end{examples}

\begin{examples}
    \label{example:manifold}
    A 1-manifold is a curve and a 2-manifold is a surface; for instance, the circle $S^1$ is a 1-manifold and the sphere $S^2$ is a 2-manifold.
\end{examples}

\begin{theorem}
    \label{Theorem of imbedded space}

    Whitney's Embedding Theorem:

    If $X$ is a compact $m$-manifold, then $X$ can be imbedded in $\mathbb{R}^n$ for some $n$.
\end{theorem}

The proof below comes from a topology course and uses one additional lemma:

\begin{lemma}
    \label{lemma:finite_partition_of_unity}

    Let $\{U_i\}_{i=1}^n$ be a finite open cover of a normal space $X$ (every pair of disjoint closed sets in $X$ can be separated by two disjoint open sets in $X$).

Then there exists a partition of unity dominated by $\{U_i\}_{i=1}^n$.
\end{lemma}

\begin{proof}

We prove Theorem~\ref{Theorem of imbedded space}. Since $X$ is a compact $m$-manifold, every $x\in X$ has an open neighborhood $U_x$ together with a homeomorphism $\varphi_x:U_x\to \varphi_x(U_x)\subseteq \mathbb{R}^m$ onto an open subset of $\mathbb{R}^m$.

The collection $\{U_x\}_{x\in X}$ is an open cover of $X$. Since $X$ is compact, there is a finite subcover $X=\bigcup_{i=1}^k U_{x_i}$. Write $U_i:=U_{x_i}$ and $\varphi_i:=\varphi_{x_i}$.

A compact Hausdorff space is normal, so Lemma~\ref{lemma:finite_partition_of_unity} applies and gives a partition of unity dominated by $\{U_i\}_{i=1}^k$, that is, a family of continuous functions $\phi_i:X\to[0,1]$ with $\operatorname{supp}(\phi_i)\subseteq U_i$ and $\sum_{i=1}^k\phi_i(x)=1$ for every $x\in X$.

Define $h_i:X\to \mathbb{R}^m$ by

$$
h_i(x)=\begin{cases}
\phi_i(x)\varphi_i(x) & \text{if }x\in U_i\\
0 & \text{otherwise}
\end{cases}
$$

We claim that $h_i$ is continuous by the pasting lemma.

On $U_i$, $h_i=\phi_i\varphi_i$ is the product of two continuous functions and is therefore continuous.

On $X-\operatorname{supp}(\phi_i)$, $h_i=0$ is continuous.

These two open sets cover $X$ because $\operatorname{supp}(\phi_i)\subseteq U_i$, and the two definitions agree on their intersection, where $\phi_i=0$. By the pasting lemma, $h_i$ is continuous.

Define

$$
F: X\to (\mathbb{R}^m\times \mathbb{R})^k
$$

where $x\mapsto (h_1(x),\phi_1(x),h_2(x),\phi_2(x),\dots,h_k(x),\phi_k(x))$.

We want to show that $F$ is an imbedding.

\begin{enumerate}

    \item $F$ is continuous


since each of its coordinate functions is continuous.

\item $F$ is injective

that is, if $F(x)=F(y)$, then $x=y$.

Comparing the coordinates of $F(x)$ and $F(y)$, we have

$h_1(x)=h_1(y), h_2(x)=h_2(y), \dots, h_k(x)=h_k(y)$.

And $\phi_1(x)=\phi_1(y), \phi_2(x)=\phi_2(y), \dots, \phi_k(x)=\phi_k(y)$.

Because $\sum_{i=1}^k \phi_i(x)=1$, there exists an index $i$ with $\phi_i(x)=\phi_i(y)>0$.

Therefore $x,y\in \operatorname{supp}(\phi_i)\subseteq U_i$.

By the definition of $h_i$, the equality $h_i(x)=h_i(y)$ reads $\phi_i(x)\varphi_i(x)=\phi_i(y)\varphi_i(y)$.

Dividing by the positive number $\phi_i(x)=\phi_i(y)$, we get $\varphi_i(x)=\varphi_i(y)$.

Therefore $x=y$, since $\varphi_i$ is a homeomorphism on $U_i$ and in particular injective.

\textit{In this proof, the coordinates $\phi_i$ locate a chart domain $U_i$ containing both points, and the coordinates $h_i$ then separate points inside that chart.}

\item  $F$ is a homeomorphism onto its image.

Note that if $f:X\to Y$ is continuous, $X$ is compact, and $Y$ is Hausdorff, then $f$ is a closed map.

$F:X\to F(X)$ is a continuous bijection from a compact space to a Hausdorff space, therefore $F$ is a closed map.

Hence for every closed set $C\subseteq X$, the preimage of $C$ under $F^{-1}$, namely $F(C)$, is closed in $F(X)$, so $F^{-1}:F(X)\to X$ is continuous.

Therefore $F$ is a homeomorphism onto $F(X)$, that is, an imbedding.
\end{enumerate}
\end{proof}

\subsection{Smooth manifolds and Lie groups}

This section is adapted from \cite{lee_introduction_2012}.

The topological definition of a manifold tells us what the space looks like locally, but not how to differentiate on it. The next step is therefore to add charts with smooth transition maps. Once this smooth structure is available, notions such as differentials, submersions, and group actions can be stated precisely, and these are exactly the tools needed later for the Hopf fibration.

\begin{defn}
    \label{defn:partial_derivative}

    Let $U\subseteq \mathbb{R}^n$ be open and $f:U\to \mathbb{R}^n$ be a map.

    For any $a=(a_1,\cdots,a_n)\in U$ and $j\in \{1,\cdots,n\}$, the $j$-th partial derivative of $f$ at $a$ is defined, when the limit exists, as

    $$
        \begin{aligned}
            \frac{\partial f}{\partial x_j}(a) & =\lim_{h\to 0}\frac{f(a_1,\cdots,a_j+h,\cdots,a_n)-f(a_1,\cdots,a_j,\cdots,a_n)}{h} \\
                                               & =\lim_{h\to 0}\frac{f(a+he_j)-f(a)}{h}
        \end{aligned}
    $$

\end{defn}

\begin{defn}
    \label{defn:continuously_differentiable_map}
    Let $U\subseteq \mathbb{R}^n$ be open and $f:U\to \mathbb{R}^n$ be a map.

    If for every $j\in \{1,\cdots,n\}$ the $j$-th partial derivative of $f$ exists in a neighborhood of $a\in U$ and is continuous at $a$, then $f$ is continuously differentiable at $a$.

    If for every $a\in U$ and every $j$, $\frac{\partial f}{\partial x_j}$ exists and is continuous at $a$, then $f$ is continuously differentiable on $U$, or a $C^1$ map. (Note that a $C^0$ map is just a continuous map.)
\end{defn}


\begin{defn}
    \label{defn:smooth_map}
    A function $f:U\to \mathbb{R}^n$ is smooth if it is of class $C^k$ for every $k\geq 0$ on $U$. Such a function is called a diffeomorphism (onto its image) if it is also a \textbf{bijection} onto its image and its \textbf{inverse is also smooth}.
\end{defn}
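
The smoothness of the inverse is a genuine extra condition. A standard illustration, added here as a sketch and not taken from the source, is the following.

\begin{examples}
    The map $f:\mathbb{R}\to\mathbb{R}$, $f(x)=x^3$, is smooth and bijective, but its inverse $f^{-1}(y)=y^{1/3}$ is not differentiable at $y=0$, since
    $$
        \lim_{h\to 0}\frac{f^{-1}(h)-f^{-1}(0)}{h}=\lim_{h\to 0}h^{-2/3}=\infty.
    $$
    Hence $f$ is a homeomorphism but not a diffeomorphism. In contrast, $g(x)=x^3+x$ satisfies $g'(x)=3x^2+1>0$ everywhere, so it is a smooth bijection of $\mathbb{R}$ whose inverse is smooth by the inverse function theorem, that is, a diffeomorphism.
\end{examples}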


\begin{defn}
    \label{defn:chart}

    Let $M$ be a topological $n$-manifold. A \textbf{chart} is a pair $(U,\varphi)$ where $U\subseteq M$ is an open subset and $\varphi:U\to \hat{U}\subseteq \mathbb{R}^n$ is a homeomorphism (a continuous bijection whose inverse is also continuous) onto an open subset $\hat{U}$.

    If $p\in U$ and $\varphi(p)=0$, then we say that $p$ is the origin of the chart $(U,\varphi)$ (equivalently, the chart is centered at $p$).

    For $p\in U$, the components of $\varphi(p)=(x_1(p),\cdots,x_n(p))\in\mathbb{R}^n$ are called the \textbf{local coordinates} of $p$ in the chart $(U,\varphi)$.

\end{defn}

\begin{defn}
    \label{defn:atlas}
    Let $M$ be a topological manifold. An \textbf{atlas} is a collection of charts $\mathcal{A}=\{(U_\alpha,\phi_\alpha)\}_{\alpha\in I}$ such that $M=\bigcup_{\alpha\in I} U_\alpha$.

    An atlas is said to be \textbf{smooth} if the transition maps $\phi_\alpha\circ \phi_\beta^{-1}:\phi_\beta(U_\alpha\cap U_\beta)\to \phi_\alpha(U_\alpha\cap U_\beta)$ are smooth for all $\alpha, \beta\in I$.
\end{defn}


\begin{defn}
    \label{defn:smooth_manifold}
    A smooth manifold is a pair $(M,\mathcal{A})$ where $M$ is a topological manifold and $\mathcal{A}$ is a smooth atlas.
\end{defn}
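
As a worked instance of the definitions of chart, atlas, and smooth manifold, the circle can be covered by two stereographic charts. The notation $\sigma_N,\sigma_S$ is introduced here for illustration only.

\begin{examples}
    Let $S^1=\{(x,y)\in\mathbb{R}^2:x^2+y^2=1\}$, $N=(0,1)$, $S=(0,-1)$, and define
    $$
        \sigma_N:S^1\setminus\{N\}\to\mathbb{R},\quad \sigma_N(x,y)=\frac{x}{1-y},
        \qquad
        \sigma_S:S^1\setminus\{S\}\to\mathbb{R},\quad \sigma_S(x,y)=\frac{x}{1+y}.
    $$
    Both maps are homeomorphisms, and the two domains cover $S^1$. On the overlap, where $x\neq 0$,
    $$
        \sigma_N(x,y)\,\sigma_S(x,y)=\frac{x^2}{1-y^2}=1,
    $$
    so the transition map is $\sigma_S\circ\sigma_N^{-1}(t)=1/t$ on $\mathbb{R}\setminus\{0\}$, which is smooth. Thus these two charts form a smooth atlas, and $S^1$ is a smooth $1$-manifold.
\end{examples}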

\begin{defn}
    \label{defn:differential}
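    Let $M$ and $N$ be smooth manifolds, and $f:M\to N$ be a smooth map. For each $p\in M$, the \textbf{differential} of $f$ at $p$ is the linear map
    $$
    df_p:T_pM\to T_{f(p)}N,\qquad df_p(v)(h)=v(h\circ f)\quad\text{for } v\in T_pM,\ h\in C^\infty(N),
    $$
    where tangent vectors are viewed as derivations, as in \cite{lee_introduction_2012}. In local coordinates, $df_p$ is represented by the Jacobian matrix of $f$ at $p$.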
\end{defn}

\begin{defn}
    \label{defn:smooth-submersion}
     A smooth map $f:M\to N$ is a \textbf{smooth submersion} if for each $p\in M$, the differential $df_p:T_pM\to T_{f(p)}N$ is surjective.

     Or equivalently $\operatorname{rank}(df_p)=\dim N$ for each $p\in M$.
\end{defn}

The following result, \cite[Theorem 4.26]{lee_introduction_2012}, will be helpful in later sections:

\begin{theorem}
    \label{theorem:local_section_theorem}

    Let $M$ and $N$ be smooth manifolds and let $\pi:M\to N$ be a smooth map. Then $\pi$ is a smooth submersion if and only if every point of $M$ is in the image of a smooth local section of $\pi$ (a local section of $\pi$ is a continuous map $\sigma:U\to M$ defined on some open subset $U\subseteq N$ with $\pi\circ \sigma=\mathrm{id}_U$).
\end{theorem}
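
The following sketch illustrates both the definition of a smooth submersion and the role of local sections. The two maps are chosen here for illustration and do not come from the source.

\begin{examples}
    The projection $\pi:\mathbb{R}^2\to\mathbb{R}$, $\pi(x,y)=x$, has differential $d\pi_{(x,y)}=\begin{pmatrix}1&0\end{pmatrix}$ of rank $1=\dim\mathbb{R}$ at every point, so it is a smooth submersion. Every point $(x_0,y_0)$ lies in the image of the smooth section $\sigma(t)=(t,y_0)$, which satisfies $\pi\circ\sigma=\mathrm{id}_{\mathbb{R}}$.

    In contrast, $f:\mathbb{R}\to\mathbb{R}$, $f(x)=x^2$, has $df_0=0$, so it is not a submersion. Consistently, no local section passes through $0$: a section $\sigma$ with $0$ in its image would have to be defined on an open neighborhood of $f(0)=0$, but that neighborhood contains negative numbers, which are not in the image of $f$.
\end{examples}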

\subsection{Riemannian manifolds}

Smooth manifolds still do not measure lengths, angles, or volumes. To connect the manifold side of the thesis with concentration of measure, we need a metric structure that turns local smooth data into global geometric data. Riemannian metrics provide exactly that extra layer, and Riemannian submersions will later transfer this geometry from spheres to complex projective spaces.

\begin{defn}
    \label{defn:riemannian-metric}

    Let $M$ be a smooth manifold. A \textit{\textbf{Riemannian metric}} on $M$ is a smooth covariant $2$-tensor field $g\in \mathcal{T}^2(M)$ such that for each $p\in M$, $g_p$ is an inner product on $T_pM$.

    In particular, $g_p$ is symmetric and $g_p(v,v)\geq 0$ for each $p\in M$ and each $v\in T_pM$, with equality if and only if $v=0$.

\end{defn}

\begin{defn}
    \label{defn:riemannian-submersion}
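    Suppose $(\tilde{M},\tilde{g})$ and $(M,g)$ are smooth Riemannian manifolds, and $\pi:\tilde{M}\to M$ is a smooth submersion. For each $x\in\tilde{M}$, let $H_x:=(\ker d\pi_x)^{\perp}\subseteq T_x\tilde{M}$ denote the horizontal space, that is, the $\tilde{g}$-orthogonal complement of the kernel of $d\pi_x$. Then $\pi$ is said to be a \textit{\textbf{Riemannian submersion}} if for each $x\in \tilde{M}$, the differential $d\pi_x:T_x\tilde{M}\to T_{\pi(x)}M$ restricts to a linear isometry from $H_x$ onto $T_{\pi(x)}M$.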

    In other words, $\tilde{g}_x(v,w)=g_{\pi(x)}(d\pi_x(v),d\pi_x(w))$ whenever $v,w\in H_x$.
\end{defn}
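
A minimal example of the definition, with Euclidean metrics chosen here purely for illustration:

\begin{examples}
    Equip $\tilde M=\mathbb{R}^2$ and $M=\mathbb{R}$ with their Euclidean metrics and let $\pi(x,y)=x$. At every point, the vertical space is $\ker d\pi=\operatorname{span}\{\partial_y\}$ and its orthogonal complement is $H=\operatorname{span}\{\partial_x\}$. For $v=a\,\partial_x$ and $w=b\,\partial_x$ in $H$,
    $$
        \tilde g(v,w)=ab=g(d\pi(v),d\pi(w)),
    $$
    so $\pi$ is a Riemannian submersion. If instead $\pi(x,y)=2x$, then $g(d\pi(v),d\pi(w))=4ab$, so this map is a smooth submersion but not a Riemannian one.
\end{examples}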

\begin{theorem}
    \label{theorem:riemannian-submersion}

    Let $(\tilde{M},\tilde{g})$ be a Riemannian manifold, let $\pi:\tilde{M}\to M$ be a surjective smooth submersion, and let $G$ be a group acting on $\tilde{M}$. If the \textbf{action} is
    \begin{enumerate}
        \item isometric: the map $x\mapsto \varphi\cdot x$ is an isometry for each $\varphi\in G$.
        \item vertical: every element $\varphi\in G$ takes each fiber to itself, that is $\pi(\varphi\cdot p)=\pi(p)$ for all $p\in \tilde{M}$.
        \item transitive on fibers: for each $p,q\in \tilde{M}$ such that $\pi(p)=\pi(q)$, there exists $\varphi\in G$ such that $\varphi\cdot p = q$.
    \end{enumerate}
    Then there is a unique Riemannian metric on $M$ such that $\pi$ is a Riemannian submersion.

\end{theorem}

\begin{proof}
    For each $p\in \tilde{M}$, let
    $$
        V_p:=\ker(d\pi_p)\subseteq T_p\tilde{M}
    $$
    be the vertical space, and let
    $$
        H_p:=V_p^{\perp_{\tilde g}}
    $$
    be its $\tilde g$-orthogonal complement (the horizontal space), so that
    $$
        T_p\tilde M = V_p\oplus H_p.
    $$
    Since $\pi$ is a smooth submersion, each $d\pi_p:T_p\tilde M\to T_{\pi(p)}M$ is surjective with kernel $V_p$, and therefore the restriction
    $$
        d\pi_p|_{H_p}:H_p\to T_{\pi(p)}M
    $$
    is a linear isomorphism.

    We first show that the group action preserves the horizontal distribution. Fix $\varphi\in G$. Since the action is vertical, we have
    $$
        \pi(\varphi\cdot x)=\pi(x)\qquad\text{for all }x\in \tilde M.
    $$
    Differentiating at $p$ gives
    $$
        d\pi_{\varphi\cdot p}\circ d\varphi_p = d\pi_p.
    $$
    Hence if $v\in V_p=\ker(d\pi_p)$, then
    $$
        d\pi_{\varphi\cdot p}(d\varphi_p v)=d\pi_p(v)=0,
    $$
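    so $d\varphi_p(V_p)\subseteq V_{\varphi\cdot p}$. Since $\varphi$ acts isometrically, $d\varphi_p$ is a linear isometry, in particular injective; as $\dim V_p=\dim V_{\varphi\cdot p}=\dim\tilde M-\dim M$, we get $d\varphi_p(V_p)=V_{\varphi\cdot p}$. A linear isometry preserves orthogonal complements. Therefore
    $$
        d\varphi_p(H_p)=H_{\varphi\cdot p}.
    $$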

    We now define a metric on $M$. Let $m\in M$, and choose any $p\in \pi^{-1}(m)$. For $u,v\in T_mM$, let $\tilde u,\tilde v\in H_p$ be the unique horizontal lifts satisfying
    $$
        d\pi_p(\tilde u)=u,\qquad d\pi_p(\tilde v)=v.
    $$
    Define
    $$
        g_m(u,v):=\tilde g_p(\tilde u,\tilde v).
    $$
    This is a symmetric bilinear form on $T_mM$, and it is positive definite because $\tilde g_p$ is positive definite on $H_p$ and $d\pi_p|_{H_p}$ is an isomorphism.

    It remains to show that this definition is independent of the choice of $p$ in the fiber. Suppose $p,q\in \pi^{-1}(m)$. By transitivity of the action on fibers, there exists $\varphi\in G$ such that $\varphi\cdot p=q$. Let $\tilde u_p,\tilde v_p\in H_p$ be the horizontal lifts of $u,v$ at $p$, and define
    $$
        \tilde u_q:=d\varphi_p(\tilde u_p),\qquad \tilde v_q:=d\varphi_p(\tilde v_p).
    $$
    By the previous paragraph, $\tilde u_q,\tilde v_q\in H_q$. Moreover,
    $$
        d\pi_q(\tilde u_q)
        =
        d\pi_q(d\varphi_p\tilde u_p)
        =
        d\pi_p(\tilde u_p)
        =
        u,
    $$
    and similarly $d\pi_q(\tilde v_q)=v$. Thus $\tilde u_q,\tilde v_q$ are exactly the horizontal lifts of $u,v$ at $q$. Since $\varphi$ is an isometry,
    $$
        \tilde g_q(\tilde u_q,\tilde v_q)
        =
        \tilde g_q(d\varphi_p\tilde u_p,d\varphi_p\tilde v_p)
        =
        \tilde g_p(\tilde u_p,\tilde v_p).
    $$
    Therefore $g_m(u,v)$ is independent of the chosen point $p\in \pi^{-1}(m)$, so $g$ is well defined on $M$.

    Next we prove that $g$ is smooth. Let $m_0\in M$. Since $\pi$ is a smooth submersion, there exists an open neighborhood $U\subseteq M$ of $m_0$ and a smooth local section
    $$
        s:U\to \tilde M
        \qquad\text{such that}\qquad
        \pi\circ s=\mathrm{id}_U.
    $$
    Over $s(U)$, the vertical bundle $V=\ker d\pi$ is a smooth subbundle of $T\tilde M$, and hence so is its orthogonal complement $H=V^\perp$. For each $x\in U$, the restriction
    $$
        d\pi_{s(x)}|_{H_{s(x)}}:H_{s(x)}\to T_xM
    $$
    is a linear isomorphism, and these isomorphisms depend smoothly on $x$. Thus they define a smooth vector bundle isomorphism
    $$
        d\pi|_H:H|_{s(U)}\to TU,
    $$
    whose inverse is also smooth.

    If $X,Y$ are smooth vector fields on $U$, define their horizontal lifts along $s$ by
    $$
        X_x^H:=\bigl(d\pi_{s(x)}|_{H_{s(x)}}\bigr)^{-1}(X_x),
        \qquad
        Y_x^H:=\bigl(d\pi_{s(x)}|_{H_{s(x)}}\bigr)^{-1}(Y_x).
    $$
    Then $X^H$ and $Y^H$ are smooth vector fields along $s(U)$, and by construction,
    $$
        g(X,Y)(x)=\tilde g_{s(x)}(X_x^H,Y_x^H).
    $$
    Since the right-hand side depends smoothly on $x$, it follows that $g$ is a smooth Riemannian metric on $M$.

    By construction, for every $p\in \tilde M$ and every $\tilde u,\tilde v\in H_p$,
    $$
        g_{\pi(p)}(d\pi_p\tilde u,d\pi_p\tilde v)=\tilde g_p(\tilde u,\tilde v).
    $$
    Thus $d\pi_p:H_p\to T_{\pi(p)}M$ is an isometry for every $p$, so $\pi:(\tilde M,\tilde g)\to (M,g)$ is a Riemannian submersion.

    Finally, uniqueness is immediate. Indeed, if $g'$ is another Riemannian metric on $M$ such that $\pi:(\tilde M,\tilde g)\to (M,g')$ is a Riemannian submersion, then for any $m\in M$, any $p\in \pi^{-1}(m)$, and any $u,v\in T_mM$, letting $\tilde u,\tilde v\in H_p$ denote the horizontal lifts of $u,v$, we must have
    $$
        g'_m(u,v)=\tilde g_p(\tilde u,\tilde v)=g_m(u,v).
    $$
    Hence $g'=g$.

    Therefore there exists a unique Riemannian metric on $M$ such that $\pi$ is a Riemannian submersion.
\end{proof}

\subsection{Hopf fibration}

The previous subsection gives the abstract mechanism for pushing a metric through a quotient map. The Hopf fibration is the concrete instance needed in this thesis: it explains why the geometry of the sphere descends to complex projective space, and therefore why concentration on spheres is relevant to the geometry of pure quantum states.

Some steps remain in showing how the metric on the sphere induces the metric on complex projective space. For now we only state the conclusions, so that we can continue the discussion:

\begin{itemize}
    \item Every nonzero vector can be normalized, so each pure state has a representative on the unit sphere
          $$
              S^{2n+1} \subset \mathbb C^{n+1}.
          $$
    \item Two unit vectors represent the same pure state exactly when they differ by a phase:
          $$
              z \sim e^{i\theta} z.
          $$
    \item Therefore
          $$
              \mathbb C P^n = S^{2n+1}/S^1.
          $$
\end{itemize}

\vspace{0.4em}
The quotient map
$$
    p:S^{2n+1}\to \mathbb C P^n, \qquad p(z)=[z]=\{\lambda z : \lambda \in \mathbb C^\times\},
$$
is the \textbf{Hopf fibration}.
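
As a sketch of how Theorem~\ref{theorem:riemannian-submersion} applies here, one can check its three hypotheses for the circle action $e^{i\theta}\cdot z=e^{i\theta}z$ on $S^{2n+1}$. That $p$ is a surjective smooth submersion is assumed in this sketch rather than proved.

\begin{itemize}
    \item Isometric: multiplication by $e^{i\theta}$ is a unitary map of $\mathbb{C}^{n+1}$, so it preserves the Euclidean inner product on $\mathbb{R}^{2n+2}$ and maps $S^{2n+1}$ to itself.
    \item Vertical: $p(e^{i\theta}z)=[e^{i\theta}z]=[z]=p(z)$.
    \item Transitive on fibers: if $z,w\in S^{2n+1}$ and $p(z)=p(w)$, then $w=\lambda z$ for some $\lambda\in\mathbb{C}^\times$, and
          $$
              1=\|w\|=|\lambda|\,\|z\|=|\lambda|,
          $$
          so $\lambda=e^{i\theta}$ for some $\theta$.
\end{itemize}

Under that assumption, the theorem yields a unique Riemannian metric on $\mathbb{C}P^n$ for which $p$ is a Riemannian submersion.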


The geometric picture is
$$
    S^{2n+1}
    \xrightarrow{\text{Hopf fibration}}
    \mathbb C P^n,
    \qquad
    \text{round metric}
    \rightsquigarrow
    \text{Fubini--Study metric}.
$$


The sphere $S^{2n+1}\subset \mathbb C^{n+1}$ has the \textbf{round metric}
$$
    g_{\mathrm{round}}=\sum_{j=0}^n (dx_j^2+dy_j^2)\big|_{S^{2n+1}},
$$
induced from the Euclidean metric on $\mathbb R^{2n+2}$.

In homogeneous coordinates $[z]\in\mathbb C P^n$, the \textbf{Fubini--Study metric} is
$$
    g_{FS}
    =
    \frac{\langle dz,dz\rangle \langle z,z\rangle-|\langle z,dz\rangle|^2}{\langle z,z\rangle^2}.
$$
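
To make the formula concrete, here is a worked computation for $n=1$ in an affine chart. The coordinate $w$ is notation introduced for this sketch.

\begin{examples}
    Take $n=1$ and use the affine chart $z=(1,w)$ with $w\in\mathbb{C}$. Then $dz=(0,dw)$ and
    $$
        \langle z,z\rangle=1+|w|^2,\qquad \langle dz,dz\rangle=|dw|^2,\qquad |\langle z,dz\rangle|^2=|w|^2|dw|^2,
    $$
    so
    $$
        g_{FS}=\frac{|dw|^2(1+|w|^2)-|w|^2|dw|^2}{(1+|w|^2)^2}=\frac{|dw|^2}{(1+|w|^2)^2}.
    $$
    Since $4|dw|^2/(1+|w|^2)^2$ is the round metric of the unit $2$-sphere in stereographic coordinates, $g_{FS}$ is one quarter of it, that is, the round metric of a sphere of radius $\tfrac12$ on $\mathbb{C}P^1\cong S^2$.
\end{examples}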

\section{Quantum physics and terminologies}

The geometric discussion above identifies the right state space, but the thesis ultimately studies physical observables on that space. We now return to quantum terminology and translate the geometric objects into the language of states, measurements, Haar sampling, and reduced density matrices. This is the point where the manifold picture and the operator picture meet.

In this section, we will introduce some terminologies and theorems used in quantum physics that are relevant to our study. Assuming no prior knowledge of quantum physics, we will provide brief definitions and explanations for each term.

One might ask what the fundamental difference between a quantum system and a classical system is, and why theorems for classical computers cannot be applied directly to a quantum computer. It turns out that quantum error-correcting codes are hard because of the following definitions and features of quantum computing.

\begin{defn}
    All quantum operations can be constructed by composing four kinds of transformations: (adapted from Chapter 10 of \cite{Bengtsson_Zyczkowski_2017})

    \begin{enumerate}
        \item Unitary operations $U(\cdot)$, which can be applied to any quantum state. It is possible to apply a non-unitary operation to an open quantum system, but that is usually not the focus in quantum computing and usually leads to a non-recoverable loss of the information that we wish to obtain.
        \item Extending the system. Given a quantum state $\rho\in\mathcal{H}^N$, we can extend it to a larger quantum system by ``entangling'' it (for this report, you don't need to worry about how quantum entanglement works) with a new state $\sigma\in \mathcal{H}^K$ (the space where the new state lives is usually called the ancilla system) to get $\rho'=\rho\otimes\sigma\in \mathcal{H}^N\otimes \mathcal{H}^K$.
        \item Partial trace. Given a quantum state on a composite system $\mathcal{H}^N\otimes\mathcal{H}^K$, we can trace out one subsystem and get a reduced state on the remaining subsystem.
        \item Selective measurement. Given a quantum state, we measure it and get a classical bit; unlike the classical case, the measurement is a probabilistic operation. (More specifically, this is a projection onto a reference state corresponding to a classical bit output. For this report, you don't need to worry about how such a result is obtained and how the reference state is constructed.)
    \end{enumerate}
\end{defn}


$U(n)$ is the group of all $n\times n$ \textbf{unitary matrices} over $\mathbb{C}$,

$$
    U(n)=\{A\in \mathbb{C}^{n\times n}: A^*A=AA^*=I_n\}
$$

The natural probability measure on $U(n)$ is the Haar measure, whose uniqueness is stated in the lemma below~\cite{Elizabeth_book}.

\begin{lemma}
    \label{lemma:haar_measure}

    Let $(U(n), \| \cdot \|, \mu)$ be a metric measure space where $\| \cdot \|$ is the Hilbert-Schmidt norm and $\mu$ is a Borel probability measure.

    The Haar measure on $U(n)$ is the unique probability measure that is invariant under the action of $U(n)$ on itself.

    That is, for every Borel set $E\subseteq U(n)$ and every $A\in U(n)$, $\mu(AE)=\mu(EA)=\mu(E)$.

\end{lemma}
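
The simplest case, $n=1$, can be worked out by hand. The following sketch is an illustration added here and is not part of the cited lemma.

\begin{examples}
    For $n=1$, $U(1)=\{e^{i\theta}:\theta\in[0,2\pi)\}$ is the unit circle, and the Haar measure is normalized arc length:
    $$
        \mu(E)=\frac{1}{2\pi}\int_0^{2\pi}\mathbb{I}_E(e^{i\theta})\,d\theta.
    $$
    For $A=e^{i\alpha}$, we have $\mathbb{I}_{AE}(e^{i\theta})=\mathbb{I}_E(e^{i(\theta-\alpha)})$, and the $2\pi$-periodicity of the integrand gives
    $$
        \mu(AE)=\frac{1}{2\pi}\int_0^{2\pi}\mathbb{I}_E(e^{i(\theta-\alpha)})\,d\theta=\frac{1}{2\pi}\int_0^{2\pi}\mathbb{I}_E(e^{i\theta})\,d\theta=\mu(E).
    $$
    Since $U(1)$ is abelian, $\mu(EA)=\mu(AE)=\mu(E)$ as well.
\end{examples}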


A finite-dimensional quantum system is modeled by a complex Hilbert space (a complete inner product space)
$$
    \mathcal H \cong \mathbb C^{n+1}.
$$

A \textbf{pure state} is represented by a unit vector
$$
    \psi \in \mathcal H, \qquad \|\psi\|=1.
$$

A \textbf{mixed state} is represented by a density matrix
$$
    \rho=\sum_{j=1}^m p_j|\psi_j\rangle\langle\psi_j|, \qquad \sum_{j=1}^m p_j=1, \qquad p_j\geq 0,
$$
where each $\psi_j\in\mathcal H$ is a unit vector.
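
A short worked example of a mixed state, with the basis vectors $e_0,e_1,e_\pm$ introduced here for illustration:

\begin{examples}
    In $\mathbb{C}^2$ with standard basis $e_0,e_1$, let $e_\pm=(e_0\pm e_1)/\sqrt2$. Then
    $$
        \tfrac12|e_0\rangle\langle e_0|+\tfrac12|e_1\rangle\langle e_1|=\tfrac12 I=\tfrac12|e_+\rangle\langle e_+|+\tfrac12|e_-\rangle\langle e_-|,
    $$
    so the decomposition of a mixed state into pure states is not unique. Moreover $\operatorname{Tr}\bigl((\tfrac12 I)^2\bigr)=\tfrac12<1$, whereas every pure state $\rho=|\psi\rangle\langle\psi|$ satisfies $\operatorname{Tr}(\rho^2)=1$.
\end{examples}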

Some key comparisons between pure states and mixed states:

\begin{itemize}
    \item Pure states describe maximal information; mixed states describe probabilistic mixtures or partial information.
    \item Pure states form a curved geometric space; mixed states form a convex set inside the space of matrices.
\end{itemize}

Pure states live in complex projective space:

\begin{itemize}
    \item Two nonzero vectors that differ by a nonzero complex scalar represent the same physical state:
          $$
              \psi \sim \lambda \psi, \qquad \lambda \in \mathbb C^\times.
          $$
    \item In particular, multiplying by a phase $e^{i\theta}$ does not change any physical predictions.
    \item Therefore the physical pure state is not a single vector, but the \emph{complex line} spanned by that vector.
\end{itemize}

Hence the space of pure states (denoted by $\mathcal{P}(\mathcal H)$) is
$$
    \mathcal{P}(\mathcal H)
    =
    (\mathcal H \setminus \{0\})/\mathbb C^\times.
$$

After choosing a basis $\mathcal H \cong \mathbb C^{n+1}$, this becomes
$$
    \mathcal{P}(\mathcal H) \cong \mathbb C P^n.
$$


\begin{prop}
    \label{prop:indistinguishability}
    Proposition of indistinguishability:

    Suppose that we have two states (unit vectors) $u_1,u_2\in \mathscr{H}$. The two states are perfectly distinguishable by a measurement if and only if they are orthogonal.
\end{prop}

\begin{proof}
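    Suppose first that $u_1$ and $u_2$ are orthogonal. They can be distinguished as follows:
    \begin{enumerate}
        \item Set $X=\{0,1,2\}$, $M_i=|u_i\rangle\langle u_i|$ for $i=1,2$, and $M_0=I-M_1-M_2$.
        \item Since $u_1\perp u_2$, the operators $M_0,M_1,M_2$ are orthogonal projections with $M_0+M_1+M_2=I$, so $\{M_0,M_1,M_2\}$ is a complete collection of measurement operators on $\mathscr{H}$.
        \item Suppose the prepared state is $u_1$, then $p(1)=\|M_1u_1\|^2=\|u_1\|^2=1$, $p(2)=\|M_2u_1\|^2=0$, $p(0)=\|M_0u_1\|^2=0$. Symmetrically, if the prepared state is $u_2$, then $p(2)=1$ and $p(0)=p(1)=0$, so the outcome identifies the state with certainty.
    \end{enumerate}

    Conversely, if they are not orthogonal, then there is no choice of measurement operators that perfectly distinguishes the two states.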

\end{proof}
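
The converse direction in Proposition~\ref{prop:indistinguishability} can be sketched as follows. The set of outcomes $X_1$ and the operator $E_1$ are notation introduced here for the argument, not taken from the source.

Suppose a complete collection $\{M_x\}_{x\in X}$ perfectly distinguishes $u_1$ from $u_2$: there is a set of outcomes $X_1\subseteq X$ that occurs with probability $1$ when the state is $u_1$ and with probability $0$ when the state is $u_2$. Put $E_1:=\sum_{x\in X_1}M_x^*M_x$, so that $0\leq E_1\leq I$ and
$$
    \langle u_1,E_1u_1\rangle=1,\qquad \langle u_2,E_1u_2\rangle=0.
$$
Because $I-E_1\geq 0$ and $\langle u_1,(I-E_1)u_1\rangle=0$, we get $(I-E_1)u_1=0$, that is, $E_1u_1=u_1$. Likewise, $E_1\geq 0$ and $\langle u_2,E_1u_2\rangle=0$ give $E_1u_2=0$. Since $E_1$ is self-adjoint,
$$
    \langle u_2,u_1\rangle=\langle u_2,E_1u_1\rangle=\langle E_1u_2,u_1\rangle=0,
$$
so perfectly distinguishable states must be orthogonal.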

Intuitively, if the two states are not orthogonal, then for any measurement (projection) there is a nonzero probability of getting the same outcome for both states.

\subsection{Random quantum states}

The preceding material identifies the spaces and symmetries of quantum states. The next step is probabilistic: once Haar invariance is available, we can speak about random pure states and random mixed states in a way that matches the geometric viewpoint developed earlier. These definitions are the starting point for the concentration statements proved in the next chapter.

First, we need to define what a random state in a bipartite system is.


\begin{defn}
    \label{defn:random_pure_state}
    Pure state:

    A random pure state $\varphi$ is any random variable distributed according to the unitarily invariant probability measure on the pure states $\mathcal{P}(A)$ of the system $A$, denoted by $\varphi\in_R\mathcal{P}(A)$.
\end{defn}


For the space of pure states, the unitarily invariant probability measure is easy to describe: it is induced by the Haar measure and can be realized by sampling unit vectors uniformly on a sphere $S^n$ for a suitable $n$. The case of mixed states is more complicated, and we use the partial trace to define rank-$s$ random states.

\begin{defn}
    \label{defn:rank_s_random_state}
    Rank-$s$ random state.

    For a system $A$ and an integer $s\geq 1$, consider the distribution on the mixed states $\mathcal{S}(A)$ of $A$ induced by the partial trace over the second factor from the uniform distribution on pure states of $A\otimes\mathbb{C}^s$. Any random variable $\rho$ distributed in this way is called a rank-$s$ random state, denoted by $\rho\in_R \mathcal{S}_s(A)$. In particular, $\mathcal{P}(A)=\mathcal{S}_1(A)$.
\end{defn}
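
To see how the partial trace of a pure state on $A\otimes\mathbb{C}^s$ produces a mixed state on $A$, consider the following deterministic sketch. The specific vector $\Phi$ is chosen here for illustration and is not a random state.

\begin{examples}
    Let $A=\mathbb{C}^2$ and $s=2$, with standard basis $e_0,e_1$ on both factors, and take the unit vector
    $$
        \Phi=\tfrac{1}{\sqrt2}(e_0\otimes e_0+e_1\otimes e_1)\in A\otimes\mathbb{C}^2 .
    $$
    Then
    $$
        |\Phi\rangle\langle\Phi|=\tfrac12\sum_{i,j\in\{0,1\}}|e_i\rangle\langle e_j|\otimes|e_i\rangle\langle e_j|,
        \qquad
        \operatorname{Tr}_2\bigl(|\Phi\rangle\langle\Phi|\bigr)=\tfrac12\sum_{i,j\in\{0,1\}}\langle e_j,e_i\rangle\,|e_i\rangle\langle e_j|=\tfrac12 I,
    $$
    which is a mixed state of rank $2$ on $A$. A product vector $\psi\otimes\chi$ with unit vectors $\psi,\chi$ instead gives $\operatorname{Tr}_2\bigl(|\psi\otimes\chi\rangle\langle\psi\otimes\chi|\bigr)=|\psi\rangle\langle\psi|$, which has rank $1$.
\end{examples}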


% When compiled standalone, print this chapter's references at the end.
\ifSubfilesClassLoaded{
    \printbibliography[title={References}]
}

\end{document}