# CSE559A Lecture 3
## Image formation
### Degrees of Freedom
$$
x=K[R|t]X
$$
$$
w\begin{bmatrix}
x\\
y\\
1
\end{bmatrix}
=
\begin{bmatrix}
\alpha & s & u_0 \\
0 & \beta & v_0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
r_{11} & r_{12} & r_{13} &t_x\\
r_{21} & r_{22} & r_{23} &t_y\\
r_{31} & r_{32} & r_{33} &t_z\\
\end{bmatrix}
\begin{bmatrix}
x\\
y\\
z\\
1
\end{bmatrix}
$$
The projection matrix $K[R|t]$ has 11 degrees of freedom: 5 intrinsic parameters in $K$ ($\alpha$, $\beta$, $s$, $u_0$, $v_0$), 3 for the rotation $R$, and 3 for the translation $t$.
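A minimal numpy sketch of this projection; the intrinsics and pose below are illustrative assumptions, not values from the lecture:
```python
import numpy as np

# illustrative intrinsics: focal lengths, zero skew, principal point (u0, v0)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                        # camera aligned with the world axes
t = np.array([0.0, 0.0, 5.0])        # camera shifted along z
Rt = np.hstack([R, t[:, None]])      # 3x4 extrinsic matrix [R|t]

X = np.array([1.0, 2.0, 10.0, 1.0])  # homogeneous world point
p = K @ Rt @ X                       # w * [x, y, 1]
x, y = p[:2] / p[2]                  # divide by w to get pixel coordinates
```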
### Impact of camera translation
$$
p=K[R|t]\begin{bmatrix}
x\\
y\\
z\\
0
\end{bmatrix}=KR\begin{bmatrix}
x\\
y\\
z\\
\end{bmatrix}
$$
The projection of a point at infinity (a vanishing point) is invariant to camera translation, since its fourth homogeneous coordinate is $0$ and the translation column drops out.
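A quick numerical check of this invariance (all values are illustrative assumptions; note that the direction vector's fourth coordinate is zero):
```python
import numpy as np

K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
d = np.array([1.0, 0.0, 1.0, 0.0])   # point at infinity: last coordinate is 0

for t in (np.zeros(3), np.array([3.0, -2.0, 7.0])):   # two different translations
    Rt = np.hstack([R, t[:, None]])
    p = K @ Rt @ d
    print(p[:2] / p[2])              # same pixel either way
```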
### Recover world coordinates from pixel coordinates
$$
w\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}=K[R|t]X
$$
Key issue: the homogeneous scale $w$ (the projective depth) is unknown. Suppose $w=1/s$, then
$$
\begin{aligned}
\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}
&=sK[R|t]X\\
K^{-1}\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}
&=s[R|t]X\\
R^{-1}K^{-1}\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}&=s[I|R^{-1}t]X\\
R^{-1}K^{-1}\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}&=[I|R^{-1}t]sX\\
R^{-1}K^{-1}\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}&=sX+sR^{-1}t\\
\frac{1}{s}R^{-1}K^{-1}\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}-R^{-1}t&=X\\
\end{aligned}
$$
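A sketch of the final line, assuming $K$, $R$, $t$, and the scale $s$ (equivalently the depth $w=1/s$) are all known; the numeric values are illustrative:
```python
import numpy as np

def pixel_to_world(u, v, s, K, R, t):
    """X = (1/s) * R^{-1} K^{-1} [u, v, 1]^T - R^{-1} t"""
    uv1 = np.array([u, v, 1.0])
    R_inv = np.linalg.inv(R)
    return (1.0 / s) * R_inv @ np.linalg.inv(K) @ uv1 - R_inv @ t

# round trip: project a point, then recover it from its pixel coordinates
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
X = np.array([1.0, 2.0, 10.0])
p = K @ (R @ X + t)                              # w * [u, v, 1]
u, v, w = p[0] / p[2], p[1] / p[2], p[2]
print(pixel_to_world(u, v, 1.0 / w, K, R, t))    # -> [ 1.  2. 10.]
```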
## Projective Geometry
### Orthographic Projection
Special case of perspective projection when $f\to\infty$
- Distance to the center of projection is infinite
- Also called parallel projection
- Projection matrix is
$$
w\begin{bmatrix}
u\\
v\\
1
\end{bmatrix}=
\begin{bmatrix}
f & 0 & 0 & 0\\
0 & f & 0 & 0\\
0 & 0 & 0 & s\\
\end{bmatrix}
\begin{bmatrix}
x\\
y\\
z\\
1
\end{bmatrix}
$$
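A tiny sketch applying this matrix ($f$ and $s$ are illustrative assumptions); the depth $z$ never influences the resulting pixel coordinates:
```python
import numpy as np

f, s = 100.0, 2.0
P = np.array([[f, 0, 0, 0],
              [0, f, 0, 0],
              [0, 0, 0, s]])

for z in (1.0, 10.0, 1000.0):                # vary the depth
    p = P @ np.array([1.0, 2.0, z, 1.0])
    print(p[:2] / p[2])                      # same (u, v) regardless of z
```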
To be continued later in the course.
## Image processing foundations
### Motivation for image processing
Representational Motivation:
- We need more than raw pixel values
Computational Motivation:
- Many image processing operations must be run across many locations in an image
- A loop in Python is slow
- High-level libraries reduce errors, developer time, and algorithm runtime
- Two common libraries:
- Torch+Torchvision: Focus on deep learning
- scikit-image: Focus on classical image processing algorithms
### Operations on images
#### Point operations
Operations that are applied to one pixel at a time
Negative image
$$
I_{neg}(x,y)=L-1-I(x,y)
$$
- $L$ is the number of intensity levels (e.g. $L=256$ for an 8-bit image)
Power law transformation:
$$
I_{out}(x,y)=cI(x,y)^{\gamma}
$$
- $c$ is a constant
- $\gamma$ is the gamma value
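A short numpy sketch of these two point operations on an 8-bit image (the array and the constants are illustrative assumptions):
```python
import numpy as np

image = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
L = 256                                           # number of intensity levels

negative = (L - 1) - image.astype(np.float64)     # negative image

c, gamma = 1.0, 0.5                               # power law (gamma) transform
normalized = image.astype(np.float64) / (L - 1)
power_law = (L - 1) * c * normalized ** gamma
```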
Contrast stretching
Use a function to stretch the range of pixel values:
$$
I_{out}(x,y)=f(I(x,y))
$$
- $f$ is a function that stretches the range of pixel values
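One common choice of $f$ is a linear min-max stretch; a minimal sketch, assuming an 8-bit output range:
```python
import numpy as np

def stretch_contrast(image, out_min=0.0, out_max=255.0):
    """Linearly map [image.min(), image.max()] onto [out_min, out_max]."""
    img = image.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) * (out_max - out_min) + out_min
```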
Image histogram
- The histogram of an image plots the frequency of each pixel value
Limitations:
- No spatial information
- No information about the relationship between pixels
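Computing a histogram with numpy (a sketch for an 8-bit image; the array is illustrative):
```python
import numpy as np

image = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
counts, bin_edges = np.histogram(image, bins=256, range=(0, 256))
# counts[v] is the number of pixels with value v; note it says nothing about
# where those pixels are in the image (no spatial information)
```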
#### Linear filtering in spatial domain
Operations that are applied to a neighborhood at each position
Used to:
- Enhance image features
- Denoise, sharpen, resize
- Extract information about image structure
- Edge detection, corner detection, blob detection
- Detect image patterns
- Template matching
- Convolutional Neural Networks
Image filtering
Take the dot product of the kernel with the image patch centered at each position (cross-correlation):
$$
h[m,n]=\sum_{k}\sum_{l}g[k,l]\,f[m+k,n+l]
$$
where the sums run over the support of the kernel $g$.
```python
import numpy as np

def filter2d(image, kernel):
    """
    Apply a 2D filter (cross-correlation) to an image with nested Python loops.
    Slow reference implementation, do not use this in practice.
    """
    out = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    # only visit pixels where the whole kernel fits inside the image
    for i in range(kh // 2, image.shape[0] - kh // 2):
        for j in range(kw // 2, image.shape[1] - kw // 2):
            patch = image[i - kh // 2:i + kh // 2 + 1, j - kw // 2:j + kw // 2 + 1]
            out[i, j] = np.sum(kernel * patch)
    return out
```
Computational cost: $k^2mn$, where $k\times k$ is the kernel size and $m\times n$ is the image size.
Do not use this in practice, use built-in functions instead.
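In practice the same cross-correlation can be done with a vectorized library call; a sketch using Torch (one of the libraries mentioned above), with illustrative arrays — note that PyTorch's `conv2d` actually computes cross-correlation, matching the formula above:
```python
import numpy as np
import torch
import torch.nn.functional as F

image = np.random.rand(480, 640).astype(np.float32)
kernel = np.full((3, 3), 1.0 / 9, dtype=np.float32)         # box filter

img_t = torch.from_numpy(image)[None, None]                 # shape (1, 1, H, W)
ker_t = torch.from_numpy(kernel)[None, None]                # shape (1, 1, k, k)
filtered = F.conv2d(img_t, ker_t, padding=1)[0, 0].numpy()  # vectorized cross-correlation
```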
**Box filter**
$$
\frac{1}{9}\begin{bmatrix}
1 & 1 & 1\\
1 & 1 & 1\\
1 & 1 & 1
\end{bmatrix}
$$
Smooths the image
**Identity filter**
$$
\begin{bmatrix}
0 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 0
\end{bmatrix}
$$
Does not change the image
**Sharpening filter**
$$
\begin{bmatrix}
0 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 0
\end{bmatrix}-
\frac{1}{9}\begin{bmatrix}
1 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 1
\end{bmatrix}
$$
Enhances the image edges
**Vertical edge detection**
$$
\begin{bmatrix}
1 & 0 & -1 \\
2 & 0 & -2 \\
1 & 0 & -1
\end{bmatrix}
$$
Detects vertical edges
**Horizontal edge detection**
$$
\begin{bmatrix}
1 & 2 & 1 \\
0 & 0 & 0 \\
-1 & -2 & -1
\end{bmatrix}
$$
Detects horizontal edges
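Applying the two edge filters above with the `filter2d` defined earlier and combining them into a gradient magnitude (a sketch on random data, purely for illustration):
```python
import numpy as np

sobel_x = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float)
sobel_y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

image = np.random.rand(100, 100)
gx = filter2d(image, sobel_x)            # responds to vertical edges
gy = filter2d(image, sobel_y)            # responds to horizontal edges
magnitude = np.sqrt(gx ** 2 + gy ** 2)   # overall edge strength
```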
Key properties (stated for convolution):
- Linear:
  - `filter(I_1+I_2,f)=filter(I_1,f)+filter(I_2,f)`
- Scale invariant:
  - `filter(I,a*f)=a*filter(I,f)`
- Shift invariant:
  - `filter(shift(I),f)=shift(filter(I,f))`
- Commutative:
  - `filter(f_1,f_2)=filter(f_2,f_1)`
- Associative:
  - `filter(filter(I,f_1),f_2)=filter(I,filter(f_1,f_2))`
- Distributive:
  - `filter(I,f_1+f_2)=filter(I,f_1)+filter(I,f_2)`
- Identity:
  - `filter(I,f_0)=I`, where `f_0` is the unit impulse (the identity filter above)
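A quick numerical check of the distributive property using the `filter2d` above (random test data, purely illustrative):
```python
import numpy as np

image = np.random.rand(50, 50)
f1, f2 = np.random.rand(3, 3), np.random.rand(3, 3)

lhs = filter2d(image, f1 + f2)
rhs = filter2d(image, f1) + filter2d(image, f2)
print(np.allclose(lhs, rhs))   # True: filtering distributes over kernel addition
```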
Important filter:
**Gaussian filter**
$$
G(x,y)=\frac{1}{2\pi\sigma^2}e^{-\frac{x^2+y^2}{2\sigma^2}}
$$
Smooths the image (Gaussian blur)
Common mistake: choosing a kernel size that does not match $\sigma$; visualize the filter before applying it (a common rule of thumb is to let the kernel extend to about $3\sigma$ from the center).
Properties of Gaussian filter:
- Removes high-frequency components (acts as a low-pass filter)
- Convolving a Gaussian with itself gives another Gaussian (with $\sigma$ scaled by $\sqrt{2}$)
- Separable kernel:
- `G(x,y)=G(x)G(y)` (factorable into the product of two 1D Gaussian filters)
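A sketch of building a Gaussian kernel whose half-width follows the $3\sigma$ rule of thumb, using the separability $G(x,y)=G(x)G(y)$:
```python
import numpy as np

def gaussian_kernel(sigma):
    """2D Gaussian kernel with half-width ~3*sigma, built from a 1D Gaussian."""
    radius = int(np.ceil(3 * sigma))        # kernel extends to about 3 sigma
    x = np.arange(-radius, radius + 1)
    g1d = np.exp(-x ** 2 / (2 * sigma ** 2))
    g1d /= g1d.sum()                        # normalize so the kernel sums to 1
    return np.outer(g1d, g1d)               # G(x, y) = G(x) * G(y)
```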
##### Filter Separability
- Separable filter:
- `f(x,y)=f(x)f(y)`
Example:
$$
\begin{bmatrix}
1 & 2 & 1 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{bmatrix}=
\begin{bmatrix}
1 \\
2 \\
1
\end{bmatrix}
\begin{bmatrix}
1 & 2 & 1
\end{bmatrix}
$$
Gaussian filter is separable
$$
G(x,y)=\frac{1}{2\pi\sigma^2}e^{-\frac{x^2+y^2}{2\sigma^2}}=G(x)G(y)
$$
This reduces the computational cost of the filter from $k^2mn$ to $2kmn$
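A sketch of exploiting separability with the `filter2d` defined earlier: one pass with a $1\times k$ kernel along the rows, then one with a $k\times 1$ kernel down the columns (about $2kmn$ operations), which matches a single $k\times k$ pass (about $k^2mn$ operations) away from the borders; the data here is illustrative:
```python
import numpy as np

sigma = 1.0
radius = int(np.ceil(3 * sigma))
x = np.arange(-radius, radius + 1)
g1d = np.exp(-x ** 2 / (2 * sigma ** 2))
g1d /= g1d.sum()                                       # 1D Gaussian

image = np.random.rand(100, 100)
full_2d = filter2d(image, np.outer(g1d, g1d))          # one k x k pass
rows = filter2d(image, g1d[None, :])                   # 1 x k pass along the rows
separable = filter2d(rows, g1d[:, None])               # k x 1 pass down the columns

r = radius
print(np.allclose(full_2d[r:-r, r:-r], separable[r:-r, r:-r]))   # True away from the borders
```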