final updates

2025-04-22 14:22:32 -05:00
parent cb355437ad
commit bb21338ccc
5 changed files with 368 additions and 2 deletions
--- a/pages/CSE559A/CSE559A_L25.md
+++ b/pages/CSE559A/CSE559A_L25.md
@@ -0,0 +1,217 @@
+# CSE559A Lecture 25
+
+## Geometry and Multiple Views
+
+### Cues for estimating Depth
+
+#### Multiple Views (the strongest depth cue)
+
+Two common settings:
+
+**Stereo vision**: a pair of cameras, usually with some constraints on the relative position of the two cameras.
+
+**Structure from (camera) motion**: one or more cameras observing a scene from different viewpoints.
+
+Structure and depth are inherently ambiguous from single views.
+
+Other hints for depth:
+
+- Occlusion
+- Perspective effects
+- Texture
+- Object motion
+- Shading
+- Focus/Defocus
+
+#### Focus on Stereo and Multiple Views
+
+Stereo correspondence: Given a point in one of the images, where could its corresponding points be in the other images?
+
+Structure: Given projections of the same 3D point in two or more images, compute the 3D coordinates of that point
+
+Motion: Given a set of corresponding points in two or more images, compute the camera parameters
+
+#### A simple example of estimating depth with stereo
+
+Stereo: shape from "motion" between two views
+
+We'll need to consider:
+
+- Info on camera pose ("calibration")
+- Image point correspondences
+
+![Simple stereo system](https://notenextra.trance-0.com/CSE559A/Simple_stereo_system.png)
+
+Assume parallel optical axes and known camera parameters (i.e., calibrated cameras). What is the expression for $Z$?
+
+The similar triangles $(p_l, P, p_r)$ and $(O_l, P, O_r)$ give:
+
+$$
+\frac{T-x_l+x_r}{Z-f}=\frac{T}{Z}
+$$
+
+$$
+Z = \frac{f \cdot T}{x_l-x_r}
+$$
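+
+The algebra between the two equations above can be spelled out as follows. The quantity $d = x_l - x_r$ is usually called the disparity (a name not used in the notes above), and the numbers in the last line are made-up values chosen only for illustration.
+
+$$
+\begin{aligned}
+Z\,(T - x_l + x_r) &= T\,(Z - f)\\
+-Z\,(x_l - x_r) &= -fT\\
+Z &= \frac{fT}{x_l - x_r} = \frac{fT}{d}\\
+\text{e.g. } f = 700\ \text{px},\ T = 0.12\ \text{m},\ d = 14\ \text{px} &\;\Rightarrow\; Z = \frac{700 \cdot 0.12}{14} = 6\ \text{m}
+\end{aligned}
+$$
+
+Depth is inversely proportional to disparity, so nearby points shift more between the two views than distant ones.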
+
+### Camera Calibration
+
+Use a scene with known geometry:
+
+- Match image points to known 3D points
+- Solve for the camera parameters by least squares (or with a non-linear method)
+
+Solving for the unknown camera parameters:
+
+$$
+\begin{bmatrix}
+su\\
+sv\\
+s
+\end{bmatrix}
+= \begin{bmatrix}
+m_{11} & m_{12} & m_{13} & m_{14}\\
+m_{21} & m_{22} & m_{23} & m_{24}\\
+m_{31} & m_{32} & m_{33} & m_{34}
+\end{bmatrix}
+\begin{bmatrix}
+X\\
+Y\\
+Z\\
+1
+\end{bmatrix}
+$$
+
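+Method 1: Homogeneous linear system. Each 3D-2D correspondence contributes two rows. Solve for the entries $m_{ij}$ using least squares under the constraint $\|m\| = 1$, which excludes the trivial solution $m = 0$.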
+
+$$
+\begin{bmatrix} 
+X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1X_1 & -u_1Y_1 & -u_1Z_1 & -u_1 \\
+0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1X_1 & -v_1Y_1 & -v_1Z_1 & -v_1 \\
+\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
+X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_nX_n & -u_nY_n & -u_nZ_n & -u_n \\
+0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_nX_n & -v_nY_n & -v_nZ_n & -v_n
+\end{bmatrix}
+\begin{bmatrix} m_{11} \\ m_{12} \\ m_{13} \\ m_{14} \\ m_{21} \\ m_{22} \\ m_{23} \\ m_{24} \\ m_{31} \\ m_{32} \\ m_{33} \\ m_{34} \end{bmatrix} = 0
+$$
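+
+To see where the two rows per correspondence come from, eliminate the scale $s$ from the projection equation. In this sketch, $m_1^T, m_2^T, m_3^T$ denote the rows of the $3\times 4$ matrix and $\tilde{X}_i = (X_i, Y_i, Z_i, 1)^T$; both are shorthand introduced here, not notation from the lecture.
+
+$$
+\begin{aligned}
+u_i = \frac{m_1^T \tilde{X}_i}{m_3^T \tilde{X}_i} &\;\Rightarrow\; m_1^T \tilde{X}_i - u_i\, m_3^T \tilde{X}_i = 0\\
+v_i = \frac{m_2^T \tilde{X}_i}{m_3^T \tilde{X}_i} &\;\Rightarrow\; m_2^T \tilde{X}_i - v_i\, m_3^T \tilde{X}_i = 0
+\end{aligned}
+$$
+
+The matrix has 12 entries but is only defined up to scale, leaving 11 unknowns, so at least 6 correspondences (12 equations) are needed.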
+
+Method 2: Non-homogeneous linear system. Fix one entry (commonly $m_{34} = 1$) and solve for the remaining entries using least squares.
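+
+One common way to make the system non-homogeneous, assumed here since the notes do not spell it out, is to fix $m_{34} = 1$. Each correspondence $i$ then contributes two ordinary linear equations in the remaining 11 entries:
+
+$$
+\begin{bmatrix}
+X_i & Y_i & Z_i & 1 & 0 & 0 & 0 & 0 & -u_iX_i & -u_iY_i & -u_iZ_i \\
+0 & 0 & 0 & 0 & X_i & Y_i & Z_i & 1 & -v_iX_i & -v_iY_i & -v_iZ_i
+\end{bmatrix}
+\begin{bmatrix} m_{11} \\ \vdots \\ m_{33} \end{bmatrix}
+=
+\begin{bmatrix} u_i \\ v_i \end{bmatrix}
+$$
+
+Stacking all correspondences gives a standard least-squares problem, but this choice breaks down when the true $m_{34}$ is close to zero.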
+
+**Advantages**
+
+- Easy to formulate and solve
+- Provides initialization for non-linear methods
+
+**Disadvantages**
+
+- Doesn't directly give you camera parameters
+- Doesn't model radial distortion
+- Can't impose constraints, such as known focal length
+
+**Non-linear methods are preferred**
+
+- Define error as difference between projected points and measured points
+- Minimize error using Newton's method or other non-linear optimization
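+
+Written out, the error in the first bullet is the total reprojection error. In this sketch, $\operatorname{proj}$ denotes division by the third homogeneous coordinate; $M$ for the $3\times 4$ camera matrix, $\tilde{X}_i$ for the homogeneous 3D points, and $x_i = (u_i, v_i)^T$ for the measured image points are names assumed here.
+
+$$
+E(M) = \sum_{i=1}^{n} \left\| \operatorname{proj}(M \tilde{X}_i) - x_i \right\|_2^2,
+\qquad
+\operatorname{proj}\!\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a/c \\ b/c \end{bmatrix}
+$$
+
+The linear solution is a natural starting point for the iterative minimization, and $M$ can be parameterized by intrinsics, rotation, translation, and distortion terms so that known constraints are enforced.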
+
+#### Triangulation
+
+Given projections of a 3D point in two or more images (with known camera matrices), find the coordinates of the point
+
+##### Approach 1: Geometric approach
+
+Find the shortest segment connecting the two viewing rays and let $X$ be the midpoint of that segment.
+
+![Triangulation geometric approach](https://notenextra.trance-0.com/CSE559A/Triangulation_geometric_approach.png)
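+
+A sketch of the midpoint computation, assuming each viewing ray is written as $O_i + s_i d_i$ with camera center $O_i$ and direction $d_i$ (symbols introduced here for illustration). Minimizing the squared distance between the two rays over $s_1, s_2$ gives:
+
+$$
+\begin{aligned}
+w &= O_1 - O_2,\quad a = d_1^T d_1,\quad b = d_1^T d_2,\quad c = d_2^T d_2,\quad p = d_1^T w,\quad q = d_2^T w\\
+s_1 &= \frac{bq - cp}{ac - b^2},\qquad s_2 = \frac{aq - bp}{ac - b^2}\\
+X &= \tfrac{1}{2}\left[(O_1 + s_1 d_1) + (O_2 + s_2 d_2)\right]
+\end{aligned}
+$$
+
+The denominator $ac - b^2$ vanishes when the two rays are parallel, in which case the midpoint is undefined.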
+
+##### Approach 2: Non-linear optimization
+
+Minimize the reprojection error, i.e., the squared distance between the projections of $X$ and the measured image points:
+
+$$
+||\operatorname{proj}(P_1 X) - x_1||_2^2 + ||\operatorname{proj}(P_2 X) - x_2||_2^2
+$$
+
+![Triangulation non-linear optimization](https://notenextra.trance-0.com/CSE559A/Triangulation_non_linear_optimization.png)
+
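+##### Approach 3: Linear approach
+
+$x_1\simeq P_1X$ and $x_2\simeq P_2X$, where $\simeq$ denotes equality up to scale.
+
+Parallel vectors have zero cross product, so $x_1\times P_1X = 0$ and $x_2\times P_2X = 0$
+
+$[x_{1_{\times}}]P_1X = 0$ and $[x_{2_{\times}}]P_2X = 0$
+
+Here the cross product is written as a matrix-vector product: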
+
+$$
+a\times b=\begin{bmatrix}
+0 & -a_3 & a_2\\
+a_3 & 0 & -a_1\\
+-a_2 & a_1 & 0
+\end{bmatrix}
+\begin{bmatrix}
+b_1\\
+b_2\\
+b_3
+\end{bmatrix}
+=[a_{\times}]b
+$$
+
+Stacking these equations gives a homogeneous linear system in $X$, which can be solved using **singular value decomposition**.
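+
+As a sketch of the stacked system, write $x_j = (u_j, v_j, 1)^T$ and let $p_{j,k}^T$ be the $k$-th row of $P_j$ (notation assumed here). Each equation $[x_{j_{\times}}]P_jX = 0$ yields three scalar equations, of which only two are linearly independent, so two views give:
+
+$$
+A X =
+\begin{bmatrix}
+u_1\, p_{1,3}^T - p_{1,1}^T\\
+v_1\, p_{1,3}^T - p_{1,2}^T\\
+u_2\, p_{2,3}^T - p_{2,1}^T\\
+v_2\, p_{2,3}^T - p_{2,2}^T
+\end{bmatrix} X = 0,
+\qquad
+A = U \Sigma V^T
+$$
+
+The estimate of $X$ in homogeneous coordinates is the column of $V$ associated with the smallest singular value; dividing by its last coordinate gives the 3D point.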
+
+### Epipolar Geometry
+
+What constraints must hold between two projections of the same 3D point?
+
+Given a 2D point in one view, where can we find the corresponding point in the other view?
+
+Given only 2D correspondences, how can we calibrate the two cameras, i.e., estimate their relative position and orientation and the intrinsic parameters?
+
+Key ideas:
+
+- We can answer all these questions without knowledge of the 3D scene geometry
+- Important to think about projections of camera centers and visual rays into the other view
+
+#### Epipolar Geometry Setup
+
+![Epipolar geometry setup](https://notenextra.trance-0.com/CSE559A/Epipolar_geometry_setup.png)
+
+Suppose we have two cameras with centers $O,O'$.
+
+The baseline is the line connecting the two camera centers.
+
+Epipoles $e,e'$ are where the baseline intersects the image planes, or equivalently the projections of the other camera center in each view.
+
+Consider a point $X$, which projects to $x$ and $x'$.
+
+The plane formed by $X,O,O'$ is called an epipolar plane.
+As $X$ varies, we get a family of epipolar planes, all passing through $O$ and $O'$ (that is, all containing the baseline).
+
+Every epipolar line passes through the epipole of its image, because every epipolar plane contains the baseline.
+
+**Epipolar lines** connect the epipoles to the projections of $X$.
+Equivalently, they are the intersections of the epipolar plane with the image planes, so they come in matching pairs.
+
+**Application**: This constraint can be used to find correspondences between points in two cameras. Given a point in one image, its match in the other image must lie on the corresponding epipolar line, which reduces the search from 2D to 1D.
+
+![Epipolar line for converging cameras](https://notenextra.trance-0.com/CSE559A/Epipolar_line_for_converging_cameras.png)
+
+For converging cameras, the epipoles are finite and may be visible in the image.
+
+![Epipolar line for parallel cameras](https://notenextra.trance-0.com/CSE559A/Epipolar_line_for_parallel_cameras.png)
+
+For parallel cameras, the epipoles are at infinity and the epipolar lines are parallel.
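+
+In the simple stereo system from earlier in these notes (parallel optical axes, baseline $T$ along the image $x$-axis), this case takes a concrete form. Writing $y_l, y_r$ for the vertical image coordinates (symbols assumed here), a sketch of the constraint is:
+
+$$
+y_r = y_l, \qquad x_r = x_l - \frac{fT}{Z}
+$$
+
+So the match for a left-image point lies on the same image row, and only its horizontal offset depends on the unknown depth $Z$.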
+
+![Epipolar line for perpendicular cameras](https://notenextra.trance-0.com/CSE559A/Epipolar_line_for_perpendicular_cameras.png)
+
+Here the epipole is the "focus of expansion" and coincides with the principal point of the camera.
+
+The epipolar lines radiate out from the principal point.
+
+Next class:
+
+### The Essential and Fundamental Matrices
+
+### Dense Stereo Matching
+
+