Files
NoteNextra-origin/content/CSE559A/CSE559A_L17.md
2025-07-06 12:40:25 -05:00

185 lines
4.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# CSE559A Lecture 17
## Local Features
### Types of local features
#### Edge
Goal: Identify sudden changes in image intensity
Generate edge map as human artists.
An edge is a place of rapid change in the image intensity function.
Take the absolute value of the first derivative of the image intensity function.
For 2d functions, $\frac{\partial f}{\partial x}=\lim_{\Delta x\to 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}$
For discrete images data, $\frac{\partial f}{\partial x}\approx \frac{f(x+1)-f(x)}{1}$
Run convolution with kernel $[1,0,-1]$ to get the first derivative in the x direction, without shifting. (generic kernel is $[1,-1]$)
Prewitt operator:
$$
M_x=\begin{bmatrix}
1 & 0 & -1 \\
1 & 0 & -1 \\
1 & 0 & -1 \\
\end{bmatrix}
\quad
M_y=\begin{bmatrix}
1 & 1 & 1 \\
0 & 0 & 0 \\
-1 & -1 & -1 \\
\end{bmatrix}
$$
Sobel operator:
$$
M_x=\begin{bmatrix}
1 & 0 & -1 \\
2 & 0 & -2 \\
1 & 0 & -1 \\
\end{bmatrix}
\quad
M_y=\begin{bmatrix}
1 & 2 & 1 \\
0 & 0 & 0 \\
-1 & -2 & -1 \\
\end{bmatrix}
$$
Roberts operator:
$$
M_x=\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\quad
M_y=\begin{bmatrix}
0 & 1 \\
-1 & 0 \\
\end{bmatrix}
$$
Image gradient:
$$
\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)
$$
Gradient magnitude:
$$
||\nabla f|| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}
$$
Gradient direction:
$$
\theta = \tan^{-1}\left(\frac{\frac{\partial f}{\partial y}}{\frac{\partial f}{\partial x}}\right)
$$
The gradient points in the direction of the most rapid increase in intensity.
> Application: Gradient-domain image editing
>
> Goal: solve for pixel values in the target region to match gradients of the source region while keeping the rest of the image unchanged.
>
> [Poisson Image Editing](http://www.cs.virginia.edu/~connelly/class/2014/comp_photo/proj2/poisson.pdf)
Noisy edge detection:
When the intensity function is very noisy, we can use a Gaussian smoothing filter to reduce the noise before taking the gradient.
Suppose pixels of the true image $f_{i,j}$ are corrupted by Gaussian noise $n_{i,j}$ with mean 0 and variance $\sigma^2$.
Then the noisy image is $g_{i,j}=(f_{i,j}+n_{i,j})-(f_{i,j+1}+n_{i,j+1})\approx N(0,2\sigma^2)$
To find edges, look for peaks in $\frac{d}{dx}(f\circ g)$ where $g$ is the Gaussian smoothing filter.
or we can directly use the Derivative of Gaussian (DoG) filter:
$$
\frac{d}{dx}g(x,\sigma)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{x^2}{2\sigma^2}}
$$
##### Separability of Gaussian filter
A Gaussian filter is separable if it can be written as a product of two 1D filters.
$$
\frac{d}{dx}g(x,\sigma)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{x^2}{2\sigma^2}}
\quad \frac{d}{dy}g(y,\sigma)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{y^2}{2\sigma^2}}
$$
##### Separable Derivative of Gaussian (DoG) filter
$$
\frac{d}{dx}g(x,y)\propto -x\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)
\quad \frac{d}{dy}g(x,y)\propto -y\exp\left(-\frac{x^2+y^2}{2\sigma^2}\right)
$$
##### Derivative of Gaussian: Scale
Using Gaussian derivatives with different values of 𝜎 finds structures at different scales or frequencies
(Take the hybrid image as an example)
##### Canny edge detector
1. Smooth the image with a Gaussian filter
2. Compute the gradient magnitude and direction of the smoothed image
3. Thresholding gradient magnitude
4. Non-maxima suppression
- For each location `q` above the threshold, check that the gradient magnitude is higher than at adjacent points `p` and `r` in the direction of the gradient
5. Thresholding the non-maxima suppressed gradient magnitude
6. Hysteresis thresholding
- Use two thresholds: high and low
- Start with a seed edge pixel with a gradient magnitude greater than the high threshold
- Follow the gradient direction to find all connected pixels with a gradient magnitude greater than the low threshold
##### Top-down segmentation
Data-driven top-down segmentation:
#### Interest point
Key point matching:
1. Find a set of distinctive keypoints in the image
2. Define a region of interest around each keypoint
3. Compute a local descriptor from the normalized region
4. Match local descriptors between images
Characteristic of good features:
- Repeatability
- The same feature can be found in several images despite geometric and photometric transformations
- Saliency
- Each feature is distinctive
- Compactness and efficiency
- Many fewer features than image pixels
- Locality
- A feature occupies a relatively small area of the image; robust to clutter and occlusion
##### Harris corner detector
### Applications of local features
#### Image alignment
#### 3D reconstruction
#### Motion tracking
#### Robot navigation
#### Indexing and database retrieval
#### Object recognition