# CSE559A Lecture 14

## Object Detection

Detection quality is measured with AP (Average Precision), the area under the precision-recall curve.

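
To make the AP metric concrete, here is a minimal sketch for one class on one image. The notes do not say how detections are matched or how the curve is integrated, so the IoU threshold of 0.5, the PASCAL-VOC-style matching (each ground-truth box can be claimed once), and the all-point interpolation are assumptions, and the function names `iou` and `average_precision` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_threshold=0.5):
    """AP for one class on one image.

    detections: list of (score, box); gt_boxes: list of boxes.
    """
    if not gt_boxes:
        return 0.0
    matched = [False] * len(gt_boxes)
    hits = []  # True for a true positive, False for a false positive
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        overlaps = [iou(box, gt) for gt in gt_boxes]
        best = max(range(len(gt_boxes)), key=lambda i: overlaps[i])
        if overlaps[best] >= iou_threshold and not matched[best]:
            matched[best] = True
            hits.append(True)
        else:
            hits.append(False)

    # Precision and recall after each detection, in ranked order.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(gt_boxes))

    # Make precision non-increasing (running max from right to left).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap = average_precision(dets, gt)
```

With these toy boxes the ranked detections are hit, miss, hit, which gives an AP of about 0.83. Benchmark AP is computed over a whole dataset and then averaged across classes; this sketch only shows the per-class computation.
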
### Benchmarks

#### PASCAL VOC Challenge

20 challenge classes.

CNNs substantially increased detection accuracy on this benchmark.

#### COCO dataset

COCO stands for Common Objects in Context.

Semantic segmentation: every pixel is assigned a class label.

Instance segmentation: every pixel is assigned a class label, and pixels are additionally grouped into individual object instances.

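
The difference between the two output formats can be shown on a toy label map. In this sketch, connected-component labeling stands in for a real instance segmentation model, which is an assumption made only for illustration (real models predict instance masks directly, and touching objects would not be separated this way). The helper name `label_instances` is hypothetical.

```python
def label_instances(semantic, background=0):
    """Give each 4-connected blob of same-class pixels its own instance id."""
    h, w = len(semantic), len(semantic[0])
    instance = [[0] * w for _ in range(h)]
    next_id = 1
    for r in range(h):
        for c in range(w):
            if semantic[r][c] == background or instance[r][c]:
                continue
            cls = semantic[r][c]
            instance[r][c] = next_id
            stack = [(r, c)]
            while stack:  # flood fill over same-class neighbours
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and instance[ny][nx] == 0
                            and semantic[ny][nx] == cls):
                        instance[ny][nx] = next_id
                        stack.append((ny, nx))
            next_id += 1
    return instance


# Semantic map: 0 = background, 1 and 2 are two object classes.
semantic = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [2, 2, 2, 0, 0],
]
instance = label_instances(semantic)
```

The semantic map cannot tell the two class-1 objects apart, while the instance map gives them ids 1 and 2 and gives the class-2 object id 3.
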
### Object detection: outline

- Proposal generation
- Object recognition

#### R-CNN

- Proposal generation: use selective search to generate region proposals.
- Feature extraction: use a CNN (AlexNet fine-tuned on PASCAL VOC) to extract features from each proposal.
- Classification: use SVMs to classify each proposal from its features.

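
The R-CNN control flow can be sketched schematically as follows. Everything here is a toy stand-in and an assumption: `propose_regions` replaces selective search, `ToyCNN` replaces the fine-tuned AlexNet, and the SVM weights are made up. The sketch also skips warping each crop to the fixed input size a real CNN would need. Its only purpose is to show that the feature extractor runs once per proposal.

```python
def propose_regions(image):
    """Hypothetical stand-in for selective search: a few fixed boxes."""
    h, w = len(image), len(image[0])
    return [(0, 0, w // 2, h // 2), (w // 2, h // 2, w, h), (0, 0, w, h)]


def crop(image, box):
    """Cut out box = (x1, y1, x2, y2), with x2 and y2 exclusive."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


class ToyCNN:
    """Hypothetical stand-in for the CNN; counts forward passes."""

    def __init__(self):
        self.forward_passes = 0

    def features(self, patch):
        self.forward_passes += 1
        pixels = [v for row in patch for v in row]
        return [sum(pixels) / len(pixels), max(pixels)]  # mean, max


def svm_score(feature, weights, bias):
    """Linear SVM decision value w . x + b."""
    return sum(w * x for w, x in zip(weights, feature)) + bias


def rcnn_detect(image, cnn, weights, bias):
    detections = []
    for box in propose_regions(image):          # 1. proposals
        feat = cnn.features(crop(image, box))   # 2. one CNN pass per proposal
        score = svm_score(feat, weights, bias)  # 3. classify the proposal
        if score > 0:
            detections.append((score, box))
    return detections


image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
cnn = ToyCNN()
detections = rcnn_detect(image, cnn, weights=[1.0, 0.0], bias=-5.0)
```
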
Pros:

- Much more accurate than previous approaches
- Any deep architecture can immediately be "plugged in"

Cons:

- Not a single end-to-end trainable system:
  - Fine-tune network with softmax classifier (log loss)
  - Train post-hoc linear SVMs (hinge loss)
  - Train post-hoc bounding box regressors (least squares)
- Training is slow: about 2000 CNN passes per image, one for each proposal
- Inference (detection) is slow as well

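#### Fast R-CNN

- Proposal generation: region proposals are still generated as a separate step.
- Feature extraction: run the CNN once on the whole image, then extract the features of each proposal from the shared feature map.
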
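##### ROI pooling and ROI alignment

ROI pooling:

- Project the proposal onto the feature map, rounding its coordinates to whole cells.
- Divide the projected region into a fixed grid of bins and max-pool each bin, so every proposal yields a feature of the same size.
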
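
A minimal sketch of ROI max pooling on a single-channel feature map, in plain Python. The outward rounding of the ROI, the floor/ceil bin boundaries, and the name `roi_max_pool` are assumptions chosen for illustration; library implementations differ in these details.

```python
import math


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one ROI of a 2D feature map into an out_h x out_w grid.

    roi = (x1, y1, x2, y2) in feature-map coordinates, assumed to lie
    inside the map; x2 and y2 are exclusive.
    """
    # Quantize the ROI outward to whole cells (the lossy step).
    x1, y1 = math.floor(roi[0]), math.floor(roi[1])
    x2, y2 = math.ceil(roi[2]), math.ceil(roi[3])
    h, w = max(y2 - y1, 1), max(x2 - x1, 1)

    pooled = []
    for i in range(out_h):
        # Row range of bin i; floor/ceil keeps every bin non-empty.
        ys = y1 + math.floor(i * h / out_h)
        ye = y1 + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            xs = x1 + math.floor(j * w / out_w)
            xe = x1 + math.ceil((j + 1) * w / out_w)
            row.append(max(feature_map[r][c]
                           for r in range(ys, ye)
                           for c in range(xs, xe)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
whole = roi_max_pool(feature_map, (0, 0, 4, 4))    # [[6, 8], [14, 16]]
corner = roi_max_pool(feature_map, (1, 1, 4, 4))   # [[11, 12], [15, 16]]
```
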
ROI alignment:

- Align the proposal to the feature map without rounding its coordinates, and read values at fractional positions by bilinear interpolation.

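
A minimal sketch of ROI alignment on a single-channel feature map. It assumes that `feature_map[r][c]` holds the value at point x = c, y = r, and it takes one sample at the centre of each bin; implementations commonly average several samples per bin, so treat the sampling scheme and the names `bilinear` and `roi_align` as assumptions.

```python
import math


def bilinear(feature_map, x, y):
    """Interpolate a 2D feature map at a fractional point (x, y)."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp the sample point to the valid range of the map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature_map, roi, out_h=2, out_w=2):
    """Sample one bilinear value at the centre of each output bin."""
    x1, y1, x2, y2 = roi  # kept as floats, never rounded
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    return [[bilinear(feature_map,
                      x1 + (j + 0.5) * bin_w,
                      y1 + (i + 0.5) * bin_h)
             for j in range(out_w)]
            for i in range(out_h)]


# A linear ramp (value = 4 * y + x), so interpolation is easy to check.
feature_map = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
aligned = roi_align(feature_map, (0.5, 0.5, 3.0, 3.0))
```
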
Use bounding box regression to refine the proposal.
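
The notes do not spell out how the refinement is parameterized. The sketch below assumes the parameterization from the R-CNN paper: centre offsets scaled by the proposal size, and log ratios for width and height. The regressor is trained to predict the output of `encode`, and `decode` applies a prediction to a proposal; both function names are hypothetical.

```python
import math


def to_center(box):
    """Convert (x1, y1, x2, y2) to (centre x, centre y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + 0.5 * w, y1 + 0.5 * h, w, h


def encode(proposal, target):
    """Regression targets that map the proposal onto the target box."""
    px, py, pw, ph = to_center(proposal)
    gx, gy, gw, gh = to_center(target)
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def decode(proposal, deltas):
    """Apply predicted deltas to a proposal to get the refined box."""
    px, py, pw, ph = to_center(proposal)
    tx, ty, tw, th = deltas
    cx, cy = px + tx * pw, py + ty * ph
    w, h = pw * math.exp(tw), ph * math.exp(th)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


proposal = (10.0, 10.0, 50.0, 50.0)
ground_truth = (20.0, 15.0, 60.0, 75.0)
deltas = encode(proposal, ground_truth)
refined = decode(proposal, deltas)  # recovers ground_truth up to rounding
```

Scaling the offsets by the proposal size makes the targets independent of how large the object is, which is why the same regressor can serve proposals of very different sizes.
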