update
77
pages/CSE559A/CSE559A_L14.md
Normal file
@@ -0,0 +1,77 @@
# CSE559A Lecture 14

## Neural Network Training

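## Object Detection

Detectors are evaluated with AP (Average Precision): the area under the precision-recall curve obtained by ranking detections by confidence score. A detection counts as correct when its overlap (IoU) with a ground-truth box exceeds a threshold.
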
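
To make the AP metric concrete, here is a minimal sketch for a single class in a single image. The function names, the `(x1, y1, x2, y2)` box format, the 0.5 IoU threshold, and the greedy matching rule are assumptions chosen for illustration; benchmark toolkits differ in details such as how duplicate detections are handled and how the curve is interpolated.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one class; detections is a list of (score, box) pairs."""
    if not gt_boxes:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    tp_flags = []
    for _, box in ranked:
        # Assumed rule: match to the best still-unmatched ground-truth box.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gt_boxes):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_thresh:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += is_tp
        precisions.append(tp / rank)
        recalls.append(tp / len(gt_boxes))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
ap = average_precision(dets, gt)
```

In this toy case the second-ranked detection is a false positive, so the score falls below 1 even though both ground-truth boxes are eventually found.
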
### Benchmarks

#### PASCAL VOC Challenge

20 object classes.

CNN-based detectors substantially increased object detection accuracy on this benchmark.

#### COCO dataset

Common Objects in Context.

Semantic segmentation: every pixel is assigned a class label.

Instance segmentation: every pixel is classified and grouped into object instances.

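
The difference between the two outputs can be illustrated on a toy label grid. The sketch below assumes, purely for illustration, that each instance is a 4-connected blob of same-class pixels; real instance segmentation methods do not rely on this assumption, since touching or occluded objects would be merged. The function name `label_instances` is hypothetical.

```python
from collections import deque


def label_instances(semantic, target_class):
    """Group pixels of target_class into 4-connected components.

    Returns (instance_ids, count), where instance_ids has 0 for pixels
    that do not belong to target_class and 1..count otherwise.
    """
    rows, cols = len(semantic), len(semantic[0])
    instance_ids = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if semantic[r][c] != target_class or instance_ids[r][c]:
                continue
            # Start a new instance and flood-fill it breadth-first.
            count += 1
            instance_ids[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and semantic[ny][nx] == target_class
                            and not instance_ids[ny][nx]):
                        instance_ids[ny][nx] = count
                        queue.append((ny, nx))
    return instance_ids, count


# Semantic map: 0 = background, 1 = some object class.
semantic = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
instances, count = label_instances(semantic, target_class=1)
```

The semantic map only says which pixels belong to class 1; the instance map additionally separates the two objects of that class.
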
### Object detection: outline

- Proposal generation
- Object recognition

#### R-CNN

- Proposal generation: use selective search to generate region proposals.
- Feature extraction: use a CNN (AlexNet fine-tuned on PASCAL VOC) to extract features from each proposal.
- Classification: use SVMs to classify the proposals.

Pros:

- Much more accurate than previous approaches
- Any deep architecture can immediately be "plugged in"

Cons:

- Not a single end-to-end trainable system
  - Fine-tune network with softmax classifier (log loss)
  - Train post-hoc linear SVMs (hinge loss)
  - Train post-hoc bounding box regressors (least squares)
- Training is slow: about 2000 CNN passes for each image, one per proposal
- Inference (detection) was slow

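
The three training stages optimize three different objectives, which is part of why the pipeline is not trainable end to end. The following is a minimal sketch of each loss for a single example; the function names are hypothetical, and the regularization terms used in practice are omitted.

```python
import math


def log_loss(logits, label):
    """Softmax cross-entropy (log loss) for one example."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def hinge_loss(score, target):
    """Binary linear SVM hinge loss; target is -1 or +1."""
    return max(0.0, 1.0 - target * score)


def squared_error(pred, target):
    """Least-squares loss, e.g. for bounding box regression targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


cls_loss = log_loss([2.0, 0.5, -1.0], label=0)   # fine-tuning stage
svm_loss = hinge_loss(score=0.5, target=1)       # post-hoc SVM stage
box_loss = squared_error([0.1, -0.2, 0.0, 0.3], [0.0, 0.0, 0.0, 0.0])
```
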
#### Fast R-CNN

Proposal generation: same as in R-CNN.

The CNN runs once on the whole image; features for each proposal are then pooled from the shared feature map.

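##### ROI pooling and ROI alignment

ROI pooling:

- The proposal is projected onto the feature map and snapped to integer cell boundaries.
- The region is divided into a fixed grid of bins, and max pooling is applied within each bin, giving a fixed-size feature for every proposal.
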
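
A minimal single-channel sketch of ROI max pooling follows. It assumes the ROI is given as `(x1, y1, x2, y2)` in image pixels with exclusive upper bounds, that `stride` is the image-to-feature-map downsampling factor, and that the ROI lies inside the feature map; the function name and signature are hypothetical rather than taken from any library.

```python
import math


def roi_max_pool(feature_map, roi, output_size=2, stride=1.0):
    """Max-pool one ROI on a 2D feature map into output_size x output_size."""
    rows, cols = len(feature_map), len(feature_map[0])
    # Quantize the ROI to whole feature-map cells (this is the lossy step).
    x1 = max(0, int(math.floor(roi[0] / stride)))
    y1 = max(0, int(math.floor(roi[1] / stride)))
    x2 = min(cols, max(x1 + 1, int(math.ceil(roi[2] / stride))))
    y2 = min(rows, max(y1 + 1, int(math.ceil(roi[3] / stride))))
    width, height = x2 - x1, y2 - y1

    pooled = []
    for by in range(output_size):
        # Integer bin boundaries: floor for the start, ceil for the end.
        r0 = y1 + (by * height) // output_size
        r1 = y1 + -(-((by + 1) * height) // output_size)
        row = []
        for bx in range(output_size):
            c0 = x1 + (bx * width) // output_size
            c1 = x1 + -(-((bx + 1) * width) // output_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
pooled = roi_max_pool(feature_map, roi=(0, 0, 4, 4), output_size=2)
```

With a stride of 16, the ROI `(10, 10, 50, 50)` would be snapped to the same 4x4 region as `(0, 0, 64, 64)`, which shows how rounding discards sub-cell position.
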
ROI alignment:

- Align the proposal to the feature map without rounding its coordinates.
- Sample the feature map at fractional positions inside each bin using bilinear interpolation.

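
ROI alignment can be sketched in the same single-channel setting. This version assumes that `feature_map[r][c]` is the value at the point `(y=r, x=c)`, that each bin is summarized by averaging a `samples x samples` grid of bilinear samples, and that out-of-range coordinates are clamped to the border. These conventions and the function names are illustrative assumptions; library implementations differ in their pixel-offset handling.

```python
def bilinear(feature_map, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x)."""
    rows, cols = len(feature_map), len(feature_map[0])
    y = min(max(y, 0.0), rows - 1)
    x = min(max(x, 0.0), cols - 1)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, rows - 1), min(x0 + 1, cols - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature_map, roi, output_size=2, stride=1.0, samples=2):
    """Average bilinear samples in each bin; no coordinate rounding."""
    x1, y1, x2, y2 = (v / stride for v in roi)
    bin_w = (x2 - x1) / output_size
    bin_h = (y2 - y1) / output_size
    pooled = []
    for by in range(output_size):
        row = []
        for bx in range(output_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sample points inside the bin.
                    y = y1 + (by + (sy + 0.5) / samples) * bin_h
                    x = x1 + (bx + (sx + 0.5) / samples) * bin_w
                    total += bilinear(feature_map, y, x)
            row.append(total / samples ** 2)
        pooled.append(row)
    return pooled


# Feature map whose value equals the x coordinate, so results are easy to check.
ramp = [[float(c) for c in range(4)] for _ in range(4)]
aligned = roi_align(ramp, roi=(0.5, 0.5, 2.5, 2.5), output_size=2)
```

Because no coordinate is rounded, a ROI shifted by half a cell produces a correspondingly shifted output, which quantized pooling cannot represent.
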
Use bounding box regression to refine the proposal.

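
One common way to parameterize this refinement, used in the R-CNN family, predicts a center shift scaled by the proposal size plus a log-scale change in width and height. The sketch below assumes `(x1, y1, x2, y2)` boxes and deltas `(tx, ty, tw, th)`; the function names are hypothetical.

```python
import math


def apply_deltas(proposal, deltas):
    """Refine a proposal box with predicted (tx, ty, tw, th) offsets."""
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    pcx, pcy = proposal[0] + 0.5 * pw, proposal[1] + 0.5 * ph
    tx, ty, tw, th = deltas
    cx, cy = pcx + tx * pw, pcy + ty * ph          # shift the center
    w, h = pw * math.exp(tw), ph * math.exp(th)    # rescale width and height
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


def encode_deltas(proposal, gt_box):
    """Regression targets that map the proposal onto the ground-truth box."""
    pw, ph = proposal[2] - proposal[0], proposal[3] - proposal[1]
    gw, gh = gt_box[2] - gt_box[0], gt_box[3] - gt_box[1]
    tx = ((gt_box[0] + 0.5 * gw) - (proposal[0] + 0.5 * pw)) / pw
    ty = ((gt_box[1] + 0.5 * gh) - (proposal[1] + 0.5 * ph)) / ph
    return (tx, ty, math.log(gw / pw), math.log(gh / ph))


proposal = (0.0, 0.0, 10.0, 10.0)
gt_box = (2.0, 1.0, 14.0, 9.0)
targets = encode_deltas(proposal, gt_box)
refined = apply_deltas(proposal, targets)
```

Encoding the targets relative to the proposal size keeps them on a similar scale for small and large boxes, which suits a least-squares regressor.
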
@@ -16,4 +16,5 @@ export default {
CSE559A_L11: "Computer Vision (Lecture 11)",
CSE559A_L12: "Computer Vision (Lecture 12)",
CSE559A_L13: "Computer Vision (Lecture 13)",
CSE559A_L14: "Computer Vision (Lecture 14)",
}