# CSE5519 Advances in Computer Vision (Topic A: 2023 - 2024: Semantic Segmentation)
## Segment Anything
[link to the paper](https://arxiv.org/pdf/2304.02643)
### Novelty in Segment Anything
A brute-force approach: scale up the training data. The resulting dataset (SA-1B) has about 400x more masks than any previous segmentation dataset.
#### Dataset construction
- Model-assisted manual annotation
- Semi-automatic annotation
- Fully automatic annotation (the model is prompted with a 32x32 regular grid of points and predicts masks for each point)
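
The automatic stage relies on prompting the model at regularly spaced locations. Below is a minimal sketch, in plain Python, of how a 32x32 grid of point prompts could be generated. The helper names `build_point_grid` and `to_pixel_coords` are hypothetical, the cell-centered placement and the example image size are assumptions, and the model call that would turn each point into masks is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def build_point_grid(points_per_side: int) -> List[Point]:
    """Return an evenly spaced grid of (x, y) points in normalized [0, 1] coordinates.

    Points are placed at cell centers (an assumption), so none lies on the border.
    """
    if points_per_side < 1:
        raise ValueError("points_per_side must be positive")
    offset = 1.0 / (2 * points_per_side)  # half a cell from the border
    step = 1.0 / points_per_side          # distance between neighboring points
    coords = [offset + i * step for i in range(points_per_side)]
    # Row-major order: x varies fastest, y slowest.
    return [(x, y) for y in coords for x in coords]


def to_pixel_coords(grid: List[Point], width: int, height: int) -> List[Point]:
    """Scale normalized points to pixel coordinates for a given image size."""
    return [(x * width, y * height) for x, y in grid]


grid = build_point_grid(32)
pixel_points = to_pixel_coords(grid, width=1024, height=768)
print(len(grid))        # 1024 point prompts
print(pixel_points[0])  # (16.0, 12.0)
```

Each of the 1024 points would then be passed to the model as a single foreground prompt, and the predicted masks would be filtered and deduplicated afterwards.
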
> [!TIP]
>
> This paper shows a remarkable breakthrough in image segmentation with a brute-force approach built on large-scale training data. The authors use a Transformer (ViT) image encoder, together with a prompt encoder and a lightweight mask decoder, to produce the final segmentation masks.
>
> I'm really interested in the scalability of the model. Are there approaches that reduce the training data size or the model size while keeping comparable performance, for example distillation or other techniques?