# CSE5519 Advances in Computer Vision (Topic A: 2023 - 2024: Semantic Segmentation)

## Segment Anything

[link to the paper](https://arxiv.org/pdf/2304.02643)

### Novelty in Segment Anything

Brute-force approach: scale up the training data, with about 400x more masks than any existing segmentation dataset.

#### Dataset construction

- Model-assisted manual annotation
- Semi-automatic annotation
- Automatic annotation (prompt the model with a 32x32 regular grid of points and predict masks for each point)
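
To make the automatic stage more concrete, here is a minimal sketch of how a 32x32 grid of point prompts could be generated. The function names `build_point_grid` and `scale_to_image`, the cell-center placement, and the 1024x768 image size are assumptions for illustration, not the authors' code; the sketch stops at producing the prompts and does not run any model.

```python
def build_point_grid(n_per_side):
    """Return n_per_side**2 (x, y) points evenly spaced in the unit square.

    Assumption: each point sits at the center of its grid cell.
    """
    if n_per_side <= 0:
        raise ValueError("n_per_side must be positive")
    offset = 1.0 / (2 * n_per_side)  # half a cell width
    coords = [offset + i / n_per_side for i in range(n_per_side)]
    # Row-major order: x varies fastest, y slowest.
    return [(x, y) for y in coords for x in coords]


def scale_to_image(points, width, height):
    """Map normalized (x, y) points to pixel coordinates."""
    return [(x * width, y * height) for x, y in points]


grid = build_point_grid(32)  # 32 x 32 = 1024 point prompts
prompts = scale_to_image(grid, width=1024, height=768)  # hypothetical image size
```

Each entry of `prompts` would then be passed to the model as a single foreground point, and the masks predicted for all points would be filtered and deduplicated afterwards.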

> [!TIP]
>
> This paper shows a remarkable breakthrough in segmentation with a brute-force approach using large-scale training data. The authors use a Vision Transformer image encoder, together with a prompt encoder and a lightweight mask decoder, to produce the final segmentation masks.
>
> I'm really interested in the scalability of the model. Is there any approach to reduce the training data size or the model size with comparable performance via distillation or other techniques?