NoteNextra-origin/content/CSE5519/CSE5519_A4.md
Trance-0 0597afb511 updates?
2025-11-14 11:15:12 -06:00


# CSE5519 Advances in Computer Vision (Topic A: 2025: Semantic Segmentation)
## Dual Semantic Guidance for Open Vocabulary Semantic Segmentation
[link to the paper](https://openaccess.thecvf.com/content/CVPR2025/papers/Wang_Dual_Semantic_Guidance_for_Open_Vocabulary_Semantic_Segmentation_CVPR_2025_paper.pdf)
## Novelty in Dual Semantic Guidance
Use dual semantic guidance for semantic segmentation. For each predicted mask, apply a CLIP-like model, in the manner of object detection, to align the mask with a text description.
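
To make the idea of aligning a mask with a text description concrete, here is a minimal sketch. It is not the paper's method: it assumes each mask is summarized by averaging per-pixel features and is then matched to the closest text embedding by cosine similarity. The function names (`mask_pool`, `classify_mask`) and the two-dimensional toy embeddings are hypothetical. A real system would take both the features and the embeddings from a CLIP-like encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mask_pool(features, mask):
    """Average the per-pixel feature vectors over pixels where mask is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def classify_mask(features, mask, text_embeddings):
    """Return the label whose text embedding is closest to the pooled mask feature."""
    pooled = mask_pool(features, mask)
    scores = {label: cosine(pooled, emb) for label, emb in text_embeddings.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (assumed, hand-made): four "pixels" with 2-D features and two masks.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
mask_a = [1, 1, 0, 0]
mask_b = [0, 0, 1, 1]
text_embeddings = {"cat": [1.0, 0.0], "grass": [0.0, 1.0]}

label_a, scores_a = classify_mask(features, mask_a, text_embeddings)
label_b, scores_b = classify_mask(features, mask_b, text_embeddings)
```

With these toy values, `mask_a` pools to a feature close to the "cat" embedding and `mask_b` to one close to "grass". Because classification is just a similarity lookup against text embeddings, the label set can be changed at inference time without retraining, which is what makes the approach open-vocabulary.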
> [!TIP]
>
> This paper proposes a generalizable semantic segmentation model that uses a CLIP-like image-text encoder to refine the mask predictions.
>
> However, I wonder how well this model generalizes to segmenting the different faces of a geometric object and to drawing a clear boundary between objects and the background. In most cases, CLIP may not need complete image information to recognize an object and can make a decision based on a partial view. If a novel object combines features of two objects and might fall outside CLIP's codebook, will the CLIP alignment still work?