# CSE5519 Advances in Computer Vision (Topic H: 2025: Safety, Robustness, and Evaluation of CV Models)
## Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographic Robustness in Object Recognition
Does adding geographical context to CLIP prompts improve recognition across geographies? Yes, by about 1%.

Can an LLM provide useful geographic descriptive knowledge to improve recognition? Yes (see the sketches below).

How can we optimize soft prompts for CLIP using an accessible data source while accounting for target geographies not represented in the training set?

Where can soft prompts enhanced with geographical knowledge provide the most benefit?
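As a rough illustration of the first two questions, here is a minimal sketch of zero-shot CLIP classification with prompts augmented by LLM-generated geographic descriptors. The `GEO_KNOWLEDGE` table, the descriptor strings, and the chosen `open_clip` checkpoint are all assumptions for illustration; in the paper's setting the descriptors would come from querying an LLM about how a class appears in a given region.

```python
import torch
import open_clip
from PIL import Image

# Hypothetical LLM-generated geographic knowledge. These placeholder
# strings stand in for descriptors an LLM would produce when asked how
# a class typically looks in a given region.
GEO_KNOWLEDGE = {
    ("stove", "West Africa"): "often a clay or metal charcoal burner",
    ("stove", "Western Europe"): "often a built-in ceramic or gas cooktop",
}

def build_prompt(class_name: str, region: str | None) -> str:
    """Plain CLIP prompt, optionally augmented with geographic context."""
    if region is None:
        return f"a photo of a {class_name}"
    desc = GEO_KNOWLEDGE.get((class_name, region), "")
    return f"a photo of a {class_name} in {region}, {desc}".rstrip(", ")

def classify(image_path: str, class_names: list[str], region: str | None):
    # ViT-B-32 with LAION weights used here as an example checkpoint.
    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k"
    )
    tokenizer = open_clip.get_tokenizer("ViT-B-32")
    image = preprocess(Image.open(image_path)).unsqueeze(0)
    text = tokenizer([build_prompt(c, region) for c in class_names])
    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
        probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)
    return dict(zip(class_names, probs[0].tolist()))
```

Comparing `classify(path, classes, region=None)` against `classify(path, classes, region="West Africa")` is the kind of ablation behind the roughly 1% gain noted above.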
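For the soft-prompt question, below is a toy sketch of CoOp-style soft-prompt tuning: a small set of learnable context vectors is trained on labeled features from an accessible source domain while the class text embeddings stay frozen. The pooled-context shortcut and the random stand-in features are simplifications of my own; the paper additionally incorporates geographic knowledge into the prompt optimization, which this toy omits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftPromptClassifier(nn.Module):
    """Toy soft-prompt head: learnable context vectors are combined with
    frozen class-name text embeddings (precomputed with a CLIP text
    encoder; random stand-ins here), and only the context is trained."""

    def __init__(self, class_embeds: torch.Tensor, n_ctx: int = 4):
        super().__init__()
        dim = class_embeds.shape[1]
        self.register_buffer("class_embeds", class_embeds)  # frozen
        self.ctx = nn.Parameter(0.02 * torch.randn(n_ctx, dim))

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # Pool the context tokens and add them to each class embedding;
        # a simplification of feeding [ctx; class tokens] through the
        # full text encoder as CoOp does.
        text_feats = F.normalize(self.class_embeds + self.ctx.mean(dim=0), dim=-1)
        image_feats = F.normalize(image_feats, dim=-1)
        return 100.0 * image_feats @ text_feats.T  # logits

# Train the context on (image feature, label) pairs from accessible data.
torch.manual_seed(0)
n_classes, dim = 5, 512
model = SoftPromptClassifier(torch.randn(n_classes, dim))
opt = torch.optim.Adam([model.ctx], lr=1e-3)
feats = torch.randn(32, dim)                     # stand-in CLIP image features
labels = torch.randint(0, n_classes, (32,))
for _ in range(100):
    loss = F.cross_entropy(model(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```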
> [!TIP]
>
> This paper proposes an effective way to improve model performance by self-querying geographical knowledge.
>
> I wonder what the ultimate limit of LLM-generated context is for improving performance. In theory, an LLM could generate most of the plausible contexts before making predictions and use them to improve performance. However, introducing additional (possibly irrelevant) information may induce hallucinations. I wonder whether there is a general approach that lets an LLM generate a suitable context for a task and then uses that context to improve performance.