updates?
@@ -1,2 +1,15 @@
# CSE5519 Advances in Computer Vision (Topic E: 2025: Deep Learning for Geometric Computer Vision)
## VGGT: Visual Geometry Grounded Transformer
[link to paper](https://arxiv.org/pdf/2503.11651)
### Novelty in VGGT
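VGGT encodes the input images with alternating attention: the transformer interleaves frame-wise self-attention (tokens attend only within their own image) with global self-attention (tokens attend across all images).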
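
To make the alternating pattern concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: the attention has no learned projections, heads, residual connections, or MLP, and the function names, the toy inputs, and the frame-then-global order within a block are all assumptions made for illustration.

```python
import math


def self_attention(tokens):
    """Unparameterized scaled dot-product self-attention over a list of vectors.

    Simplification for illustration: queries, keys, and values are the tokens
    themselves (no learned projections).
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        )
    return out


def frame_attention(frames):
    """Frame-wise step: each frame's tokens attend only to tokens of that frame."""
    return [self_attention(frame) for frame in frames]


def global_attention(frames):
    """Global step: tokens of all frames attend to each other jointly."""
    tokens_per_frame = len(frames[0])
    flat = [tok for frame in frames for tok in frame]
    mixed = self_attention(flat)
    # Split the flat sequence back into per-frame token lists.
    return [
        mixed[i * tokens_per_frame:(i + 1) * tokens_per_frame]
        for i in range(len(frames))
    ]


def alternating_attention(frames, num_blocks=2):
    """Alternate the two steps; the order inside a block is assumed here."""
    for _ in range(num_blocks):
        frames = frame_attention(frames)
        frames = global_attention(frames)
    return frames


# Hypothetical toy input: 2 frames, 2 tokens per frame, 2-dim features.
frames = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
encoded = alternating_attention(frames)
```

The frame-wise step leaves each image's tokens independent of the other images, while the global step is where information is exchanged between views; both steps keep the `frames x tokens x features` layout unchanged.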
> [!TIP]
>
> VGGT is a feed-forward neural network that directly infers all key 3D attributes of a scene (camera parameters, depth maps, point maps, and 3D point tracks) from one or more views using alternating attention. It is also robust to some non-rigid deformations.
>
> I wonder how this model adapts to different lighting conditions for the same scene, how non-Lambertian reflectance is handled, and how the framework could be extended to recover the true color of objects and estimate their surface properties.