Zheyuan Wu
2025-10-16 15:52:48 -05:00
parent 858624b24d
commit bf344359a1
3 changed files with 247 additions and 23 deletions


@@ -1,2 +1,15 @@
# CSE5519 Advances in Computer Vision (Topic D: 2023: Image and Video Generation)
## Scalable Diffusion Models with Transformers
[link to paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Peebles_Scalable_Diffusion_Models_with_Transformers_ICCV_2023_paper.pdf)
Creates a diffusion model built on transformers (DiT).
Conditional DiT models are trained over latent patches, with the transformer replacing the usual U-Net backbone.
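
A minimal sketch of the idea, under stated assumptions: a latent from a pretrained VAE is split into patches, embedded as tokens, and processed by transformer blocks whose layer norms are modulated by the timestep/class conditioning (adaLN). The module names, sizes, and the simplified timestep embedding below are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of a DiT-style backbone: patchify a latent, run adaLN-conditioned
# transformer blocks, and unpatchify back to the latent shape.
import torch
import torch.nn as nn

class DiTBlock(nn.Module):
    def __init__(self, dim, heads):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, elementwise_affine=False)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, elementwise_affine=False)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # adaLN: conditioning vector -> per-block shift/scale/gate parameters
        self.ada = nn.Linear(dim, 6 * dim)

    def forward(self, x, c):
        shift1, scale1, gate1, shift2, scale2, gate2 = self.ada(c).chunk(6, dim=-1)
        h = self.norm1(x) * (1 + scale1.unsqueeze(1)) + shift1.unsqueeze(1)
        x = x + gate1.unsqueeze(1) * self.attn(h, h, h, need_weights=False)[0]
        h = self.norm2(x) * (1 + scale2.unsqueeze(1)) + shift2.unsqueeze(1)
        return x + gate2.unsqueeze(1) * self.mlp(h)

class MiniDiT(nn.Module):
    def __init__(self, latent_ch=4, patch=2, dim=384, depth=6, heads=6, num_classes=1000):
        super().__init__()
        self.patchify = nn.Conv2d(latent_ch, dim, kernel_size=patch, stride=patch)
        # simplified timestep embedding (the paper uses a sinusoidal embedding)
        self.t_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.y_embed = nn.Embedding(num_classes + 1, dim)  # extra index = "null" class
        self.blocks = nn.ModuleList([DiTBlock(dim, heads) for _ in range(depth)])
        self.head = nn.Linear(dim, patch * patch * latent_ch)
        self.patch, self.latent_ch = patch, latent_ch

    def forward(self, z, t, y):
        # z: (B, C, H, W) VAE latent; t: (B,) timesteps; y: (B,) class labels
        tokens = self.patchify(z).flatten(2).transpose(1, 2)         # (B, N, dim)
        c = self.t_embed(t.float().unsqueeze(-1)) + self.y_embed(y)  # conditioning vector
        for blk in self.blocks:
            tokens = blk(tokens, c)
        out = self.head(tokens)                                      # (B, N, p*p*C)
        B, N, _ = out.shape
        h = w = int(N ** 0.5)
        out = out.view(B, h, w, self.patch, self.patch, self.latent_ch)
        out = out.permute(0, 5, 1, 3, 2, 4).reshape(B, self.latent_ch, h * self.patch, w * self.patch)
        return out  # predicted noise, same shape as z

# usage: eps = MiniDiT()(torch.randn(2, 4, 32, 32), torch.randint(0, 1000, (2,)), torch.randint(0, 1000, (2,)))
```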
> [!TIP]
>
> This paper provides a scalable way to build conditional diffusion models from transformers operating over latent patches, replacing the U-Net backbone and improving image-generation quality as model size grows.
>
> I wonder how classifier-free guidance is used when training and sampling from DiT, and whether the model also has in-context learning ability, as other transformer models do (a sketch of the usual classifier-free guidance recipe follows below).
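
On the classifier-free guidance question, the common recipe (which, as I understand it, DiT also follows) is to randomly drop the class label during training so one network learns both conditional and unconditional predictions, then mix the two predictions with a guidance scale at sampling time. The helper names below (`model`, `null_class`, `guidance_scale`) are illustrative assumptions, not the paper's API.

```python
# Hedged sketch of classifier-free guidance for a class-conditional model.
import torch

def cfg_training_labels(y, null_class, drop_prob=0.1):
    """Randomly replace labels with the null class so the model also learns the unconditional case."""
    drop = torch.rand(y.shape[0], device=y.device) < drop_prob
    return torch.where(drop, torch.full_like(y, null_class), y)

@torch.no_grad()
def cfg_noise_prediction(model, z_t, t, y, null_class, guidance_scale=4.0):
    """At sampling: eps_uncond + s * (eps_cond - eps_uncond)."""
    eps_cond = model(z_t, t, y)
    eps_uncond = model(z_t, t, torch.full_like(y, null_class))
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```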