CVPR 2025 Tutorial on

Scalable Generative Models in Computer Vision

Location TBD

June 11, 2025


Overview

Generative models have become a transformative force in computer vision, enabling breakthroughs in image, video, and 3D content synthesis. Recent advances in model architectures and generative frameworks have made these models far more scalable, allowing them to handle larger datasets, longer context lengths, and more complex distributions. This tutorial surveys these advances, focusing on frontier techniques for scaling generative models and their applications to video synthesis, 3D reconstruction, and virtual world simulation. Attendees will gain insight into the design principles behind scalable models, learn about key technical innovations, and understand the broader implications for the future of computer vision. By addressing both theoretical and practical aspects, the tutorial aims to equip researchers with the knowledge to explore, build, and deploy next-generation scalable generative models.


Speakers


Tutorial Schedule

Time            Session
9:00 - 9:10     Opening Remarks
9:10 - 10:00    Saining Xie
                Generating More with Less: A Representation Learning Perspective
10:00 - 10:50   Deepti Ghadiyaram
                TBD
10:50 - 11:40   Jiatao Gu
                TBD
11:40 - 2:00    Lunch Break
2:00 - 2:50     Kaiming He
                TBD
2:50 - 3:40     Sherry Yang
                Scaling Generative World Models for Embodied Learning
3:40 - 4:00     Coffee Break
4:00 - 4:50     Arash Vahdat
                TBD
4:50 - 5:00     Conclusion

Organizers

Willis Ma (NYU)
Oscar Michel (NYU)

Contact: Willis Ma, Oscar Michel, Saining Xie