A collection of 30 papers related to the Sora video generation model: all the papers cited in "Video generation models as world simulators"

Deep Learning and Graph Networks (深度学习与图网络) · WeChat official account · 2024-02-21 21:36


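Sora, OpenAI's first video generation model, has pushed text-to-video generation (producing a video from a single sentence) a large step forward and sparked a broad industry debate about the direction of generative AI.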

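OpenAI explored large-scale training of generative models on video data. Specifically, the researchers jointly trained a text-conditional diffusion model on videos and images of variable durations, resolutions, and aspect ratios. They use a transformer architecture that operates on spacetime patches of video and image latent codes. Their largest model, Sora, can generate up to one minute of high-quality video.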
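To make "spacetime patches of latent codes" concrete, here is a minimal NumPy sketch of cutting a latent video into patch tokens. It is an illustration only, not Sora's actual implementation: the function name `spacetime_patches`, the `(T, H, W, C)` layout, and the patch sizes are all assumptions chosen for the example.

```python
import numpy as np


def spacetime_patches(latent, pt, ph, pw):
    """Split a latent video of shape (T, H, W, C) into flattened spacetime patches.

    Hypothetical helper for illustration. Returns an array of shape
    (num_patches, pt * ph * pw * C), one row per patch token.
    """
    T, H, W, C = latent.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("latent dimensions must be divisible by the patch size")
    # Split each of the time, height, and width axes into (patch count, patch size).
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the three patch-index axes to the front and the patch contents to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten to a token sequence that a transformer could consume.
    return x.reshape(-1, pt * ph * pw * C)


# A latent video: 8 frames, 16x24 spatial grid, 4 channels (assumed sizes).
video = np.zeros((8, 16, 24, 4), dtype=np.float32)
video_tokens = spacetime_patches(video, pt=2, ph=4, pw=4)
print(video_tokens.shape)  # (96, 128)

# An image can be treated as a single-frame video with a temporal patch size of 1.
image = np.zeros((1, 16, 16, 4), dtype=np.float32)
image_tokens = spacetime_patches(image, pt=1, ph=4, pw=4)
print(image_tokens.shape)  # (16, 64)
```

Because the token count simply follows from the input shape, inputs of different durations, resolutions, and aspect ratios yield sequences of different lengths, which is one way a single model can be trained on videos and images together.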

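OpenAI argues that these results suggest that scaling video generation models is a promising path toward building general-purpose simulators of the physical world.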

Unsupervised Learning of Video Representations using LSTMs
Paper • 1502.04681 • Published Feb 17, 2015

Recurrent Environment Simulators
Paper • 1704.02254 • Published Apr 7, 2017

World Models
Paper • 1803.10122 • Published Mar 27, 2018

Generating Videos with Scene Dynamics
Paper • 1609.02612 • Published Sep 9, 2016

MoCoGAN: Decomposing Motion and Content for Video Generation
Paper • 1707.04993 • Published Jul 17, 2017

Adversarial Video Generation on Complex Datasets
Paper • 1907.06571 • Published Jul 16, 2019

Generating Long Videos of Dynamic Scenes
Paper • 2206.03429 • Published Jun 8, 2022

VideoGPT: Video Generation using VQ-VAE and Transformers
Paper • 2104.10157 • Published Apr 21, 2021

NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
Paper • 2111.12417 • Published Nov 24, 2021

Imagen Video: High Definition Video Generation with Diffusion Models
Paper • 2210.02303 • Published Oct 5, 2022

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Paper • 2304.08818 • Published Apr 18, 2023

Photorealistic Video Generation with Diffusion Models
Paper • 2312.06662 • Published Dec 12, 2023

Language Models are Few-Shot Learners
Paper • 2005.14165 • Published May 29, 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Paper • 2010.11929 • Published Oct 23, 2020

ViViT: A Video Vision Transformer
Paper • 2103.15691 • Published Mar 29, 2021

Masked Autoencoders Are Scalable Vision Learners
Paper • 2111.06377 • Published Nov 12, 2021

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution
Paper • 2307.06304 • Published Jul 13, 2023

High-Resolution Image Synthesis with Latent Diffusion Models
Paper • 2112.10752 • Published Dec 21, 2021

Auto-Encoding Variational Bayes
Paper • 1312.6114 • Published Dec 21, 2013

Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Paper • 1503.03585 • Published Mar 12, 2015

Denoising Diffusion Probabilistic Models
Paper • 2006.11239 • Published Jun 20, 2020

Improved Denoising Diffusion Probabilistic Models
Paper • 2102.09672 • Published Feb 19, 2021

Diffusion Models Beat GANs on Image Synthesis
Paper • 2105.05233 • Published May 12, 2021

Elucidating the Design Space of Diffusion-Based Generative Models
Paper • 2206.00364 • Published Jun 1, 2022

Scalable Diffusion Models with Transformers
Paper • 2212.09748 • Published Dec 20, 2022

Zero-Shot Text-to-Image Generation
