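Column: Deep Learning and Graph Networks
Covers graph networks and graph representation learning, recent work from top conferences and journals, and core machine learning methods, including unsupervised, semi-supervised, weakly supervised, and meta-learning.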

A collection of 30 papers related to the Sora video generation model: every paper cited in "Video generation models as world simulators"

Deep Learning and Graph Networks · WeChat official account · 2024-02-21 21:36


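Sora, OpenAI's first video generation model, pushed text-to-video generation (producing a video from a single sentence) a large step forward and set off a broad industry discussion about where generative AI is heading.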

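OpenAI explored large-scale training of generative models on video data. Specifically, the researchers jointly trained a text-conditional diffusion model on videos and images of variable duration, resolution, and aspect ratio. The authors use a transformer architecture that operates on spacetime patches of video and image latent codes. Their largest model, Sora, can generate up to a minute of high-quality video.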
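To make "spacetime patches of latent codes" more concrete, here is a minimal sketch in plain Python. It is not OpenAI's implementation: the function names `spacetime_patches` and `dummy_latent`, the patch sizes, and the latent shapes are all assumptions chosen for illustration, and a real system would use an array library rather than nested lists. It only shows how a latent indexed by time, height, width, and channel can be cut into a sequence of flat tokens, and how an image can be treated as a one-frame video.

```python
def spacetime_patches(latent, pt, ph, pw):
    """Cut a latent video into flattened spacetime patches (tokens).

    latent: nested lists indexed as latent[t][h][w][c].
    pt, ph, pw: hypothetical patch sizes along time, height, and width.
    Returns a list of tokens, each a flat list of pt * ph * pw * C values.
    """
    T, H, W = len(latent), len(latent[0]), len(latent[0][0])
    if T % pt or H % ph or W % pw:
        raise ValueError("latent shape must be divisible by the patch size")
    tokens = []
    # Walk over the top-left-front corner of every patch.
    for t0 in range(0, T, pt):
        for h0 in range(0, H, ph):
            for w0 in range(0, W, pw):
                token = []
                # Flatten the patch contents in (t, h, w, c) order.
                for t in range(t0, t0 + pt):
                    for h in range(h0, h0 + ph):
                        for w in range(w0, w0 + pw):
                            token.extend(latent[t][h][w])
                tokens.append(token)
    return tokens


def dummy_latent(T, H, W, C):
    """Build a toy latent whose values equal their own flat index."""
    return [[[[((t * H + h) * W + w) * C + c for c in range(C)]
              for w in range(W)] for h in range(H)] for t in range(T)]


# Assumed toy shapes: an 8-frame video latent and a single-frame image latent.
video = dummy_latent(8, 16, 24, 4)
image = dummy_latent(1, 16, 16, 4)
video_tokens = spacetime_patches(video, 1, 2, 2)
image_tokens = spacetime_patches(image, 1, 2, 2)
print(len(video_tokens), len(image_tokens), len(video_tokens[0]))
```

With these toy shapes the video yields 768 tokens and the image 64, each of length 16. Inputs of different duration, resolution, and aspect ratio simply become token sequences of different lengths, which is what lets one transformer train on both.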


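According to OpenAI, the results suggest that scaling video generation models is a promising path toward building general-purpose simulators of the physical world.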


  • Unsupervised Learning of Video Representations using LSTMs

    arXiv:1502.04681

  • Recurrent Environment Simulators

    arXiv:1704.02254

  • World Models

    arXiv:1803.10122

  • Generating Videos with Scene Dynamics

    arXiv:1609.02612

  • MoCoGAN: Decomposing Motion and Content for Video Generation

    arXiv:1707.04993

  • Adversarial Video Generation on Complex Datasets

    arXiv:1907.06571

  • Generating Long Videos of Dynamic Scenes

    arXiv:2206.03429

  • VideoGPT: Video Generation using VQ-VAE and Transformers

    arXiv:2104.10157

  • NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion

    arXiv:2111.12417

  • Imagen Video: High Definition Video Generation with Diffusion Models

    arXiv:2210.02303

  • Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

    arXiv:2304.08818

  • Photorealistic Video Generation with Diffusion Models

    arXiv:2312.06662

  • Language Models are Few-Shot Learners

    arXiv:2005.14165

  • An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    arXiv:2010.11929

  • ViViT: A Video Vision Transformer

    arXiv:2103.15691

  • Masked Autoencoders Are Scalable Vision Learners

    arXiv:2111.06377

  • Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

    arXiv:2307.06304

  • High-Resolution Image Synthesis with Latent Diffusion Models

    arXiv:2112.10752

  • Auto-Encoding Variational Bayes

    arXiv:1312.6114

  • Deep Unsupervised Learning using Nonequilibrium Thermodynamics

    arXiv:1503.03585

  • Denoising Diffusion Probabilistic Models

    arXiv:2006.11239

  • Improved Denoising Diffusion Probabilistic Models

    arXiv:2102.09672

  • Diffusion Models Beat GANs on Image Synthesis

    arXiv:2105.05233

  • Elucidating the Design Space of Diffusion-Based Generative Models

    arXiv:2206.00364

  • Scalable Diffusion Models with Transformers

    arXiv:2212.09748

  • Zero-Shot Text-to-Image Generation






