
CVPR 2025 Image/Video/3D Generation Paper Roundup (with Papers and Code)

自动驾驶之心 · WeChat Official Account · 2025-03-06 07:30


Author | Kobay    Editor | 自动驾驶之心

Original link: https://zhuanlan.zhihu.com/p/27979298565

This article is shared for academic purposes only. If it infringes any rights, contact us and it will be removed.

Awesome-CVPR2025-AIGC

A Collection of Papers and Codes for CVPR2025 AIGC

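This is a collection of CVPR 2025 papers and code related to AIGC, listed below.

The latest revisions are posted to GitHub first; stars, forks, and PRs are welcome.

Anyone interested in AIGC-related tasks is also welcome to help keep the list up to date.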

github.com/Kobaayyy/Awesome-CVPR2025-CVPR2024-ECCV2024-AIGC/blob/main/CVPR2025.md

Paper acceptance announced: February 27, 2025

【Contents】

  1. Image Generation / Image Synthesis
  2. Image Editing
  3. Video Generation / Video Synthesis
  4. Video Editing
  5. 3D Generation / 3D Synthesis
  6. 3D Editing
  7. Multi-Modal Large Language Model
  8. Other Tasks (Others)

1. Image Generation / Image Synthesis

Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

  • Paper: https://arxiv.org/abs/2411.17787
  • Code: https://github.com/czg1225/CoDe

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

  • Paper: https://arxiv.org/abs/2408.16266
  • Code: https://github.com/scuwyh2000/Diff-II

Parallelized Autoregressive Visual Generation

  • Paper: https://arxiv.org/abs/2412.15119
  • Code: https://github.com/Epiphqny/PAR

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

  • Paper: https://arxiv.org/abs/2412.03177
  • Code: https://github.com/hqhQAQ/PatchDPO

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

  • Paper: https://arxiv.org/abs/2501.01423
  • Code: https://github.com/hustvl/LightningDiT

Rectified Diffusion Guidance for Conditional Generation

  • Paper: https://arxiv.org/abs/2410.18737
  • Code: https://github.com/thuxmf/recfg

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

  • Paper: https://arxiv.org/abs/2403.09055
  • Code: https://github.com/ironjr/semantic-draw

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

  • Paper: https://arxiv.org/abs/2412.04852
  • Code: https://github.com/taco-group/SleeperMark

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

  • Paper: https://arxiv.org/abs/2412.03069
  • Code: https://github.com/ByteFlow-AI/TokenFlow

2. Image Editing

Attention Distillation: A Unified Approach to Visual Characteristics Transfer

  • Paper: https://arxiv.org/abs/2502.20235
  • Code: https://github.com/xugao97/AttentionDistillation

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

  • Paper: https://arxiv.org/abs/2411.16832
  • Code: https://github.com/taco-group/FaceLock

EmoEdit: Evoking Emotions through Image Manipulation

  • Paper: https://arxiv.org/abs/2405.12661
  • Code: https://github.com/JingyuanYY/EmoEdit

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

  • Paper: https://arxiv.org/abs/2502.18461
  • Code: https://github.com/HVision-NKU/K-LoRA

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

  • Paper: https://arxiv.org/abs/2412.08503
  • Code: https://github.com/Westlake-AGI-Lab/StyleStudio

3. Video Generation / Video Synthesis

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

  • Paper: https://arxiv.org/abs/2410.06241
  • Code: https://github.com/Bujiazi/ByTheWay

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

  • Paper: https://arxiv.org/abs/2411.17440





