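Column: 自动驾驶之心

A community for autonomous driving developers, covering computer vision, multi-dimensional perception fusion, deployment, localization, planning and control, and domain solutions, and committed to sharing the field's latest technical directions.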

CVPR 2025 Image/Video/3D Generation Paper Roundup (with Papers and Code)

自动驾驶之心 · WeChat Official Account · 2025-03-06 07:30

Author | Kobay  Editor | 自动驾驶之心

Original link: https://zhuanlan.zhihu.com/p/27979298565

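This article is shared for academic purposes only. If it infringes any rights, please contact us and it will be removed.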

Awesome-CVPR2025-AIGC

A Collection of Papers and Codes for CVPR2025 AIGC

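This is a collection of the AIGC-related papers and code from CVPR 2025, listed below.

The latest revisions are posted to GitHub first. Stars, forks, and PRs are welcome.

Anyone interested in AIGC-related tasks is also welcome to help keep the list up to date.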

github.com/Kobaayyy/Awesome-CVPR2025-CVPR2024-ECCV2024-AIGC/blob/main/CVPR2025.md

Paper acceptance announcement date: February 27, 2025

【Contents】

Image Generation / Image Synthesis
Image Editing
Video Generation / Video Synthesis
Video Editing
3D Generation / 3D Synthesis
3D Editing
Multi-Modal Large Language Model
Others

1. Image Generation / Image Synthesis

Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Paper: https://arxiv.org/abs/2411.17787
Code: https://github.com/czg1225/CoDe

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Paper: https://arxiv.org/abs/2408.16266
Code: https://github.com/scuwyh2000/Diff-II

Parallelized Autoregressive Visual Generation

Paper: https://arxiv.org/abs/2412.15119
Code: https://github.com/Epiphqny/PAR

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

Paper: https://arxiv.org/abs/2412.03177
Code: https://github.com/hqhQAQ/PatchDPO

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper: https://arxiv.org/abs/2501.01423
Code: https://github.com/hustvl/LightningDiT

Rectified Diffusion Guidance for Conditional Generation

Paper: https://arxiv.org/abs/2410.18737
Code: https://github.com/thuxmf/recfg

SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

Paper: https://arxiv.org/abs/2403.09055
Code: https://github.com/ironjr/semantic-draw

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

Paper: https://arxiv.org/abs/2412.04852
Code: https://github.com/taco-group/SleeperMark

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper: https://arxiv.org/abs/2412.03069
Code: https://github.com/ByteFlow-AI/TokenFlow

2. Image Editing

Attention Distillation: A Unified Approach to Visual Characteristics Transfer

Paper: https://arxiv.org/abs/2502.20235
Code: https://github.com/xugao97/AttentionDistillation

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Paper: https://arxiv.org/abs/2411.16832
Code: https://github.com/taco-group/FaceLock

EmoEdit: Evoking Emotions through Image Manipulation

Paper: https://arxiv.org/abs/2405.12661
Code: https://github.com/JingyuanYY/EmoEdit

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Paper: https://arxiv.org/abs/2502.18461
Code: https://github.com/HVision-NKU/K-LoRA

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Paper: https://arxiv.org/abs/2412.08503
Code: https://github.com/Westlake-AGI-Lab/StyleStudio

3. Video Generation / Video Synthesis

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

Paper: https://arxiv.org/abs/2410.06241
Code: https://github.com/Bujiazi/ByTheWay

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Paper: https://arxiv.org/abs/2411.17440
