Code implementations for several papers:
《Generalizable and Animatable Gaussian Head Avatar》(NeurIPS 2024) GitHub: github.com/xg-chu/GAGAvatar_track
《PromptFix: You Prompt and We Fix the Photo》(NeurIPS 2024) GitHub: github.com/yeates/PromptFix
《DeMo: Decoupling Motion Forecasting into Directional Intentions and Dynamic States》(NeurIPS 2024) GitHub: github.com/fudan-zvg/DeMo [fig12]
《PPFlow: Target-Aware Peptide Design with Torsional Flow Matching》(ICML 2024) GitHub: github.com/EDAPINENUT/ppflow
《Text-Guided 3D Face Synthesis - From Generation to Editing》(CVPR 2024) GitHub: github.com/JiejiangWu/FaceG2E
《SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design》(CVPR 2024) GitHub: github.com/ysj9909/SHViT
《ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers》(CVPR 2024) GitHub: github.com/ViewFormerOcc/ViewFormer-Occ [fig10]
《LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding》(ACL 2024) GitHub: github.com/facebookresearch/LayerSkip
《LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models》(2024) GitHub: github.com/opendatalab/LOKI
《BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities》(2024) GitHub: github.com/haoosz/BiGR [fig1]
《Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models》(2024) GitHub: github.com/ZqlwMatt/Tex4D [fig2]
《D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement》(2024) GitHub: github.com/Peterande/D-FINE [fig3]
《HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction》(2024) GitHub: github.com/Open3DVLab/HiSplat
《Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design》(2024) GitHub: github.com/ChenyuWang-Monica/DRAKES [fig4]
《DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception》(2024) GitHub: github.com/opendatalab/DocLayout-YOLO [fig5]
《A Comparative Study on Reasoning Patterns of OpenAI's o1 Model》(2024) GitHub: github.com/Open-Source-O1/o1_Reasoning_Patterns_Study
《EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation》(2024) GitHub: github.com/echonet/EchoPrime
《VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding》(2024) GitHub: github.com/OpenRobotLab/VLM-Grounder [fig6]
《DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion》(2024) GitHub: github.com/leoShen917/DreamMover
《Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction》(2024) GitHub: github.com/yanzq95/DHD [fig7]
《Depth Any Video with Scalable Synthetic Data》(2024) GitHub: github.com/Nightmare-n/DepthAnyVideo [fig8]
《MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models》(2024) GitHub: github.com/richard-peng-xia/MMed-RAG [fig9]
《I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers via Projected Flow》(2024) GitHub: github.com/PRIS-CV/I-Max
《FlipAttack: Jailbreak LLMs via Flipping》(2024) GitHub: github.com/yueliu1999/FlipAttack
《SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting》(2024) GitHub: github.com/liwrui/SceneDreamer360 [fig11]
《SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree》(2024) GitHub: github.com/Mark12Ding/SAM2Long
《MagicClay: Sculpting Meshes With Generative Neural Fields》(2024) GitHub: github.com/amirbarda/MagicClay
《DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing》(2024) GitHub: github.com/choi403/DiffusionGuard
《Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models》(2024) GitHub: github.com/RManLuo/graph-constrained-reasoning
《SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models》(2024) GitHub: github.com/Gunale0926/SORSA