Code implementations for several papers:
《Generalizable and Animatable Gaussian Head Avatar》(NeurIPS 2024) GitHub: github.com/xg-chu/GAGAvatar
《QueST: Self-Supervised Skill Abstractions for Continuous Control》(NeurIPS 2024) GitHub: github.com/pairlab/QueST
《MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing》(NeurIPS 2024) GitHub: github.com/ewrfcas/MVInpainter
《DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation》(NeurIPS 2024) GitHub: github.com/WU-CVGL/DreamMesh4D [fig9]
《Pyramidal Flow Matching for Efficient Video Generative Modeling》(2024) GitHub: github.com/jy0205/Pyramid-Flow [fig1]
《Aria: An Open Multimodal Native Mixture-of-Experts Model》(2024) GitHub: github.com/rhymes-ai/Aria
《MLE-Bench: Evaluating Machine Learning Agents on Machine Learning Engineering》(2024) GitHub: github.com/openai/mle-bench
《Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think》(2024) GitHub: github.com/sihyun-yu/REPA
《SyllableLM: Learning Coarse Semantic Units for Speech Language Models》(2024) GitHub: github.com/AlanBaade/SyllableLM
《Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation》(2024) GitHub: github.com/OpenGVLab/PhyGenBench [fig2]
《SPA: 3D Spatial-Awareness Enables Effective Embodied Representation》(2024) GitHub: github.com/HaoyiZhu/SPA
《VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks》(2024) GitHub: github.com/TIGER-AI-Lab/VLM2Vec [fig3]
《RGB↔X: Image Decomposition and Synthesis Using Material- and Lighting-aware Diffusion Models》(2024) GitHub: github.com/zheng95z/rgbx [fig4]
《ACDC: Automated Creation of Digital Cousins for Robust Policy Learning》(2024) GitHub: github.com/cremebrule/digital-cousins
《Computation Cost Attack on 3D Gaussian Splatting》(2024) GitHub: github.com/jiahaolu97/poison-splat [fig5]
《Fast Feedforward 3D Gaussian Splatting Compression》(2024) GitHub: github.com/YihangChen-ee/FCGS [fig6]
《Story-Adapter: A Training-free Iterative Framework for Long Story Visualization》(2024) GitHub: github.com/jwmao1/story-adapter [fig7]
《End-to-end Piano Performance-MIDI to Score Conversion with Transformers》(2024) GitHub: github.com/TimFelixBeyer/MIDI2ScoreTransformer
《RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation》(2024) GitHub: github.com/thu-ml/RoboticsDiffusionTransformer [fig8]
《Embodied Agent Interface (EAgent): Benchmarking LLMs for Embodied Decision Making》(2024) GitHub: github.com/embodied-agent-interface/embodied-agent-interface
《UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation》(2024) GitHub: github.com/LiheYoung/UniMatch-V2
《MaskBit: Embedding-free Image Generation via Bit Tokens》(2024) GitHub: github.com/lucidrains/maskbit-pytorch
《Berkeley Humanoid: A Research Platform for Learning-based Control》(2024) GitHub: github.com/HybridRobotics/isaac_berkeley_humanoid
《Segmented Curved-Voxel Occupancy Descriptor for Dynamic-Aware LiDAR Odometry and Mapping》(2024) GitHub: github.com/Yixin-F/better_fastlio2
《GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs》(2024) GitHub: github.com/GenSim2/GenSim2
《MatMamba: A Matryoshka State Space Model》(2024) GitHub: github.com/ScaledFoundations/MatMamba [fig10]
《LightRAG: Simple and Fast Retrieval-Augmented Generation》(2024) GitHub: github.com/HKUDS/LightRAG [fig11]
《UniMuMo: Unified Text, Music and Motion Generation》(2024) GitHub: github.com/hanyangclarence/UniMuMo [fig12]