Implementation code for several papers:
《DiffGS: Functional Gaussian Splatting Diffusion》(NeurIPS 2024) GitHub: github.com/weiqi-zhang/DiffGS [fig2]
《CoMo: Controllable Motion Generation through Language Guided Pose Code Editing》(ECCV 2024) GitHub: github.com/yh2371/CoMo
《Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation》(ECCV 2024) GitHub: github.com/jiaosiyu1999/MAFT-Plus [fig9]
《The Scene Language: Representing Scenes with Programs, Words, and Embeddings》(2024) GitHub: github.com/zzyunzhi/scene-language [fig1]
《HIL-SERL: Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning》(2024) GitHub: github.com/rail-berkeley/hil-serl
《InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write》(2024) GitHub: github.com/google-research/inksight [fig3]
《LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior》(2024) GitHub: github.com/hywang66/LARP [fig4]
《MAC-VO: Metrics-aware Covariance for Learning-based Stereo Visual Odometry》(2024) GitHub: github.com/MAC-VO/MAC-VO
《VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models》(2024) GitHub: github.com/lisadunlap/VibeCheck
《Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization》(2024) GitHub: github.com/apple/ml-hypercloning [fig5]
《CDChat: A Large Multimodal Model for Remote Sensing Change Description》(2024) GitHub: github.com/techmn/cdchat
《MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models》(2024) GitHub: github.com/Lillianwei-h/MMIE [fig6]
《OmniBench: Towards The Future of Universal Omni-Language Models》(2024) GitHub: github.com/multimodal-art-projection/OmniBench
《EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control》(2024) GitHub: github.com/tonychenxyz/emoknob [fig7]
《TSELM: Target Speaker Extraction using Discrete Tokens and Language Models》(2024) GitHub: github.com/Beilong-Tang/TSELM
《Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models》(2024) GitHub: github.com/Sanoojan/REFace
《CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution》(2024) GitHub: github.com/open-compass/CompassJudger
《mDPO: Conditional Preference Optimization for Multimodal Large Language Models》(EMNLP 2024) GitHub: github.com/luka-group/mDPO [fig8]
《Easy and Precise Segmentation-Guided Diffusion Models》(2024) GitHub: github.com/mazurowski-lab/segmentation-guided-diffusion
《AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation》(2024) GitHub: github.com/UCSC-VLAA/AttnGCG-attack
《Reactzyme: A Benchmark for Enzyme-Reaction Prediction》(2024) GitHub: github.com/WillHua127/reactzyme