Code implementations for several papers:
《Don't Look Twice: Run-Length Tokenization for Faster Video Transformers》(NeurIPS 2024) GitHub: github.com/rccchoudhury/rlt [fig1]
《PromptFix: You Prompt and We Fix the Photo》(NeurIPS 2024) GitHub: github.com/yeates/PromptFix
《Harmonizing Visual Text Comprehension and Generation》(NeurIPS 2024) GitHub: github.com/bytedance/TextHarmony
《Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling》(ICML 2024) GitHub: github.com/DenisBless/variational_sampling_methods
《ActionVOS: Actions as Prompts for Video Object Segmentation》(CVPR 2024) GitHub: github.com/ut-vision/ActionVOS [fig7]
《Open-Vocabulary Segmentation with Semantic-Assisted Calibration》(CVPR 2024) GitHub: github.com/yongliu20/SCAN [fig10]
《LLaVA-o1: Let Vision Language Models Reason Step-by-Step》(2024) GitHub: github.com/PKU-YuanGroup/LLaVA-o1
《EasyRAG: Efficient Retrieval-Augmented Generation Framework for Automated Network Operations》(2024) GitHub: github.com/BUAADreamer/EasyRAG
《GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details》(2024) GitHub: github.com/boqian-li/GarmentDreamer
《Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding》(2024) GitHub: github.com/SalesforceAIResearch/LaTRO
《HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems》(2024) GitHub: github.com/plageon/HtmlRAG [fig2]
《You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement》(2024) GitHub: github.com/Fediory/HVI-CIDNet
《General Geospatial Inference with a Population Dynamics Foundation Model》(2024) GitHub: github.com/google-research/population-dynamics
《M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation》(2024) GitHub: github.com/OliverRensu/MVAR
《GenZ-ICP: Generalizable and Degeneracy-Robust LiDAR Odometry Using an Adaptive Weighting》(IEEE RA-L 2024) GitHub: github.com/cocel-postech/genz-icp
《A Foundation Model for Joint Segmentation, Detection, and Recognition of Biomedical Objects Across Nine Modalities》(Nature Methods 2024) GitHub: github.com/microsoft/BiomedParse [fig3]
《MARS: Unleashing the Power of Variance Reduction for Training Large Models》(arXiv 2024) GitHub: github.com/AGI-Arena/MARS
《OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented Language Models》(arXiv 2024) GitHub: github.com/AkariAsai/OpenScholar [fig4]
《BioNeMo Framework: For building and adapting AI models in drug discovery at scale》(2024) GitHub: github.com/NVIDIA/bionemo-framework
《StableV2V: Stablizing Shape Consistency in Video-to-Video Editing》(arXiv 2024) GitHub: github.com/AlonzoLeeeooo/StableV2V
《JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation》(2024) GitHub: github.com/jdh-algo/JoyVASA [fig5]
《FineTuneBench: How well do commercial fine-tuning APIs infuse knowledge into LLMs?》(2024) GitHub: github.com/kevinwu23/StanfordFineTuneBench [fig6]
《Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge》(2024) GitHub: github.com/zzwjames/FailureLLMUnlearning
《Generative Agent Simulations of 1,000 People》(2024) GitHub: github.com/joonspk-research/genagents
《CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting》(2024) GitHub: github.com/LiuXiangrui/CompGS [fig8]
《Equivariant Diffusion Policy》(2024) GitHub: github.com/pointW/equidiff [fig9]
《Revisiting BPR: A Replicability Study of a Common Recommender System Baseline》(2024) GitHub: github.com/Nemexur/revisit-bpr
《PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings》(2024) GitHub: github.com/joonaskalda/PixIT [fig11]
《The Super Weight in Large Language Models》(2024) GitHub: github.com/mengxiayu/LLMSuperWeight
《AllWeatherNet: Unified Image enhancement for autonomous driving under adverse weather and lowlight-conditions》(2024) GitHub: github.com/Jumponthemoon/AllWeatherNet [fig12]