Implementation code for several papers:
"Mitigating Object Hallucination via Concentric Causal Attention" (NeurIPS 2024) GitHub: github.com/xing0047/cca-llava
"The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio" (2024) GitHub: github.com/DAMO-NLP-SG/CMM [fig1]
"Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision" (2024) GitHub: github.com/Shengcao-Cao/groundLMM
"BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation" (2024) GitHub: github.com/UT-Austin-RobIn/BUMBLE [fig2]
"Robust Loop Closure by Textual Cues in Challenging Environments" (2024) GitHub: github.com/TongxingJin/TXTLCD
"How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold" (2024) GitHub: github.com/vsahil/MIMETIC-2 [fig3]