Implementation code for several papers (column article by 爱可可-爱生活, WeChat article, 20241122205630)

Implementation code for several papers:
《ReFIR: Grounding Large Restoration Models with Retrieval Augmentation》(NeurIPS 2024) GitHub: github.com/csguoh/ReFIR [fig7]
《Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery》(NeurIPS 2024) GitHub: github.com/agopal42/syncx [fig9]
《TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control》(EMNLP 2024) GitHub: github.com/AaronZ345/TCSinger
《Real-time 3D-aware Portrait Video Relighting》(CVPR 2024) GitHub: github.com/GhostCai/PortraitRelighting
《Benchmarking Agentic LLM and VLM Reasoning On Games》(2024) GitHub: github.com/balrog-ai/BALROG
《OASIS: Open Agents Social Interaction Simulations on One Million Agents》(2024) GitHub: github.com/camel-ai/oasis [fig1]
《Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography》(2024) GitHub: github.com/batmanlab/Mammo-CLIP
《Find Any Part in 3D》(2024) GitHub: github.com/ziqi-ma/Find3D [fig2]
《REDUCIO! Generating 1024*1024 Video within 16 Seconds using Extremely Compressed Motion Latents》(2024) GitHub: github.com/microsoft/Reducio-VAE [fig3]
《Do Music Generation Models Encode Music Theory?》(2024) GitHub: github.com/brown-palm/syntheory
《EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation》(2024) GitHub: github.com/antgroup/echomimic_v2
《When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training》(2024) GitHub: github.com/haonan3/AnchorContext [fig4]
《OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented Language Models》(2024) GitHub: github.com/AkariAsai/ScholarQABench
《Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning》(2024) GitHub: github.com/ZebangCheng/Emotion-LLaMA [fig5]
《Distill Visual Chart Reasoning Ability from LLMs to MLLMs》(2024) GitHub: github.com/hewei2001/ReachQA [fig6]
《Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension》(2024) GitHub: github.com/Leon1207/Video-RAG-master [fig8]
《Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos》(2024) GitHub: github.com/Vinoground/Vinoground
《Identity Preserving 3D Head Stylization with Multiview Score Distillation》(2024) GitHub: github.com/three-bee/3d_head_stylization
《WhisperNER: Unified Open Named Entity and Speech Recognition》(2024) GitHub: github.com/aiola-lab/whisper-ner
《TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features》(2024) GitHub: github.com/yandex-research/tabgraphs
《Teaching VLMs to Localize Specific Objects from In-context Examples》(2024) GitHub: github.com/SivanDoveh/IPLoc
《V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion》(2024) GitHub: github.com/ylwhxht/V2X-R
《Disentangling Memory and Reasoning Ability in Large Language Models》(2024) GitHub: github.com/MingyuJ666/Disentangling-Memory-and-Reasoning

2024-11-22 20:56