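Column: 爱可可-爱生活
Well-known internet news blogger; Chen, a faculty member at the PRIS Pattern Recognition Lab, Beijing University of Posts and Telecommunications (BUPT)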

Code implementations for several papers: 《Long-Form Speech Gener-20241227141946

爱可可-爱生活  · Weibo  · AI  · 2024-12-27 14:19

Code implementations for several papers:
《Long-Form Speech Generation with Spoken Language Models》(2024) GitHub: github.com/google-deepmind/librispeech-long
《DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs》(2024) GitHub: github.com/MengLcool/SliMM
《DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT》(2024) GitHub: github.com/YvanYin/DrivingWorld [fig1]
《WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents》(2024) GitHub: github.com/elated-sawyer/WALL-E [fig2]
《VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks》(2024) GitHub: github.com/OpenMOSS/VLABench
《MINIMA: Modality Invariant Image Matching》(2024) GitHub: github.com/LSXI7/MINIMA [fig3]
《DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery》(2024) GitHub: github.com/DroneSplat/anonymous_code
《DriveMM: All-in-One Large Multimodal Model for Autonomous Driving》(2024) GitHub: github.com/zhijian11/DriveMM [fig4]
《Dense-Face: Personalized Face Generation Model via Dense Annotation Prediction》(2024) GitHub: github.com/CHELSEA234/Dense-Face
《Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models》(2024) GitHub: github.com/KbsdJames/omni-math-rule
《GraphAgent: Agentic Graph Language Assistant》(2024) GitHub: github.com/HKUDS/GraphAgent
《Sound bubbles on hearables》(2024) GitHub: github.com/chentuochao/Sound_Bubble
《ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights》(2024) GitHub: github.com/Gabesarch/ICAL