OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation, https://arxiv.org/abs/2410.11792
Humanoid Parkour Learning, https://arxiv.org/abs/2406.10759
Adapting Humanoid Locomotion over Challenging Terrain via Two-Phase Training, https://openreview.net/attachment?id=O0oK2bVist&name=pdf
Robot Learning and Planning
Theia: Distilling Diverse Vision Foundation Models for Robot Learning, https://arxiv.org/pdf/2407.20179
Body Transformer: Leveraging Robot Embodiment for Policy Learning, https://openreview.net/pdf?id=Oce2215aJE
Gameplay Filters: Robust Zero-Shot Safety through Adversarial Imagination, https://openreview.net/pdf?id=Ke5xrnBFAR
Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models, https://openreview.net/pdf?id=evCXwlCMIi
Towards Open-World Grasping with Large Vision-Language Models, https://openreview.net/pdf?id=QUzwHYJ9Hf
Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems, https://openreview.net/pdf?id=8PcRynpd1m
LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos, https://openreview.net/pdf?id=zIWu9Kmlqk
Trajectory Improvement and Reward Learning from Comparative Language Feedback, https://openreview.net/pdf?id=1tCteNSbFH
Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation, https://openreview.net/forum?id=qUSa3F79am
Learning Transparent Reward Models via Unsupervised Feature Selection, https://openreview.net/pdf?id=2sg4PY1W9d
MaIL: Improving Imitation Learning with Selective State Space Models, https://openreview.net/pdf?id=IssXUYvVTg
Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight, https://openreview.net/forum?id=bt0PX0e4rE
Autonomous Improvement of Instruction Following Skills via Foundation Models, https://openreview.net/attachment?id=8Ar8b00GJC&name=pdf
Robotic Control via Embodied Chain-of-Thought Reasoning, https://openreview.net/pdf?id=S70MgnIA0v
Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation, https://openreview.net/attachment?id=AuJnXGq3AL&name=pdf
Robotic Arms
DexCatch: Learning to Catch Arbitrary Objects with Dexterous Hands, https://arxiv.org/abs/2310.08809
General Flow as Foundation Affordance for Scalable Robot Learning, Chengbo Yuan, Chuan Wen, Tong Zhang, Yang Gao†, https://general-flow.github.io/, CoRL 2024.
Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation, Tong Zhang, Yingdong Hu, Jiacheng You, Yang Gao†, https://sgrv2-robot.github.io/, CoRL 2024.
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers, Jianke Zhang∗, Yanjiang Guo∗, Xiaoyu Chen,Yen-Jen Wang, Yucheng Hu, Chengming Shi, Jianyu Chen†, https://arxiv.org/abs/2410.05273, CoRL 2024.
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning, Zhecheng Yuan*, Tianming Wei*, Shuiqi Cheng, Gu Zhang, Yuanpei Chen, Huazhe Xu†, https://gemcollector.github.io/maniwhere/, CoRL 2024.
RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation, Chongkai Gao, Zhengrong Xue, Shuying Deng, Tianhai Liang, Siqi Yang, Lin Shao, Huazhe Xu†, https://riemann-web.github.io/, CoRL 2024.
ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter, https://arxiv.org/abs/2407.11298
ALOHA Unleashed: A Simple Recipe for Robot Dexterity, https://aloha-unleashed.github.io/assets/aloha_unleashed.pdf. Bimanual manipulation.
Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Cost Whole-Body Teleoperation, https://openreview.net/forum?id=FO6tePGRZj. Bimanual manipulation.
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands, https://openreview.net/attachment?id=4Of4UWyBXE&name=pdf. Bimanual manipulation.
DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes, https://openreview.net/attachment?id=5W0iZR9J7h&name=pdf
Navigation
Uncertainty-Aware Decision Transformer for Stochastic Driving Environments, https://arxiv.org/abs/2309.16397
InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment, https://arxiv.org/pdf/2406.04882
Context-Aware Replanning with Pre-explored Semantic Map for Object Navigation, https://openreview.net/attachment?id=Dftu4r5jHe&name=pdf
Lifelong Autonomous Fine-Tuning of Navigation Foundation Models in the Wild, https://openreview.net/attachment?id=vBj5oC60Lk&name=pdf
Embodied Perception
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding, https://arxiv.org/pdf/2410.13860. 3D scene understanding and 3D visual grounding.
GraspSplats: Efficient Manipulation with 3D Feature Splatting, https://arxiv.org/html/2409.02084
Transferable Tactile Transformers for Representation Learning Across Diverse Sensors and Tasks, https://arxiv.org/abs/2406.13640
D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation, https://openreview.net/attachment?id=7E3JAys1xO&name=pdf
LiDARGrid: Self-supervised 3D Opacity Grid from LiDAR for Scene Forecasting, https://openreview.net/attachment?id=MfuzopqVOX&name=pdf
Autonomous Driving Motion Planning
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models, https://arxiv.org/abs/2402.12289. Used for scene description, scene analysis, and hierarchical planning.
Uncertainty-Aware Decision Transformer for Stochastic Driving Environments, https://arxiv.org/abs/2309.16397. Proposes UNREST, a planning method for stochastic driving environments.
Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving , https://arxiv.org/pdf/2409.06702
Robot Manipulation
OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation, https://arxiv.org/abs/2410.11792
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own, https://arxiv.org/abs/2310.02635
General Flow as Foundation Affordance for Scalable Robot Learning, https://arxiv.org/abs/2401.11439
A Universal Semantic-Geometric Representation for Robotic Manipulation, https://arxiv.org/abs/2306.10474
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning, https://arxiv.org/abs/2407.15815
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs, https://arxiv.org/abs/2410.03645
RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation, https://arxiv.org/abs/2403.19460
RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model, https://arxiv.org/abs/2406.10157