
2026-04-09

VLA Research Daily Pulsar



VLA track · cs.RO · cs.AI · cs.LG



26 papers in total: 8 practical, 18 background

🔧 Technical

Practical VLA 2026-04-09

StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing

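StarVLA Community · Provides a modular codebase for VLA development that supports quick integration of perception and action modules; its architecture can be reused to build experimental prototypes within the week.

cs.RO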

Practical VLA [UW|Fox] 2026-04-09

RoboPlayground: Democratizing Robotic Evaluation through Structured Physical Domains

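Yi Ru Wang et al. · Proposes a structured physical-domain evaluation framework that lets users define custom task constraints and success criteria; it can be used directly to extend existing VLA benchmarks.

cs.RO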

Practical VLA 2026-04-09

Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation

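Jiahua Ma et al. · Introduces a referring-aware mechanism that makes visuomotor policies more robust to out-of-distribution errors, and provides a concrete implementation of dynamic trajectory rerouting.

cs.RO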

Practical VLA 2026-04-09

Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming

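Baoshun Tong et al. · Uses diversity-aware red teaming to expose linguistic fragility in VLAs and provides a concrete attack-generation toolkit that can be used for model safety evaluation.

cs.RO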

Practical VLA 2026-04-09

A1: A Fully Transparent Open-Source, Adaptive and Efficient Truncated Vision-Language-Action Model

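Kaidong Zhang et al. · Proposes A1, a truncated, efficient VLA architecture that significantly reduces inference cost, with open-source code; suited to quick deployment trials in resource-constrained settings.

cs.RO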

Practical VLA 2026-04-09

BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination

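Xingyu Peng et al. · Releases BiCoord, a bimanual manipulation benchmark for long-horizon spatial-temporal coordination that fills a gap in existing benchmarks on complex collaborative tasks; ready to use immediately.

cs.RO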

Practical VLA 2026-04-09

On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning

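Changyu Liu et al. · Uses test-time reinforcement learning to adapt VLAs on the fly, addressing distribution shift, and provides an online fine-tuning framework that can be integrated into existing pipelines.

cs.RO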

Practical VLA [Physical Intelligence] 2026-04-09

SnapFlow: One-Step Action Generation for Flow-Matching VLAs via Progressive Self-Distillation

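Wuyang Luan et al. · Achieves one-step action generation for flow-matching VLAs through progressive self-distillation, sharply reducing inference latency; once the code is open-sourced, it can directly replace existing decoders.

cs.AI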

📖 Background reading

Background VLA 2026-04-09

ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving

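Studies the instruction robustness of language-driven driving agents; although it touches on VLA concepts, it is limited to autonomous-driving simulation and lacks validation on general robot manipulation.

hf-papers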

Background VLA 2026-04-09

Belief Dynamics for Detecting Behavioral Shifts in Safe Collaborative Manipulation

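Devashri Naik et al. · Proposes belief-dynamics-based detection of behavioral shifts to keep collaboration safe; the method leans toward classical control theory and does not show a direct path to integration with VLA architectures.

cs.RO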

Background VLA [Jia Pan] 2026-04-09

From Video to Control: A Survey of Learning Manipulation Interfaces from Temporal Visual Data

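Linfang Zheng et al. · Surveys progress in turning video data into robot control interfaces; useful for background on the field, but offers no new algorithm or immediately reusable engineering contribution.

cs.RO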

Background VLA 2026-04-09

GaussFly: Contrastive Reinforcement Learning for Visuomotor Policies in 3D Gaussian Fields

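Yuhang Zhang et al. · Trains drone visuomotor policies with contrastive reinforcement learning in 3D Gaussian fields; simulation only and tied to a specific aerial vehicle, so transfer to general manipulation is unclear.

cs.RO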

Background VLA 2026-04-09

ExpressMM: Expressive Mobile Manipulation Behaviors in Human-Robot Interactions

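Souren Pashangpour et al. · Studies expressive behaviors that let mobile manipulators communicate intent; the focus is on the human-robot interaction experience, with limited improvement to core VLA control architectures.

cs.RO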

Background VLA 2026-04-09

CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment

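Li Kang et al. · Drives multi-agent collaboration through compositional environments, mainly addressing spatial coordination; lacks training or inference innovations specific to VLA models.

cs.RO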

Background VLA 2026-04-09

Grounding Hierarchical Vision-Language-Action Models Through Explicit Language-Action Alignment

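Theodor Wulff et al. · Improves the transparency of hierarchical VLAs through explicit language-action alignment; mainly a theoretical contribution, without large-scale experiments validating improved generalization.

cs.RO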

Background VLA 2026-04-09

Rectified Schrödinger Bridge Matching for Few-Step Visual Navigation

Wuyang Luan et al. · arXiv:2604.05673v1 · Visual navigation is a core challenge in Embodied AI, requiring autonomous agents to translate high-dimensional sensory observations into continuous, long-horizon action trajectories. While generative policies based on diffusion models and Schrödinger Bridges (SB) effectively capture multimodal action distributions, they require dozens of integration steps due to high-variance stochastic transport, posing a critical barrier for real-time robotic

cs.RO

Background VLA 2026-04-09

GraspSense: Physically Grounded Grasp and Grip Planning for a Dexterous Robotic Hand via Language-Guided Perception and Force Maps

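Elizaveta Semenyakina et al. · Combines language guidance with force maps for dexterous-hand grasp planning, focusing on physical contact strategy; does not show an end-to-end VLA training pipeline.

cs.RO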

Background VLA 2026-04-09

HiPolicy: Hierarchical Multi-Frequency Action Chunking for Policy Learning

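Jiyao Zhang et al. · Proposes hierarchical multi-frequency action chunking to balance long-horizon dependencies with fine-grained control; the method is sound, but experiments cover only simple simulated tasks, with no real-robot validation.

cs.RO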

Background VLA 2026-04-09

VLA-InfoEntropy: A Training-Free Vision-Attention Information Entropy Approach for Vision-Language-Action Models Inference Acceleration and Success

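Chuhang Liu et al. · Uses visual-attention entropy to accelerate VLA inference without training; a clever idea, but the size of the performance gain is unknown and further experiments are needed to confirm its effectiveness.

cs.RO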

Background VLA 2026-04-09

STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization

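Hao Li et al. · Learns diverse skill abstractions via rotation-augmented vector quantization, an improvement on VQ-VAE, but does not show a significant advantage on complex VLA tasks.

cs.RO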

Background VLA [UMD|Manocha] 2026-04-09

SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models

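Xiyang Wu et al. · Proposes SABER, a black-box attack framework for VLAs that exposes vulnerabilities in the instruction channel; aimed mainly at safety research rather than improving model performance.

cs.RO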

Background VLA [Han Zhao] 2026-04-09

Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance

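Wenxuan Song et al. · Accelerates discrete diffusion VLAs to real-time performance; mainly engineering optimization, and the abstract does not state the specific speedup or how well generalization is preserved.

cs.RO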

Background VLA [Han Zhao] 2026-04-09

DFM-VLA: Iterative Action Refinement for Robot Manipulation via Discrete Flow Matching

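Jiayi Chen et al. · Refines actions iteratively with discrete flow matching; better than sequential decoding in theory, but lacks a full experimental comparison with current SOTA diffusion policies.

cs.RO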

Background VLA 2026-04-09

GeoPredict: Leveraging Predictive Kinematics and 3D Gaussian Geometry for Precise VLA Manipulation

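Jingjing Qian et al. · Combines predictive kinematics with 3D Gaussian geometry to improve VLA precision and address 2D-centric limitations, but experimental details are too sparse to judge its practical effectiveness.

cs.RO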

Background VLA 2026-04-09

RL-VLA$^3$: A Flexible and Asynchronous Reinforcement Learning Framework for VLA Training

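Haoran Sun et al. · Proposes a flexible, asynchronous RL framework for VLA training; the architecture is sound, but it lacks data showing performance gains on specific tasks.

cs.AI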

Background VLA 2026-04-09

Gaze-Regularized Vision-Language-Action Models for Robotic Manipulation

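Anupam Pani et al. · Regularizes VLAs with human gaze to improve fine-grained manipulation; an interesting bio-inspired idea, but lacks large-scale dataset validation of its generality.

cs.CV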
