
DAILY INTELLIGENCE · AI APP + VLA

Research Daily

A daily digest of AI app picks and VLA paper ratings. The signal overview below summarizes the last 7 days; the archive index can be filtered by field.

Last 7 days summary 05-30 → 06-10
40 AI picks
167 VLA papers
⚡ 0 breakthroughs
🔧 77 tools/techniques
📖 90 background/opinion
Last 3 days of content 06-08 → 06-10
2026-06-10 Latest 38
Anthropic releases Claude Fable 5 + Mythos 5: its strongest models arrive with a "silent downgrade" controversy [AI]
Google Gemini 3.5 Live Translate: real-time speech translation model for 70+ languages [AI]
Google Gemma 4 12B: encoder-free multimodal model that runs in 16GB VRAM [AI]
WeChat AI Agent officially opens for integration: orchestrating mini programs across a 1.4-billion-MAU ecosystem [AI]
Karpathy: the Jevons paradox takes effect in the AI coding era [AI]
Grit: rewriting Git from scratch in Rust with an Agent Swarm [AI]
OpenAI confidentially files IPO paperwork, targeting a $1 trillion valuation [AI]
HF Spaces agents.md: an Agent combines two Spaces to build a 3D Paris gallery [AI]
VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation [VLA]
Q-VGM: Q-Guided Value-Gradient Matching for Flow-Matching VLA Policies [VLA]
EgoAERO: Learning Dexterous Manipulation from a Single Egocentric Video without Object Assets [VLA]
vla.cpp: A Unified Inference Runtime for Vision-Language-Action Models [VLA]
Revisiting Articulated Parts Perception in Robot Manipulation [VLA]
Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data [VLA]
CLASP: Language-Driven Robot Skill Selection and Composition using Task-Parameterized Learning [VLA]
SIMPLE: Simulation-Based Policy Learning and Evaluation for Humanoid Loco-manipulation [VLA]
MotionVLA: Injecting Geometric Motion into Vision-Language-Action Model [VLA]
PACT: Self-Evolving Physical Safety Alignment for Diffusion Policies in Embodied Manipulation [VLA]
GraspFoM: Towards Reconstruction-Driven Robotic Grasping with 3D Foundation Priors [VLA]
EgoPriMo: Egocentric Motion Generation for Interactive Humanoid Control [VLA]
Two Bridges, One Pathway: From VLMs to Generalizable VLAs with Embodied Trajectory-Coupled Data [VLA]
GEAR-VLA: Learning Geometry-Aware Action Representations for Generalizable Robotic Manipulation [VLA]
OASIS: From Simulation Data Collection to Real-World Humanoid Loco-Manipulation [VLA]
FAWAM: Force-Aware World Action Models for Closed-Loop Contact-Rich Manipulation [VLA]
Real-IKEA: Physical Fidelity is the Prerequisite for Robust Manipulation [VLA]
HARBOR: A Harness Framework for Agentic Robot Reinforcement Learning [VLA]
Latent Diffusion Policy: Shaping Latent Spaces for Diffusion-Based Robotic Manipulation [VLA]
Language as a Sensor: Calibrated Spatial Belief Estimation in 3D Scenes from Natural Language [VLA]
IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking [VLA]
Dream-Tac: A Unified Tactile World Action Model for Contact-Rich Robot Manipulation [VLA]
Guided Discovery of New Behaviors using Diffusion Policies [VLA]
Unifying Object-Centric World Models and Diffusion Policy: A Hierarchical Framework for Multi-Stage Robotic Tasks [VLA]
Video2Sim2Real: Full-Stack Autonomous Dexterous Skill Acquisition from a Single Human Video [VLA]
Benchmarking Vision-Language-Action Models on SO-101: Failure and Recovery Analysis [VLA]
MotionWAM: Towards Foundation World Action Models for Real-Time Humanoid Loco-Manipulation [VLA]
Back to the Familiar Future: Failure Recovery for VLA Policies via Pre-Imagined Milestone Selection [VLA]
TORL-VLA: Tactile Guided Online Reinforcement Learning for Contact-Rich Manipulation [VLA]
ReGIL: Retrieval-Guided Imitation Learning from a Single Demonstration [VLA]
2026-06-09 Yesterday 34
OpenAI confidentially submits a draft S-1, a clear IPO signal [AI]
ChatGPT's biggest redesign ever: from chat box to Agent platform [AI]
WWDC26: Apple Intelligence fully rebuilt, Siri AI + Gemini partnership [AI]
Apple open-sources Core AI PyTorch Extensions [AI]
OpenAI publishes its "Built to benefit everyone" strategic vision [AI]
AWS Cross-Region Inference guide for Europe [AI]
Robots Need More than VLA and World Models [VLA]
PhyRoGen: Synthetic Generation of Physical Robot Manipulation Puzzles Using Procedural Content Generation [VLA]
What Matters When Cotraining Robot Manipulation Policies on Everyday Human Videos? [VLA]
AxisGuide: Grounding Robot Action Coordinate System in RGB Observations for Robust Visuomotor Manipulation [VLA]
ActionMap: Robot Policy Learning via Voxel Action Heatmap [VLA]
Task Editing for Generalizable 3D Visuomotor Policy Learning [VLA]
Coarse-to-Control: Action-Token Planning for Vision-Language-Action Models [VLA]
QuadVerse: An Integrated Framework Aligning Visual-Physical Reality for Quadruped Simulation [VLA]
Robotic Policy Adaptation via Weight-Space Meta-Learning [VLA]
RhinoVLA Technical Report [VLA]
Spline Policy: A Structured Representation for Robot Policies [VLA]
AEGIS: A Backup Reflex for Physical AI [VLA]
LARA: Latent Action Representation Alignment for Vision-Language-Action Models [VLA]
Chameleon: Control-Indexed Prospective Memory for Visuomotor Manipulation [VLA]
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning [VLA]
GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation [VLA]
CHDP: Cooperative Hybrid Diffusion Policies for Reinforcement Learning in Parameterized Action Space [VLA]
ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models [VLA]
GenPO++: Generative Policy Optimization with Jacobian-free Likelihood Ratios [VLA]
Chunking the Critic: A Transformer-based Soft Actor-Critic with N-Step Returns [VLA]
STRIPS-WM: Learning Grounded Propositional STRIPS-style World Models from Images [VLA]
Where to Touch, How to Contact: Hierarchical RL-MPC Framework for Geometry-Aware Long-Horizon Dexterous Manipulation [VLA]
SERNF: Sample-Efficient Real-World Dexterous Policy Fine-Tuning via Action-Chunked Critics and Normalizing Flows [VLA]
Latent Geometry Beyond Search: Amortizing Planning in World Models [VLA]
Expanding Spatial and Temporal Context for Robotic Imitation Learning With Scene Graphs [VLA]
The Sim-to-Real Gap of Foundation Model Agents: A Unified MDP Perspective [VLA]
Audio-Visual World Models: Grounding Multisensory Imagination for Embodied Agents [VLA]
Bootstrap Theory of Representational Emergence: Explanatory Insufficiency as a Driver of Representation Learning and World Models [VLA]

Archive index

80 issues
-- June 2026 --
AI Latest
Anthropic releases Claude Fable 5 + Mythos 5: its strongest models arrive with a "silent downgrade" controversy
8 items
VLA Latest
🔧 14 📖 16 VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation
30 papers
AI Yesterday
OpenAI confidentially submits a draft S-1, a clear IPO signal
6 items
VLA Yesterday
🔧 10 📖 18 Robots Need More than VLA and World Models
28 papers
AI 2 days ago
LLMs are eroding my software engineering career
4 items
VLA
🔧 2 Let It Be Simple: One-Step Action Generation for Vision-Language-Action Models
2 papers
AI
Anthropic Institute publishes research on "recursive self-improvement"
5 items
VLA
🔧 11 📖 13 VISTA: Vision-Grounded and Physics-Validated Adaptation of UMI data for VLA Training
24 papers
VLA
🔧 12 📖 15 See Less, Specify More: Visual Evidence Budgets for Generalizable VLAs
27 papers
AI
Anthropic confidentially files an S-1 for its IPO, valuation approaching $1 trillion
6 items
VLA
🔧 10 📖 19 World-Task Factorization for Robot Learning
29 papers
VLA
🔧 18 📖 9 ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation
27 papers
AI
AI subscription fatigue: "I never meant to build these 16 projects"
5 items
-- May 2026 --
AI
Anthropic closes a $65 billion Series H round; at a $965 billion valuation it overtakes OpenAI for the first time
6 items
VLA
⚡ 1 🔧 8 📖 20 GEM: Generative Supervision Helps Embodied Intelligence
29 papers
AI
Anthropic Series H raises $65 billion at a $965 billion valuation
6 items
VLA
🔧 13 📖 13 GEM: Generative Supervision Helps Embodied Intelligence
26 papers
AI
SQLite adds AGENTS.md: explicitly rejects agentic code but accepts agentic bug reports
6 items
VLA
🔧 9 📖 17 PhyPush: One Push is All You Need for Sensorless Physical Property Estimation with Physics-Guided Transformers
26 papers
AI
Microsoft Copilot Cowork reportedly hit by a data leak: another alarm bell for Agent system security
6 items
VLA
⚡ 1 🔧 12 📖 17 EXPO-FT: Sample-Efficient Reinforcement Learning Finetuning for Vision-Language-Action Models
30 papers
AI
Claude Memory Files + Dreams + Conway: Anthropic overhauls its memory architecture
5 items
VLA
🔧 10 📖 11 Agentic-VLA: Efficient Online Adaptation for Vision-Language-Action Models
21 papers
AI
DeepSeek Reasonix: a DeepSeek-native coding Agent with heavy caching + low cost
6 items
AI
NVIDIA Nemotron-Labs Diffusion LM: parallel generation + iterative refinement to break the autoregressive speed bottleneck
5 items
VLA
📖 1 Bioinspired ionic thermoreceptors with anisotropic architecture for thermotactile perception in robots
1 paper
AI
Microsoft halts Claude Code + Uber burns through its full-year AI budget
7 items
VLA
🔧 12 📖 11 Learning Structural Latent Points for Efficient Visual Representations in Robotic Manipulation
23 papers
AI
Datasette Agent debuts: a pluggable AI data assistant
5 items
VLA
⚡ 1 🔧 15 📖 26 Learning Structural Latent Points for Efficient Visual Representations in Robotic Manipulation
42 papers
AI
OpenAI model autonomously proves an 80-year-old math conjecture: the Erdős planar unit distance problem
7 items
VLA
🔧 9 📖 25 Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR
34 papers
AI
Google releases Gemini 3.5 Flash: beats GPT-5.5 on agent benchmarks, price up 3x
6 items
VLA
⚡ 1 🔧 9 📖 18 Key-Gram: Extensible World Knowledge for Embodied Manipulation
28 papers
AI
Anthropic acquires Stainless: SDK/MCP toolchain brought fully in-house
6 items
VLA
🔧 11 📖 20 PhysBrain 1.0 Technical Report
31 papers
AI
OpenClaw founder's 3-person team burns $1.3 million a month running 100 Codex Agents
6 items
AI
OpenAI's major pre-IPO reorganization: Brockman takes full control, ChatGPT/Codex/API lines merged into one
5 items
AI
GPT-5.6 revealed + the Codex vs Claude Code subsidy war breaks out in full
6 items
VLA
🔧 10 📖 23 SECOND-Grasp: Semantic Contact-guided Dexterous Grasping
33 papers
AI
Anthropic "Code w/ Claude" conference: Claude Platform multi-agent orchestration + Claude Code asynchronous Routines
7 items
VLA
🔧 11 📖 17 StereoPolicy: Improving Robotic Manipulation Policies via Stereo Perception
28 papers
AI
Google Gemini Omni leaked: an omni-modal video generation model that derives formulas on a blackboard with no errors
4 items
VLA
🔧 12 📖 15 StereoPolicy: Improving Robotic Manipulation Policies via Stereo Perception
27 papers
AI
Google finds hackers using AI to develop zero-day exploit tools for the first time
5 items
VLA
⚡ 1 🔧 14 📖 15 BioProVLA-Agent: An Affordable, Protocol-Driven, Vision-Enhanced VLA-Enabled Embodied Multi-Agent System with Closed-Loop-Capable Reasoning for Biological Laboratory Manipulation
30 papers
AI
OpenAI Symphony: an autonomous implementation-run framework that lets teams manage work rather than supervise Agents
5 items
VLA
🔧 10 📖 17 VLA-GSE: Boosting Parameter-Efficient Fine-Tuning in VLA with Generalized and Specialized Experts
27 papers
VLA
⚡ 1 🔧 7 📖 11 From Pixels to Tokens: A Systematic Study of Latent Action Supervision for Vision-Language-Action Models
19 papers
VLA
🔧 11 📖 13 RLDX-1 Technical Report
24 papers
VLA
⚡ 1 🔧 11 📖 17 Online Safety Filter for Deformable Object Manipulation with Horizon Agnostic Neural Operators
29 papers
VLA
🔧 9 📖 9 Being-H0.7: A Latent World-Action Model from Egocentric Videos
18 papers
VLA
📖 1 Continuum tactile sensing via an amplified liquid metal interface
1 paper
VLA
📖 1 Graph World Models: Concepts, Taxonomy, and Future Directions
1 paper
VLA
🔧 9 📖 15 World2Minecraft: Occupancy-Driven Simulated Scenes Construction
24 papers
VLA
🔧 8 📖 10 Demonstrate once, execute on many: Kinematic intelligence for cross-robot skill transfer
18 papers
-- April 2026 --
VLA
🔧 13 📖 19 Demonstrate once, execute on many: Kinematic intelligence for cross-robot skill transfer
32 papers
VLA
🔧 12 📖 13 AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents
25 papers
VLA
🔧 6 📖 8 CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models
14 papers
VLA
📖 2 Closed-loop tactile-visual interactivity via chip-free luminescent fibers enabled by capacitive coupling
2 papers
VLA
🔧 1 From Noise to Intent: Anchoring Generative VLA Policies with Residual Bridges
1 paper
VLA
🔧 12 📖 16 From embodied intelligence to physical AI
28 papers
VLA
🔧 15 📖 24 Object-centric task representation and transfer using diffused orientation fields
39 papers
VLA
🔧 6 📖 14 FASTER: Value-Guided Sampling for Fast RL
20 papers
VLA
🔧 21 📖 8 Demonstrate once, execute on many: Kinematic intelligence for cross-robot skill transfer
29 papers
VLA
⚡ 2 🔧 4 📖 9 Model-Based Reinforcement Learning Exploits Passive Body Dynamics for High-Performance Biped Robot Locomotion
15 papers
VLA
⚡ 1 🔧 7 📖 12 Jump-Start Reinforcement Learning with Vision-Language-Action Regularization
20 papers
VLA
⚡ 1 🔧 5 📖 11 Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting
17 papers
VLA
🔧 5 📖 12 AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly
17 papers
VLA
📖 2 LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation
2 papers
VLA
🔧 5 📖 23 LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation
28 papers
VLA
⚡ 1 🔧 5 📖 14 RichMap: A Reachability Map Balancing Precision, Efficiency, and Flexibility for Rich Robot Manipulation Tasks
20 papers
VLA
🔧 8 📖 18 ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving
26 papers
VLA
🔧 6 📖 31 Diffusion Policy with Bayesian Expert Selection for Active Multi-Target Tracking
37 papers
VLA
🔧 3 📖 12 F2F-AP: Flow-to-Future Asynchronous Policy for Real-time Dynamic Manipulation
15 papers
VLA
🔧 6 📖 19 Causal Scene Narration with Runtime Safety Supervision for Vision-Language-Action Driving
25 papers
VLA
🔧 5 📖 5 Multi-Camera View Scaling for Data-Efficient Robot Imitation Learning
10 papers
VLA
🔧 2 📖 6 Object Affordance Recognition and Grounding via Multi-scale Cross-modal Representation Learning
8 papers
VLA
🔧 5 📖 23 ViPRA: Video Prediction for Robot Actions
28 papers
-- March 2026 --
VLA
🔧 3 📖 14 MPDiT: Multi-Patch Global-to-Local Transformer Architecture For Efficient Flow Matching and Diffusion Model
17 papers