Memory-Augmented VLA

Research lead, 2026.02 – ongoing.

A vision-language-action (VLA) policy that stays coherent over long-horizon manipulation by carrying an explicit, bounded memory.

Architecture: hierarchical task decomposition, a sub-task boundary detector, a fixed-capacity Implicit Memory Bank, and a memory–action cross-attention interface.
Implementation: full PyTorch training and inference pipeline; multi-GPU distributed training; sim-to-real validation on a physical robot arm.
Evaluation: benchmarked on RMBench, VLABench, and LIBERO-Long against the Mem-0, MEM, MemER, and CronusVLA baselines.
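
To make the memory idea concrete, here is a minimal, framework-free sketch of a fixed-capacity bank that an action query reads through attention. It is illustrative only and is not the project's PyTorch code: the class name `MemoryBank`, the FIFO eviction rule, and the single-head scaled dot-product read are all assumptions, since the entry does not specify how slots are written, evicted, or attended over.

```python
import math
from collections import deque


class MemoryBank:
    """Hypothetical fixed-capacity store of sub-task summary vectors.

    Assumption: the oldest slot is evicted first (FIFO). The real
    bank may use a learned or similarity-based replacement policy.
    """

    def __init__(self, capacity: int):
        # deque(maxlen=...) silently drops the oldest item when full.
        self.slots = deque(maxlen=capacity)

    def write(self, vector):
        """Store one summary vector, e.g. at a detected sub-task boundary."""
        self.slots.append(list(vector))

    def read(self, query):
        """Single-head scaled dot-product attention of one query over all slots."""
        dim = len(query)
        if not self.slots:
            return [0.0] * dim  # nothing stored yet
        scale = math.sqrt(dim)
        scores = [
            sum(q * k for q, k in zip(query, slot)) / scale
            for slot in self.slots
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the stored vectors (keys double as values here).
        return [
            sum(w * slot[i] for w, slot in zip(weights, self.slots)) / total
            for i in range(dim)
        ]


bank = MemoryBank(capacity=2)
for summary in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    bank.write(summary)  # the third write evicts the first vector
context = bank.read([0.5, 0.5])  # memory context for the action head
```

The bound on capacity keeps the read cost constant however long the episode runs, which is the property the entry emphasizes with "bounded".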
