Current end-to-end robotic policies, especially vision-language-action (VLA) models, typically work on single observations or very brief histories. This ‘memory deficit’ makes long-term tasks, such as cleaning the kitchen or following …
Tag:
