Microsoft launches vision-language-action model for robots

Microsoft has introduced Rho-alpha, a new vision-language-action model designed to make robots more adaptable, more responsive, and better able to work in real-world environments.

The tech giant unveiled the generative AI vision-language-action (VLA) model in a blog post earlier this month. The model is derived from Microsoft’s Phi series of open models.

Rho-alpha translates natural language commands into control signals for robots performing manipulation tasks.

To train the model, Microsoft said it combined physical demonstrations and simulations with a multistage reinforcement learning process built on the open source Nvidia Isaac Sim framework.

For better perception, Microsoft also added tactile sensing capabilities, allowing robots to use touch to react to their environment instead of relying solely on visual input.

In future iterations, Microsoft said it plans to add force sensing and other modalities.

A video demonstration included in the blog post shows Rho-alpha following natural language instructions to interact with BusyBox, a physical interaction benchmark recently introduced by Microsoft Research.

Microsoft’s release comes as more industries adopt robots and deployments move from narrow, task-specific settings to more dynamic, unstructured and often human-centered environments.

This shift has put the focus on models that enable robots to reason and act with greater autonomy.

In this context, Microsoft is positioning Rho-alpha as a more flexible and adaptable AI system for robots, one that opens up broader deployment opportunities than traditional models.

“The emergence of VLA models for physical systems is enabling systems to understand, reason, and act alongside humans with increasing autonomy,” Ashley Llorens, corporate vice president and managing director of Microsoft Research Accelerator, said in the blog post introducing the model.

Rho-alpha is currently being evaluated on dual-arm robotic systems and humanoid robots, and Microsoft plans to publish technical details of the model in the coming months.

The model will initially be available through an early access program, with wider availability planned through Microsoft Foundry at a later date.
