MIT researchers have created a “speech-to-reality” system that uses AI and robotics to let users verbally request an object, which is then ready within minutes.
The team said the project brings generative AI and robotics one step closer to “on-demand physical fabrication”, with systems so far making furniture such as stools, shelves, chairs and a small table.
In tests, the system also created more complex, decorative pieces, such as a dog statue.
The system works by combining speech recognition, a large language model, 3D generative AI, geometric processing and robotic assembly.
When a user says, for example, “I want a simple stool,” the AI system generates a 3D mesh, converts it into modular components and creates an assembly plan. A robotic arm is then guided to create the final object from a set of lightweight, stackable cubes.
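The assembly-planning step described above can be illustrated with a short sketch. The code below is not the MIT team's implementation; it assumes the generated mesh has already been discretized into unit-cube coordinates, and shows one simple way to order the cubes bottom-up so that each cube is placed either on the floor or on top of an already-placed cube, as a robotic arm would require.

```python
# Illustrative sketch only: orders voxelized cubes into a
# gravity-respecting assembly sequence for a robotic arm.

def assembly_order(voxels):
    """Return a placement order in which every cube rests on the
    floor (z == 0) or on a cube placed earlier in the sequence."""
    placed = set()
    order = []
    remaining = set(voxels)
    while remaining:
        progress = False
        # Try the lowest cubes first so supports exist before the
        # cubes that sit on them.
        for cube in sorted(remaining, key=lambda c: (c[2], c[0], c[1])):
            x, y, z = cube
            if z == 0 or (x, y, z - 1) in placed:
                order.append(cube)
                placed.add(cube)
                remaining.discard(cube)
                progress = True
        if not progress:
            raise ValueError("unsupported overhang: no valid assembly order")
    return order

# A tiny 2x1x2 block of four cubes, e.g. part of a stool leg.
plan = assembly_order({(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)})
```

In this toy plan, the two floor-level cubes come first and the two upper cubes follow, which is the property a collision-free, gravity-aware build sequence needs.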
In a paper describing the system, the team also said the model helps make design and manufacturing accessible to those without experience in 3D modeling or robotic programming.
“We see this system as a step toward making it possible for non-experts to move from natural language descriptions to tangible physical objects by combining 3D generative AI with robotic assembly,” Alexander Htet Kyaw, an MIT graduate student and fellow at the Morningside Academy for Design, told AI Business.
“We exclusively use modular components so that each design proposed by the system can be assembled, disassembled, and reassembled for each user prompt,” he said.
The model is also presented as addressing the existing gap in 3D modeling between digital meshes and manufacturable structures.
Currently, AI-generated meshes are not suitable for robotic assembly because they do not take manufacturing constraints into account. The team's new system modifies AI-generated designs for factors such as component count, overhang, and connectivity to ensure viable physical assembly.
“One of the biggest challenges was making sure the system respected manufacturing constraints,” Kyaw said. “The system had to guarantee that objects could be assembled without collisions, would stand up under gravity and could be taken apart and reused. We had to tightly connect the AI model with feedback from geometric reasoning, simulations, and actual assembly experiments.”
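The kinds of geometric checks Kyaw describes can be sketched in a few lines. The example below is an assumption-laden illustration, not the team's code: it treats a design as a set of unit-cube coordinates and tests two of the constraints mentioned above, connectivity (the cubes form one face-connected piece) and support under gravity (no floating overhangs).

```python
# Illustrative constraint checks on a voxelized design.
from collections import deque

def is_connected(voxels):
    """True if every cube can be reached from any other via
    face-to-face adjacency, i.e. the design is one rigid piece."""
    voxels = set(voxels)
    start = next(iter(voxels))
    seen, queue = {start}, deque([start])
    while queue:
        x, y, z = queue.popleft()
        neighbors = ((x + 1, y, z), (x - 1, y, z), (x, y + 1, z),
                     (x, y - 1, z), (x, y, z + 1), (x, y, z - 1))
        for n in neighbors:
            if n in voxels and n not in seen:
                seen.add(n)
                queue.append(n)
    return seen == voxels

def has_support(voxels):
    """True if every cube above the floor rests on another cube,
    so the structure stands up under gravity."""
    voxels = set(voxels)
    return all(z == 0 or (x, y, z - 1) in voxels for x, y, z in voxels)

tower = {(0, 0, 0), (0, 0, 1)}          # valid: connected and supported
floating = {(0, 0, 0), (2, 0, 2)}       # invalid: disconnected, unsupported
```

A real pipeline would run richer checks (collision-free arm paths, connector reachability), but rejecting designs that fail tests like these is the basic idea behind making generated meshes manufacturable.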
In tests, the system produced a variety of items in under five minutes each, far faster than conventional 3D printing, while avoiding the material waste of traditional construction.
Next, the researchers plan to move beyond magnetic connectors to improve weight-carrying capacity and expand the system to mobile robots for larger structures. Kyaw is also exploring gesture-based control in conjunction with speech and motion to further simplify human-robot interaction.
“I’m working toward a future where the essence of matter will actually be under your control,” Kyaw said, “a place where reality can be generated on demand.”
