Our research direction: designing for accessibility
In our early research, we found that a significant barrier to digital equity is the “accessibility gap”: the delay between a new feature shipping and a supporting accessibility layer being built for it. To bridge this gap, we are shifting from reactive tools to agentic systems that form the core of the interface.
Research Pillar: Using Multi-Agent Systems to Improve Access
Multimodal AI tools offer one of the most promising routes to accessible interfaces. In specific prototypes, such as our work on web readability, we have tested a model in which a central orchestrator acts as a strategic reading manager.
Instead of asking the user to navigate a complex maze of menus, the orchestrator maintains the shared context: it understands the document and makes it more accessible by delegating tasks to expert sub-agents.
- Summary Agent: Distills complex documents by breaking information into manageable pieces, making even dense material clear and accessible.
- Settings Agent: Handles UI adjustments dynamically, such as scaling text.
In testing this modular approach, we found that users interact with the system more intuitively: each task is always routed to the right expert, without the user having to hunt for the “right” button.
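To make the delegation pattern concrete, here is a minimal sketch of one way such an orchestrator could route requests to sub-agents. The shared-context fields, agent implementations, and keyword routing are illustrative assumptions, not the research prototype itself; in practice the routing and the agents would be model-driven.

```python
# Illustrative sketch of the orchestrator pattern: a shared context plus
# expert sub-agents. All names and logic here are assumptions.
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SharedContext:
    """State the orchestrator maintains across sub-agent calls."""
    document: str
    text_scale: float = 1.0


def summary_agent(ctx: SharedContext, request: str) -> str:
    # Placeholder: a real Summary Agent would call a language model here.
    return f"Summary of a {len(ctx.document)}-char document, focused on: {request}"


def settings_agent(ctx: SharedContext, request: str) -> str:
    # Placeholder: a real Settings Agent would adjust the live UI.
    if "larger" in request:
        ctx.text_scale *= 1.25
    return f"Text scale is now {ctx.text_scale:.2f}x"


class Orchestrator:
    """Routes each user request to the sub-agent best suited to handle it."""

    def __init__(self, ctx: SharedContext):
        self.ctx = ctx
        self.agents: Dict[str, Callable[[SharedContext, str], str]] = {
            "summarize": summary_agent,
            "settings": settings_agent,
        }

    def handle(self, request: str) -> str:
        # Naive keyword routing stands in for model-based intent detection.
        name = "settings" if "text" in request else "summarize"
        return self.agents[name](self.ctx, request)


orchestrator = Orchestrator(SharedContext(document="...long article text..."))
print(orchestrator.handle("summarize the key findings"))
print(orchestrator.handle("make the text larger"))
```

The design point is that the user speaks one request to one interface; the orchestrator, not the user, decides which expert handles it.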
Towards Multimodal Fluency
Our research also focuses on moving beyond basic text-to-speech to multimodal fluency. By leveraging Gemini’s ability to process voice, vision, and text simultaneously, we have created prototypes that can transform live video into instant, interactive audio description.
It’s not just about describing a scene; it’s about situational awareness. In our co-design sessions, we have seen how letting users interrogate their surroundings interactively, asking for specific visual details as they happen, can reduce cognitive load and transform a passive experience into an active, conversational exploration.
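As a rough illustration of that interaction loop, the sketch below sends a single video frame plus a user’s question to a multimodal model. It assumes the google-generativeai Python SDK; the model name, prompt wording, and frame source are illustrative stand-ins, and a live prototype would stream frames and speech rather than load a file.

```python
# Hedged sketch: answering a user's question about the current video frame.
# Assumes the google-generativeai SDK; model choice and prompt are illustrative.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # assumption: key supplied by the caller
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model choice


def describe_frame(frame: Image.Image, question: str) -> str:
    """Answer a spoken question about one frame of the live video feed."""
    prompt = (
        "You are an audio-description assistant for a blind user. "
        "Answer the question about this frame concisely.\n"
        f"Question: {question}"
    )
    response = model.generate_content([frame, prompt])
    return response.text


# Stand-in for a captured live frame; a real system would pull from the camera.
frame = Image.open("frame.jpg")
print(describe_frame(frame, "Is the crosswalk signal on?"))
```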
