Cohere unveils open source speech model for edge devices

by ai-intensify

Cohere is looking to capitalize on the enterprise trend of embedding automatic speech recognition (ASR) into applications with an open source speech model containing 2 billion parameters.

Cohere Transcribe, introduced Thursday, is trained on 14 languages, including Chinese, Japanese, Polish, French and Greek. Cohere released the model under the Apache 2.0 license and said it outperformed alternatives on the Hugging Face Open ASR leaderboard, including ElevenLabs Scribe and Qwen3. According to the company, the model will soon be integrated into Cohere’s AI agent orchestration platform, North.

Cohere Transcribe is an example of how speech recognition models have evolved. Earlier speech models were built with deep learning techniques such as long short-term memory recurrent neural networks and, later, transformer-based architectures, which struggled to achieve low latency because of their size.

However, newer models like Transcribe are small enough to be deployed on edge devices. As technology, infrastructure, and capabilities have matured, ASR use cases have expanded, especially in customer service, banking, sales, and marketing, leading to a rise in ASR models from vendors such as IBM and Alibaba.

Even video conferencing company Zoom has joined the race. In 2025, the company introduced AI Companion 3.0, which included a real-time voice translation capability. It later introduced a separate feature that lets participants hear the exchanges in their own language.

“Speech will always be fundamental to AI,” said Lian Jye Su, analyst at Omdia, a division of Informa TechTarget. “That’s how the whole AI movement started – because humans started being able to interact with Siri.”

He cited some features of Cohere Transcribe as notable, including its small size and the company’s decision to make the model open source.

“When it’s open source, you get developers to test it and if they like the results good enough they’ll come back to you,” Su said. “Then you can obviously commercialize a better model.” Meta has found success with this business model, influencing others like Alibaba and Nvidia to follow suit.

“Cohere is trying to copy that,” Su said. But the company is focusing on the area where it excels: speech recognition and speech-to-text models, he said.

While Cohere has traditionally focused on text generation, it may find an opportunity in speech recognition, especially as some enterprises look to move from traditional transformer-based speech models to a growing line of smaller ASR models that can run on edge devices, Su continued.
