NVIDIA recently released its new streaming English transcription model (Nemotron Speech ASR) built specifically for low-latency voice agents and live captioning. outpost nvidia/nemotron-speech-streaming-en-0.6b On Hugging Face combines a cache-aware FastConformer …
Tag:
speech
-
-
Generative AI
Google Health AI releases MedASR: a conformer-based medical speech to text model for clinical dictation
The Google Health AI team has released MedASR, an open source medical speech to text model that targets clinical dictation and doctor patient conversations and is designed to plug directly …
-
The latest update to Google Translate brings live speech translation with support for over 70 languages, which is basically only available on the Pixel Buds, any headphones you want. Its …
-
AI News
Microsoft AI releases VibeVoice-RealTime: a lightweight real-time text-to-speech model that supports streaming text input and robust long-form speech generation.
Microsoft has released vibevoice-realtime-0.5bA real-time text to speech model that works with streaming text input and long form speech output, aimed at agent style applications and live data narration. The …
Older Posts