Apple’s biggest acquisition to date is still its $3 billion purchase of Beats in 2014, but now the second-largest deal is bringing P.A.IFour year old AI audio startup. However, Apple …
speech
-
-
Leaders don’t need to write every word themselves, but they do need to make sure it clearly represents their vision.
-
Future Tech
NASA gears up for Artemis II mission, AI-powered speech gives hope to stroke patients, and researchers discover oldest cave art ever discovered
Center Pierre-Louis: For scientific American‘S science quicklyI’m Kendra Pierre-Louis on behalf of Rachel Feltman. You’re listening to our weekly science news roundup. First up, we have an update on humans …
-
AI Tools
FlashLabs Researchers Release Chroma 1.0: A 4B Real Time Speech Dialogue Model with Personalized Voice Cloning
Chroma 1.0 is a real-time speech-to-speech dialogue model that takes audio as input and returns audio as output while preserving speaker identity in multi-turn conversations. It is presented as the …
-
AI Basics
Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR
Improved performance for medical imaging use cases MedGemma was designed from the beginning as a multimodal model, reflecting the multimodal nature of therapy. MedGemma 1 included support for the interpretation …
-
Elon Musk has accused the UK government of suppressing free speech after ministers threatened to fine and potentially ban his social media site X after its AI tool Grok was …
-
AI News
NVIDIA AI releases Nemotron Speech ASR: a new open source transcription model designed from the ground up for low-latency use cases like voice agents
NVIDIA recently released its new streaming English transcription model (Nemotron Speech ASR) built specifically for low-latency voice agents and live captioning. outpost nvidia/nemotron-speech-streaming-en-0.6b On Hugging Face combines a cache-aware FastConformer …
-
Generative AI
Google Health AI releases MedASR: a conformer-based medical speech to text model for clinical dictation
The Google Health AI team has released MedASR, an open source medical speech to text model that targets clinical dictation and doctor patient conversations and is designed to plug directly …
-
The latest update to Google Translate brings live speech translation with support for over 70 languages, which is basically only available on the Pixel Buds, any headphones you want. Its …
-
AI News
Microsoft AI releases VibeVoice-RealTime: a lightweight real-time text-to-speech model that supports streaming text input and robust long-form speech generation.
Microsoft has released vibevoice-realtime-0.5bA real-time text to speech model that works with streaming text input and long form speech output, aimed at agent style applications and live data narration. The …
