Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR

by January 14, 2026

by January 14, 2026 0 comments

Improved performance for medical imaging use cases

MedGemma was designed from the beginning as a multimodal model, reflecting the multimodal nature of therapy. MedGemma 1 included support for the interpretation of two-dimensional medical images, including chest X-rays, dermatology images, fundus images, and histopathology patches.

With MedGemma 1.5, we are expanding support for high-dimensional medical imaging, starting with three-dimensional volume representation ct imaging And MRIAlso full-slide Histopathology Imaging. Developers can create applications in which multiple slices (for CT or MRI) or multiple patches (for histopathology) are provided as input along with a signal describing the task.

On internal benchmarks, the baseline absolute accuracy of MedGemma 1.5 improved by 3% compared to MedGemma 1 (61% vs 58%) on the classification of pathological CT findings and by 14% (65% vs 51%) on the classification of pathological MRI findings, an average improvement over the findings. Additionally, the fidelity of MedGemma 1.5’s predictions, based on internal diverse benchmarks of histopathology slides and related findings, rouge-l The score on cases with exactly one histopathology slide improved by 0.47 compared to MedGemma 1 (0.49 vs. 0.02), which matches the 0.498 score obtained by task-specific polypath model.

This new high-dimensional support is a natural evolution of CT Foundation, our previous API-based tool for the generation of CT embeddings. To our knowledge, MedGemma 1.5 is the first public release of an open multimodal large language model that can interpret high-dimensional medical data while retaining the ability to interpret normal 2D data and text. Although these capabilities are in their early stages and imperfect, developers will achieve better results by fine-tuning the MedGemma model on their own data, and we expect the MedGemma model to continually improve over time. We have released tutorial notebooks that explain how to use this high dimensional imaging capability for CT (hugging face, Model Garden) and histopathology (hugging face, Model Garden).

Next generation medical image interpretation with MedGemma 1.5 and medical speech to text with MedASR

Improved performance for medical imaging use cases

Anthropic shakes up C-suite to expand its internal incubator

Royal Society president reignites Elon Musk controversy by defending lack of action Royal Society

Related Articles

Leave a Comment Cancel Reply