Gemini 3.1 Flash-Lite offers options for how it processes input

Enterprise developers can now choose the level of thinking needed for a specific task with Google’s recently released Gemini 3.1 Flash-Lite, its latest reasoning model.

On Tuesday, the cloud provider launched Gemini 3.1 Flash-Lite in preview, calling it the fastest and most cost-effective model in the Gemini 3 series, designed for high-volume developer workloads.

With the model, enterprises can select different depths of thinking – minimal, low, medium or high – depending on the work being done. The model runs in AI Studio, a platform developers can use to build, test, and deploy applications using Gemini models, as well as Vertex AI, a platform for building and scaling machine learning models. According to Google, this model is suitable for high-volume translation, content moderation, and complex workloads, such as building user interfaces and dashboards, following instructions, and creating simulations.
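Matching thinking depth to task type can be as simple as a lookup before the model is called. The sketch below is purely illustrative: the task categories, the mapping, and the `select_thinking_level` helper are assumptions, not part of Google's SDK; only the four level names come from the announcement.

```python
# Illustrative sketch: pick a thinking depth for a task before calling the model.
# The task categories and this helper are assumptions, not part of Google's API.
# The four documented levels for Gemini 3.1 Flash-Lite are minimal, low,
# medium, and high.

LEVELS = ("minimal", "low", "medium", "high")

# Hypothetical mapping, following the article's examples: high-volume, routine
# tasks get shallow thinking, while complex work gets deeper analysis.
TASK_DEPTH = {
    "translation": "minimal",
    "content_moderation": "low",
    "instruction_following": "medium",
    "ui_generation": "high",
    "simulation": "high",
}

def select_thinking_level(task_type: str) -> str:
    """Return a thinking level for a task, defaulting to 'low' for unknown tasks."""
    level = TASK_DEPTH.get(task_type, "low")
    assert level in LEVELS
    return level

print(select_thinking_level("translation"))    # minimal
print(select_thinking_level("ui_generation"))  # high
```

The point of a default in the helper is the cost argument made below: when in doubt, a shallow setting keeps token spend down, and only tasks known to need deep analysis pay for it.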

With the new model, Google aims to address a challenge many enterprise developers face when working with reasoning models: thinking models often take extra time to perform a task, which can be costly and wasteful when the developer does not need deep analysis for a particular job. By letting enterprise developers choose the level of thinking, Google is also helping enterprises build multipurpose agents.

“It’s a perfect model for agents,” said Mark Beccue, an analyst at Omdia, a division of Informa TechTarget.

He said that while other AI model providers focus on reasoning models and agents, Google’s strategy is enterprise-driven. Here, Google’s approach centers on reducing the number of tokens a model uses, giving businesses a capable but cheaper option.

“If you can make something two and a half times faster and almost halve the price, that’s a huge play,” said Futurum Group analyst Bradley Shimmin.

Enterprise developers are also starting to distribute tasks across multiple models instead of relying on one, he said. For example, a developer building AI agents may need Gemini 3.1 Pro for planning and building, while Gemini 3.1 Flash-Lite handles basic documentation or code generation.

“This is not a game of overwhelming capability,” Shimmin said. “It’s a game of matching capability to the task.”

Developers have begun to realize, with other model releases such as Qwen 3.5-9B earlier this week, that it can be better to turn a model’s thinking off for some tasks, since leaving it on slows the model down and limits optimization.

“As you get into more complex interactive sessions or longer context windows, you’re sometimes no better off with the extra thinking time,” Shimmin said.

Gemini 3.1 Flash-Lite is an example of how models continue to grow and evolve in the AI market, Beccue said.

“We are moving at a rapid pace toward models that are more efficient and better,” he said. “They’re getting better and more efficient.”

Google said its new model is priced at $0.25 per million input tokens and $1.50 per million output tokens.
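At those rates, a rough per-request cost estimate is simple arithmetic. The sketch below is illustrative only: the token counts are made-up example values, and only the per-million-token prices come from the article.

```python
# Estimate per-request cost from the published Gemini 3.1 Flash-Lite preview
# pricing: $0.25 per million input tokens, $1.50 per million output tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
cost = request_cost(2_000, 500)
print(f"${cost:.6f} per request")  # $0.001250 per request
```

At that hypothetical request size, a million requests would cost about $1,250, which is the kind of math driving the multi-model task-splitting described above.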
