Gemini 2.5: Updates to our thinking model family

Today we're excited to share updates to the entire Gemini 2.5 family:

  • Gemini 2.5 Pro is generally available and stable (no changes from the 06-05 preview)
  • Gemini 2.5 Flash is generally available and stable (no changes from the 05-20 preview; see pricing updates below)
  • Gemini 2.5 Flash-Lite is now available in preview

Gemini 2.5 models are thinking models, able to reason through a problem before responding, which results in better performance and greater accuracy. Each model exposes control over its thinking budget, giving developers the ability to choose when and how much the model “thinks” before generating a response.

An overview of our Gemini 2.5 family of thinking models

Introducing Gemini 2.5 Flash-Lite

Today we're introducing the 2.5 Flash-Lite preview, the lowest-latency and lowest-cost model in the 2.5 family. It is designed as a cost-effective upgrade from our previous 1.5 and 2.0 Flash models. It also offers better performance on most evals, a faster time to first token, and higher decoding throughput in tokens per second. This model is ideal for high-throughput tasks such as classification or large-scale summarization.

Gemini 2.5 Flash-Lite is a reasoning model that allows dynamic control of the thinking budget via an API parameter. Because Flash-Lite is optimized for cost and speed, “thinking” is disabled by default, unlike our other models. 2.5 Flash-Lite also supports all our native tools such as grounding with Google Search, code execution, and URL context, in addition to function calling.
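As a rough illustration of setting the thinking budget per request, here is a minimal sketch that only builds a JSON request body and sends nothing over the network. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are an assumption based on the public Gemini REST API and should be checked against the current API reference; `build_request` is a hypothetical helper, not part of any SDK.

```python
import json


def build_request(prompt: str, thinking_budget: int = 0) -> dict:
    """Build a generateContent-style request body (hypothetical helper).

    thinking_budget=0 mirrors the Flash-Lite default described above
    (thinking off); a positive value requests up to that many thinking
    tokens. Field names are assumed from the public REST API docs.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Thinking disabled: suited to cheap, high-volume classification.
    fast = build_request("Classify this review as positive or negative: ...")
    # Thinking enabled with a small budget for a harder prompt.
    careful = build_request("Summarize the key risks in this contract: ...", 1024)
    print(json.dumps(fast, indent=2))
    print(json.dumps(careful, indent=2))
```

Keeping the budget as an explicit argument makes it easy to leave thinking off for bulk traffic and raise it only for the requests that need it.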

Benchmarks for Gemini 2.5 Flash-Lite


Gemini 2.5 Flash Updates and Pricing

Over the past year, our research teams have continued to push the Pareto frontier with our series of Flash models. When 2.5 Flash was first announced, we had not yet finalized the capabilities of 2.5 Flash-Lite. We also introduced separate “thinking” and “non-thinking” prices, which led to confusion among developers.

With the release of stable Gemini 2.5 Flash (the same 05-20 preview we released at Google I/O), and given the strong performance of 2.5 Flash, we are updating its pricing:

  • $0.30 / 1 million input tokens (*up from $0.15 previously)
  • $2.50 / 1 million output tokens (*down from $3.50 previously)
  • We have removed the price difference between thinking and non-thinking
  • We have kept a single price tier regardless of input token size
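To see what the change means for a given workload, the per-million-token prices above can be turned into a small cost estimate. This is a sketch under stated assumptions: `cost_usd` and the tier names are hypothetical, and the "previous" tier uses only the two earlier figures quoted in the list ($0.15 input, $3.50 output), not the full earlier price sheet.

```python
from decimal import Decimal

# USD per 1 million tokens, taken from the list above.
PRICES_PER_MILLION = {
    "new": {"input": Decimal("0.30"), "output": Decimal("2.50")},
    # Only the earlier figures quoted in the post are modeled here.
    "previous": {"input": Decimal("0.15"), "output": Decimal("3.50")},
}

ONE_MILLION = Decimal(1_000_000)


def cost_usd(input_tokens: int, output_tokens: int, tier: str = "new") -> Decimal:
    """Estimate the cost in USD for a token count under a given price tier."""
    prices = PRICES_PER_MILLION[tier]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / ONE_MILLION


if __name__ == "__main__":
    # Example workload: 2M input tokens, 500k output tokens.
    print("new:     ", cost_usd(2_000_000, 500_000, "new"))
    print("previous:", cost_usd(2_000_000, 500_000, "previous"))
```

With these two figures, the new price is lower whenever output tokens exceed 15% of input tokens, since the extra $0.15 per million input tokens is offset by the $1.00 saved per million output tokens.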

While we strive to maintain consistent pricing between preview and stable releases to minimize disruption, this is a specific adjustment to reflect the unique value of Flash while still offering the best cost per intelligence available.

With Gemini 2.5 Flash-Lite, we now have an even lower-cost option (with or without thinking) for cost- and latency-sensitive applications that require less model intelligence.

Pricing updates for our Gemini Flash family


If you are using Gemini 2.5 Flash Preview 04-17, the existing preview pricing will remain in effect until the release's scheduled retirement on July 15, 2025, at which time the model endpoint will be disabled. You can upgrade to the generally available “gemini-2.5-flash” model or switch to 2.5 Flash-Lite Preview as a lower-cost option.


Further development of Gemini 2.5 Pro

Growth in demand for Gemini 2.5 Pro continues to be the highest we have ever seen for any of our models. To enable more customers to use this model in production, we are making the 06-05 version of this model stable, at the same Pareto frontier price point as before.

We expect Pro to shine in cases where you need the highest intelligence and the most capabilities, such as coding and agentic tasks. Gemini 2.5 Pro is at the heart of many of the most loved developer tools.

Top developer tools using Gemini 2.5 Pro, including Cursor, Bolt, Cline, Cognition, Windsurf, GitHub, Lovable, Replit and Zed Industries


If you are using 2.5 Pro Preview 05-06, the model will remain available until June 19, 2025, after which it will be disabled. If you are using 2.5 Pro Preview 06-05, you can simply update the model string to “gemini-2.5-pro”.
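The migration paths described in this post can be captured in a small lookup table. This is a minimal sketch, not an official tool: only the stable IDs “gemini-2.5-pro” and “gemini-2.5-flash” and the two dates come from this post; the preview ID spellings are assumed from the usual naming pattern, and `resolve_model` is a hypothetical helper. Whether a preview is still reachable on the deadline day itself is also an assumption.

```python
from datetime import date
from typing import Optional

# Preview ID -> (stable replacement, last date the post says the preview
# remains usable, or None when no retirement date is given in the post).
# Preview ID spellings are assumed; verify them against your own config.
MIGRATIONS: dict[str, tuple[str, Optional[date]]] = {
    "gemini-2.5-pro-preview-05-06": ("gemini-2.5-pro", date(2025, 6, 19)),
    "gemini-2.5-pro-preview-06-05": ("gemini-2.5-pro", None),
    "gemini-2.5-flash-preview-04-17": ("gemini-2.5-flash", date(2025, 7, 15)),
    "gemini-2.5-flash-preview-05-20": ("gemini-2.5-flash", None),
}


def resolve_model(model_id: str, today: date) -> tuple[str, bool]:
    """Return (model ID to use, whether the migration deadline has passed).

    Unknown IDs are passed through unchanged and never flagged as overdue.
    """
    if model_id not in MIGRATIONS:
        return model_id, False
    stable_id, deadline = MIGRATIONS[model_id]
    overdue = deadline is not None and today > deadline
    return stable_id, overdue


if __name__ == "__main__":
    for configured in MIGRATIONS:
        target, overdue = resolve_model(configured, date.today())
        note = " (deadline passed)" if overdue else ""
        print(f"{configured} -> {target}{note}")
```

Routing every configured model string through one function like this keeps the change to a single place when a preview is retired.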

We're excited to see even more domains take advantage of the intelligence of 2.5 Pro, and we look forward to sharing more about scaling beyond Pro in the near future.
