Gemini 2.5 Flash-Lite is now stable and generally available

Today we are releasing the stable version of Gemini 2.5 Flash-Lite, our fastest and lowest-cost model in the Gemini 2.5 family ($0.10 per 1M input tokens, $0.40 per 1M output tokens). We built 2.5 Flash-Lite to push the frontier of intelligence per dollar, with native reasoning capabilities that can optionally be switched on for more demanding use cases. Building on the momentum of 2.5 Pro and 2.5 Flash, this model rounds out our set of 2.5 models that are ready for scaled production use.


Our most cost-efficient and fastest 2.5 model yet

Gemini 2.5 Flash-Lite strikes a balance between performance and cost without compromising on quality, particularly for latency-sensitive tasks such as translation and classification.

Here's what stands out:

  • Best-in-class speed: Gemini 2.5 Flash-Lite has lower latency than both 2.0 Flash-Lite and 2.0 Flash on a broad sample of prompts.
  • Cost efficiency: This is our lowest-cost 2.5 model so far, priced at $0.10 per 1 million input tokens and $0.40 per 1 million output tokens, which lets you handle large volumes of requests affordably. We have also reduced audio input pricing by 40% compared to the preview launch.
  • Smart and small: It shows all-around higher quality than 2.0 Flash-Lite across a wide range of benchmarks, including coding, math, science, reasoning and multimodal understanding.
  • Fully featured: When you build with 2.5 Flash-Lite, you get access to a 1 million-token context window, controllable thinking budgets and support for native tools such as Grounding with Google Search, Code Execution and URL Context.
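
To make the pricing above concrete, here is a minimal sketch of a cost estimate in Python. It uses only the two rates quoted in this post ($0.10 per 1 million input tokens, $0.40 per 1 million output tokens) and assumes every token is billed at those rates; the function name `estimate_cost_usd` and the workload numbers are hypothetical, and other pricing (for example audio input) is not modeled.

```python
# Rates quoted in this post, in USD per 1 million tokens.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Rough cost estimate, assuming all tokens are billed at the rates above."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload: 10,000 classification requests,
# each with about 2,000 input tokens and 500 output tokens.
requests = 10_000
cost = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"Estimated cost: ${cost:.2f}")
```

With these illustrative numbers, 20 million input tokens and 5 million output tokens each contribute about $2, so the whole batch comes to roughly $4.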


Gemini 2.5 Flash-Lite in action

Since the launch of 2.5 Flash-Lite, we have already seen some extremely successful deployments. Here are some of our favorites:

  • Satlyt is building a decentralized space computing platform that will transform how satellite data is processed and used, covering summarization of in-orbit telemetry, autonomous task management and parsing of satellite-to-satellite communication. The speed of 2.5 Flash-Lite enabled a 45% reduction in latency for critical onboard diagnostics and a 30% decrease in power consumption compared to their baseline models.
  • HeyGen uses AI to create avatars for video content and uses Gemini 2.5 Flash-Lite to automate video planning, analyze and optimize content, and translate videos into over 180 languages. This allows them to provide users with global, personalized experiences.
  • DocsHound turns product demos into documentation, using Gemini 2.5 Flash-Lite to process long videos and extract thousands of screenshots with low latency. This transforms the footage into comprehensive documentation and training data for AI agents much faster than traditional methods.
  • Evertune helps brands understand how they are represented in AI models. Gemini 2.5 Flash-Lite is a game changer for them, dramatically accelerating analysis and report generation. Its fast performance allows them to quickly scan and synthesize large volumes of model output to provide customers with dynamic, timely insights.

You can start using 2.5 Flash-Lite by specifying “gemini-2.5-flash-lite” in your code. If you are using the preview version, you can switch to “gemini-2.5-flash-lite”, which is the same underlying model. We plan to remove the Flash-Lite preview alias on August 25.
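
As a rough illustration of the model name in use, the sketch below builds (but does not send) a request for the Gemini API's REST `generateContent` endpoint using only the Python standard library. The endpoint URL, the `x-goog-api-key` header and the `thinkingConfig.thinkingBudget` field are assumptions drawn from the public Gemini API rather than from this post; `build_request` and the budget value of 1024 are hypothetical choices for the example.

```python
import json
import urllib.request

MODEL = "gemini-2.5-flash-lite"  # stable model name given in this post
# Assumed endpoint layout of the public Gemini REST API.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt, api_key, thinking_budget=None):
    """Build a generateContent request without sending it.

    thinking_budget is optional: when omitted, no thinking configuration
    is included and the API's default behavior applies.
    """
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_budget is not None:
        # Assumed field names for the controllable thinking budget.
        body["generationConfig"] = {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        }
    return urllib.request.Request(
        url=f"{BASE_URL}/{MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
```

To actually call the model, pass the returned object to `urllib.request.urlopen` with a real API key and parse the JSON response. Leaving `thinking_budget` unset keeps the request minimal for latency-sensitive tasks such as classification.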

Ready to start building? Try the stable version of Gemini 2.5 Flash-Lite now in Google AI Studio and Vertex AI.
