The last few months were sometimes exciting for Gemma family from open models. We presented Gemma 3 AND Gemma 3 KatProviding the latest performance for single and computer accelerators. Then we announced the full edition Gemma 3NMobile architecture, which provides powerful, multimodal artificial intelligence in real time directly to Edge devices. Our goal was to provide programmers with useful tools for building AI, and we are still surprised By vibrant Gemmaverse, you help to create by celebrating together when downloading exceeded 200 million last week.
Today we add a new, highly specialized tool to the set of Gemma 3 tools: Gemma 3 270mA compact, 270 million parameter model designed from scratch for specific to tuning tasks with strong possibilities related to the instructions and structure of the text already trained.
Gemma 3 270 m provides strong options for telling about the instructions for the small foot model. As shown in Ifeval Benchmark (which tests the model's ability to follow verifiable instructions), sets a new level of performance for its size, thanks to which the sophisticated possibilities of artificial intelligence are more accessible to the application for device and research.
The basic capabilities of Gemma 3 270m
- Compact and talented architecture: Our new model has a total of 270 million parameters: 170 million deposition parameters due to the large vocabulary and 100 million for our transformer blocks. Thanks to the large vocabulary of 256 tokens, the model can support specific and rare tokens, thanks to which it is a strong basic model, which is additionally adapted in specific domains and languages.
- Extreme energy efficiency: The key advantage of Gemma 3 270 m is its low energy consumption. Internal tests for Pixel 9 Pro SOC show that the int4 model used only 0.75% of the batteries up to 25 conversations, which makes it our most efficient Gemma model.
- The following instructions: The instructions for adjusting the instructions are issued together with a previously trained control point. Although this model is not designed for complex cases of conversational use, it is a strong model that occurs after the instructions immediately after removing from the box.
In engineering, success is defined by performance, not just raw power. You wouldn't use a hammer to hang the frame. The same rule applies to construction from AI.
Gemma 3 270m embodies this philosophy of “the right tool for work”. This is a high -quality foundation model that warns the instructions, and its true power is unlocked by tuning. After specializing, it can perform tasks such as text classification and data extraction with extraordinary accuracy, speed and profitability. Starting from a compact, talented model, you can build production systems that are slim, fast and dramatically cheaper to use.
A real success plan
The power of this approach has already brought amazing results in the real world. A great example is the work done by the adaptive ML with SK Telecom. In the face of the challenge of refined, multilingual moderation of content, they decided to specialize. Instead of using a massive general purpose model, the adaptive ML has refined the Gemma 3 4B model. The results were stunning: the specialized Gemma model not only fulfilled, but exceeded the performance of much larger reserved models in its specific task.
Gemma 3 270m aims to enable programmers to take this approach even more, unlocking even greater performance for well -defined tasks. It is an ideal starting point for creating a fleet of small, specialized models, each of which is an expert in their own task.
But this power of specialization does not only apply to corporate tasks; It also enables powerful creative applications. For example, check This internet application on the stories:
GEMMA 3 270M is used to supply the online application of a bedtime story with transformers.js. The size and performance of the model make it suitable for offline, online, creative tasks. (Credit: joshua (@xenovacom na x) from the Hulging Face team)
When to choose Gemma 3 270m
Gemma 3 270m inherits advanced architecture and solid preliminary training of the Gemma 3 collection, providing solid foundations for custom applications.
Here's when it is the perfect choice:
- You have a high, well -defined task. Ideal for functions such as sentimental analysis, entity extraction, query routing, unstructured for structured text processing, creative writing and compliance control.
- You must make every millisecond and micro-centrs. Drastic reduce or eliminate the costs of applying in production and provide response to users faster. The adapted 270 m model can work on a light, inexpensive infrastructure or directly on the device.
- You need to harden and implement quickly. The small Gemma 3 270m size allows for quick experiments with tuning, helping to find the perfect configuration when using in hours, not days.
- You must ensure user privacy. Since the model can work completely on the device, you can build applications that support confidential information without sending data to the cloud.
- You want fleet of specialized task models. Build and implement many custom models, each professionally trained to a different task, without breaking the budget.
Start by tuning
We want it to be as easy as possible to transform Gemma 3 270 m into your own non -standard solution. It is built on the same architecture as the rest of the Gemma 3 models, with recipes and tools to start quickly. You can find our guide Full tuning Using Gemma 3 270 m as part of the DOC Gemma.
Gemmaverse is based on the idea that innovations appear in all sizes. Thanks to Gemma 3 270 m, we authorize programmers to build smarter, faster and more efficient AI solutions. We can't wait to see the specialized models you create.