Experiment with native image generation in Gemini 2.0 Flash

In December, we first introduced native image output in Gemini 2.0 Flash to trusted testers. Today, we're making it available for developer experimentation across all regions currently supported by Google AI Studio. You can test this new capability using the experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.

Gemini 2.0 Flash combines multimodal input, enhanced reasoning and natural language understanding to create images.

Here are some examples of where 2.0 Flash's multimodal outputs shine:


1. Text and images together

Use Gemini 2.0 Flash to tell a story and it will illustrate it with images, keeping the characters and settings consistent throughout. Give it feedback and the model will retell the story or change the style of its drawings.


Generate stories and illustrations in Google AI Studio
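As a rough sketch of the feedback loop described above, the snippet below uses the google-genai SDK's chat interface so the model keeps context between turns. The prompts are placeholders, and the model name and response modalities follow the getting-started example later in this post.

from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

# Start a chat so characters and settings stay consistent across turns.
chat = client.chats.create(
    model="gemini-2.0-flash-exp",
    config=types.GenerateContentConfig(response_modalities=["Text", "Image"]),
)

# First turn: ask for an illustrated story.
story = chat.send_message(
    "Tell a short story about a fox exploring a snowy forest. "
    "Generate an image for each scene."
)

# Second turn: give feedback and the model revises the story and its drawings.
revision = chat.send_message(
    "Make the fox friendlier and redraw the scenes in a watercolor style."
)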

2. Conversational image editing

Gemini 2.0 Flash lets you edit images through multi-turn natural language dialogue, which is great for iterating toward the perfect image or exploring different ideas together.


Multi-turn image editing while maintaining context throughout the conversation in Google AI Studio
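Here is a minimal sketch of what multi-turn editing can look like through the API: a chat is started, a local image is sent along with the first edit request, and follow-up turns refine the result. The file path and prompts are placeholders, not part of the original example.

from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

chat = client.chats.create(
    model="gemini-2.0-flash-exp",
    config=types.GenerateContentConfig(response_modalities=["Text", "Image"]),
)

# Send an existing image along with the first edit request (placeholder path).
with open("room.png", "rb") as f:
    image_part = types.Part.from_bytes(data=f.read(), mime_type="image/png")

first_edit = chat.send_message(
    [image_part, "Repaint the walls a warm yellow and add a plant in the corner."]
)

# Follow-up turns refine the same image; the chat keeps the earlier context.
second_edit = chat.send_message("Now make it look like evening, with soft lamp light.")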

3. Understanding the world

Unlike many other image generation models, Gemini 2.0 Flash draws on world knowledge and enhanced reasoning to create the right imagery. This makes it well suited to producing detailed, realistic images, such as illustrating a recipe. While it strives for accuracy, like all language models, its knowledge is broad and general, not absolute or complete.


Interleaved text and images for a recipe in Google AI Studio

4. Text rendering

Most image generation models struggle to accurately render long sequences of text, often producing poorly formatted or illegible characters or misspellings. Internal benchmarks show that 2.0 Flash renders text more reliably than leading competitive models, making it well suited for creating ads, social posts, and even invitations.


Output image with long text rendering in Google AI Studio

Start creating images with Gemini today

Get started with Gemini 2.0 Flash via the Gemini API. Read more about image generation in our documentation.

from google import genai
from google.genai import types

# Create a client with your Gemini API key.
client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3d digital art style. "
        "For each scene, generate an image."
    ),
    # Ask for both text and image parts in the response.
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)

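Because the response interleaves text and image parts, you will typically want to walk through them and save the images. Here is a minimal sketch, continuing from the response object above and assuming Pillow is installed; the output filenames are placeholders.

from io import BytesIO
from PIL import Image

# Each part is either narrative text or an inline image for a scene.
for index, part in enumerate(response.candidates[0].content.parts):
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save(f"scene_{index}.png")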

Whether you're creating AI agents, building apps with beautiful visuals like illustrated interactive stories, or brainstorming in conversation, Gemini 2.0 Flash lets you add text and image generation with just one model. We can't wait to see what developers create with native image output, and your feedback will help us finalize a production-ready version soon.
