Bringing speed and strong cost performance with Gemini Omni Flash and Nano Banana 2 Lite
In this edition, we’re bringing two new models to Gemini Enterprise Agent Platform. Nano Banana 2 Lite (Gemini 3.1 Flash-Lite Image) is available for everyone, and Gemini Omni Flash is available in public preview. Both models provide some of the best price-performance among market-leading frontier models for image and video generation and editing.
Great creative happens when your tools move at the speed of your ideas. To help you create rich, reliable experiences while reducing regeneration time and costs, we’re adding two new models to Gemini Enterprise Agent Platform.
First, we’re announcing the general availability of Nano Banana 2 Lite (Gemini 3.1 Flash-Lite Image). This model is the fastest and most cost-efficient image generation and editing model within the Nano Banana model family. Whether you're rapid-firing ideas, A/B testing ad variations, or powering social apps for millions of users, this model gives you the power to explore, iterate, and scale with speed.
We’re also releasing Gemini Omni Flash in public preview. Grounded in Gemini's real-world knowledge, it powers high-quality video generation and conversational editing. Whether you're executing character or product swaps, performing dynamic style transfers, or adding objects and relighting scenes, this model gives you precise control to edit and refine video assets.
Both models provide some of the best price-performance among market-leading frontier models for image and video generation and editing. To learn how leading enterprises are already using these models, read the full blog.
Gemini Omni Flash: High-quality video generation and editing
Gemini Omni Flash brings conversational video generation and editing directly into your applications. Users can easily embed powerful media models into their agentic workflows to create, remix, and refine video without ever switching platforms.
We built Gemini Omni Flash with a focus on four key areas:
- Conversational editing: Swap characters, relight scenes, or alter angles using natural language while natively maintaining original audio and video tracks.
- Multimodal input: Combine text, images, and video inputs to guide video generation. Gemini Omni Flash natively generates audio with every video output, while maintaining character, object, and style consistency.
- World knowledge and simulation: It combines an intuitive understanding of physics with Gemini's knowledge of history, science and cultural context, bridging the gap from photorealism to meaningful storytelling.
- Text and action synchronization: Render legible text and graphics directly into video, syncing kinetic typography and explainer text with on-screen movements.
Note: Support for audio references, video references, last frame, scene extension, and higher resolutions for Gemini Omni Flash via the Gemini Enterprise Agent Platform API will be available soon.
To see the full list of model capabilities and how to integrate the model, check out the documentation and pricing.
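As a rough illustration of the conversational editing flow described above, the sketch below keeps an ordered history of natural-language edit instructions for one clip, so that each new turn can be sent together with the edits that came before it. This is not the Gemini Omni Flash API: the `EditSession` class, the `build_request` method, and the payload keys (`video`, `instructions`, `preserve_audio`) are all hypothetical names chosen for this example. Check the documentation for the real request format.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical container for one conversational video-editing session."""

    source_video: str  # path or URI of the clip being edited (illustrative)
    turns: list[str] = field(default_factory=list)

    def add_turn(self, instruction: str) -> None:
        """Record one natural-language edit, rejecting empty instructions."""
        instruction = instruction.strip()
        if not instruction:
            raise ValueError("edit instruction must not be empty")
        self.turns.append(instruction)

    def build_request(self) -> dict:
        """Assemble a payload with the clip and every edit so far, oldest first.

        The keys below are assumptions for illustration, not a real schema.
        """
        return {
            "video": self.source_video,
            "instructions": list(self.turns),  # copy so callers cannot mutate history
            "preserve_audio": True,  # hypothetical flag mirroring "maintain original audio"
        }


session = EditSession("product_demo.mp4")
session.add_turn("Swap the red mug for a blue one")
session.add_turn("Relight the scene as if at sunset")
request = session.build_request()
```

Keeping the full instruction history in the request is one possible design. Whether the real service expects prior turns, a session identifier, or only the latest edit is something the sketch does not assume to know.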
Nano Banana 2 Lite: Built for cost and speed
Nano Banana 2 Lite can generate an image in as little as four seconds. That lets you generate and iterate on design concepts in seconds, taking you quickly from a blank page to a polished layout.
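To put the four-second figure in concrete terms, the small calculation below estimates how many images could be generated one after another within a fixed time budget. It treats four seconds as a best case ("as little as"), so the result is an upper bound rather than a guarantee. The function name and parameters are illustrative only.

```python
def max_sequential_generations(budget_seconds: float, seconds_per_image: float = 4.0) -> int:
    """Upper bound on back-to-back generations that fit in a time budget.

    Assumes every request takes exactly `seconds_per_image`, with no queuing
    or network overhead, which real workloads will not match.
    """
    if budget_seconds < 0:
        raise ValueError("budget_seconds must be non-negative")
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return int(budget_seconds // seconds_per_image)


# One minute of sequential requests at the best-case latency.
print(max_sequential_generations(60))  # prints 15
```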
Significant improvements over Nano Banana (Gemini 2.5 Flash Image)
Nano Banana 2 Lite blends fast image generation with a significant leap in visual quality and capability compared to our legacy model, Nano Banana. We enhanced core capabilities so you can execute complex tasks at high speeds:
- World knowledge: Quickly draft accurate contextual scenes, rough data visualizations, and location-specific mockups.
- Character consistency: Maintain character identities and object fidelity across multiple swift generations to easily build out storyboarding tools or embed virtual try-ons for ecommerce.
- Quick text and localization: Draft copy on the fly by rendering legible text directly into rapid generations to see how typography works across localized ad variations.
To see the full list of model capabilities and how to integrate the model, check out the documentation and pricing.
Note: Image generation offers the lowest latency. Image editing may have slightly higher response times.
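One way to check the latency difference between generation and editing in your own environment is to time both paths and compare medians. The helper below is a minimal sketch using only the Python standard library. It accepts any zero-argument callable, so `generate_image` and `edit_image` here are placeholder stubs standing in for your own client calls, not real SDK functions.

```python
import statistics
import time
from typing import Callable


def median_latency(call: Callable[[], object], runs: int = 5) -> float:
    """Return the median wall-clock duration, in seconds, of `call` over `runs` runs."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Placeholder stubs: replace the bodies with real requests to your client.
def generate_image() -> None:
    time.sleep(0.01)  # stands in for a generation request


def edit_image() -> None:
    time.sleep(0.02)  # stands in for an editing request


generation_s = median_latency(generate_image, runs=3)
editing_s = median_latency(edit_image, runs=3)
print(f"generation: {generation_s:.3f}s, editing: {editing_s:.3f}s")
```

The median is used rather than the mean so that a single slow request does not dominate the comparison.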
Start building today
Embed these image and video generation and editing capabilities into your applications and creative workflows today. Explore these resources to start building:
- Try the models: Agent Studio within Gemini Enterprise Agent Platform
- API documentation: Nano Banana 2 Lite, Gemini Omni Flash
- Access Colab notebooks: Nano Banana 2 Lite, Gemini Omni Flash
- Pricing: Agent Platform Pricing for both models
- Prompting guides: Nano Banana, Gemini Omni Flash
- Gemini Omni Flash Prompting Agent Skills
The pace of capability improvement is remarkable, and announcements like this show how quickly Enterprise AI is moving from experimentation into everyday operational workflows. What interests me just as much is the organisational challenge that follows. As increasingly capable AI agents become embedded within business services, technical performance alone won't demonstrate that an organisation is ready to operate them safely. That's where Operational Readiness becomes critical. The capability to deploy AI and the capability to own, govern and support it as a live operational service are different questions. Increasingly, AI Governance (Infrastructure Governance) is about defining the decision authority and governance needed before those capabilities become part of day-to-day operations.
Conversational editing on video generation is the harder problem than initial generation quality, since maintaining consistency across edit turns requires tracking scene state, not just producing a good single frame. Worth checking how well Omni Flash preserves continuity across multiple edit passes versus regenerating from scratch each time.