We’re shipping two major updates to streamline your creative workflow, allowing you to generate images at high speed with one model and then instantly animate them with the other, all at a fraction of the cost 🍌⚡️ goo.gle/4bcThNt 1️⃣ Introducing Nano Banana 2 Lite: Our fastest and most cost-efficient Gemini Image model yet delivers text-to-image outputs in under 4 seconds. Now available via the Gemini API and Google AI Studio, and rolling out soon across NotebookLM, Google Flow, Gemini App, Stitch by Google, Google Search and Google Photos. 2️⃣ Gemini Omni Flash in Public Preview: Our natively multimodal model for cost-efficient video generation and conversational editing. Now available via the Gemini API, Google AI Studio, and Gemini Enterprise Agent Platform so you can integrate the model into your workflow. While exciting on their own, the real magic happens when you build using these models together. Watch how our interior design demo integrates Nano Banana 2 Lite and Omni to instantly reimagine any space. Upload a photo, swipe through tailored design concepts, and see Omni bring the details to life in cinematic motion. Try out the demo app in AI Studio: goo.gle/443xPqw
About us
Our goal is to equip developers with the most advanced models to build new applications and helpful tools to write better and faster code, and to make it easy to integrate across platforms and devices.
- Website
- https://goo.gle/ai-devs
- Industry
- Technology, Information and Internet
- Company size
- 10,001+ employees
Updates
-
Today, we released Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation: https://goo.gle/4uVlyA9 It supports over 70 languages and starts translating as soon as you start talking, streaming translations while listening to what you say next. No awkward pauses or choppy audio, just real connection without language barriers. So, how does it work? 🤔 The model is able to make split-second decisions to juggle speed and translation quality so conversations actually feel fluid, human, and natural. In order to do this, the model must receive and contextualize the input while simultaneously outputting the translated speech. Through this process, Gemini 3.5 Live Translate manages to stay mere seconds behind each speaker and can even maintain pacing, pitch, and intonation across extended sessions. See it in action below, or try it yourself in the Google Translate app on iOS & Android.
-
Hear the architects of Gemini reflect on their journey to continue pushing the frontier of AI, on this episode of Release Notes: https://goo.gle/4aj07R8 Jeff Dean, Koray Kavukcuoglu, Oriol Vinyals, and Noam Shazeer sit down on camera together to share a behind-the-scenes look at the people behind the model, and how they saw the vision come together.
-
Gemini 3.5 Flash’s real-world agentic capabilities are already driving meaningful progress for leading organizations: https://goo.gle/3RSz20J Check out how our partners are transforming workflows with 3.5 Flash: 🛍 Shopify is running subagents in parallel to analyze complex data over a long horizon for more accurate merchant growth forecasts at a global scale. 🔍 Ramp is using Gemini 3.5 Flash to enable smarter, more reliable OCR through multimodal understanding of complex invoices combined with reasoning over historical patterns. ☁️ Salesforce is integrating Gemini 3.5 Flash into Agentforce to reliably automate complicated enterprise tasks by deploying multiple subagents that retain context and execute complex, multi-turn tool calling. 💡 Databricks is using agentic workflows to monitor and retrieve real-time information, reason across massive datasets to diagnose issues, identify fixes and propose solutions for data scientists. 📈 Macquarie Group is piloting how Gemini 3.5 Flash can accelerate customer onboarding by reasoning over complex 100+ page documents, retrieving relevant information and making reliable recommendations with low latency. 📃 Xero is deploying agents to autonomously manage complex, multi-week workflows, such as identifying suppliers and gathering information for 1099 tax forms, enabling small businesses to automate tedious admin tasks.
-
Our new model, Gemini Omni, is designed to create anything from any input, starting with video. Here’s how it works 👇 - World understanding: Gemini Omni is built on Gemini's vast knowledge of history, science, and culture, so it can produce videos that are grounded in how the world actually works. - Reference anything: Gemini Omni extends Gemini's native multimodality, allowing you to blend combinations of text, audio, image, and video inputs into a high-quality, consistent video. - Conversational editing: Gemini Omni allows you to edit your videos using natural language (like Nano Banana, but for video), so you can easily change your characters, settings, and styles by just describing what you want. Google AI Plus, Pro, and Ultra subscribers can access Gemini Omni Flash today on the Gemini App, Google Flow, Google Flow Music, and at no cost on YouTube Shorts and the YouTube Create app.
-
We’ve launched a brand-new intelligent Search box! Here's what that means: + An upgrade to the Search experience with our most advanced Gemini 3.5 models, bringing with them our latest agentic capabilities + You can ask across modalities (text, images, files, and videos) and Search can reason across them all + We're combining AI Overviews and AI Mode into one seamless AI Search experience, so you can ask follow-up questions, build context, and receive even more tailored and personalized responses This new AI Search experience is now live across desktop and mobile, worldwide.
-
✨ Introducing Gemini 3.5, our latest family of models combining frontier intelligence with action. The series sets a new standard for agentic models that don't just reason, they execute. Today, we are launching Gemini 3.5 Flash ⚡️ → The model accelerates rapid prototyping and exploration by dynamically testing alternate paths and solutions. It also runs large-scale systems for tasks like document extraction and classification. → Gemini 3.5 Flash executes multi-step workflows autonomously. Spin up code, execute, and iterate in seconds while scaling long-horizon tasks with reliability. → With latency tuned for the speed of thought, the model delivers quick responses that keep you in a 'flow state' during intense, real-time coding sessions. Access Gemini 3.5 Flash via the Gemini API, Google Antigravity, Google AI Studio, and Android Studio. And stay tuned for Gemini 3.5 Pro coming next month!
-
Build production-ready solutions with Google DeepMind's Gemini for Developers course. Registration is officially open for this specialization series from Coursera that teaches you how to: - Reason & Act: Build AI apps that don't just generate text, but reason through complex tasks - Connect & Automate: Use function calling to connect Gemini with real-world tools - Scale with Confidence: Build, test, and deploy scalable AI systems Start building with Gemini today → https://goo.gle/4nIacgh
-
We’re expanding the Gemini API File Search tool 🔍 with 3 new updates that enable developers to more easily build multimodal RAG systems with enhanced precision → goo.gle/4tR5bnD Check out the new features ⬇️ + Multimodal Support: By leveraging our Gemini Embedding 2 model, File Search can now reason across image and text simultaneously. + Custom Metadata Filtering: Bring structure to unstructured data by tagging files with custom key-value labels. This pre-filters your data and boosts search speed. + Exact citations: File Search can now capture and return the exact source (down to the page number) for every piece of information indexed. And see multimodal File Search in action with our example app in Google AI Studio. Chat with your entire image and doc library, ask questions, and trace answers back to the source: goo.gle/4tKSz1k
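To make the metadata filtering and citation ideas above concrete, here is a minimal in-memory sketch. It is not the Gemini API File Search interface: the `Chunk` class, the `search` function, the sample data, and the two-dimensional embeddings are all hypothetical and exist only to show the pattern of narrowing candidates by key-value labels before ranking by similarity, then returning the page each hit came from.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical indexed unit: text, an embedding, labels, and a source page."""
    text: str
    embedding: List[float]
    metadata: Dict[str, str] = field(default_factory=dict)
    page: int = 0


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(chunks: List[Chunk], query_embedding: List[float],
           metadata_filter: Dict[str, str], top_k: int = 2) -> List[Chunk]:
    """Pre-filter by exact key-value match, then rank the survivors by similarity."""
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in metadata_filter.items())
    ]
    ranked = sorted(candidates,
                    key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return ranked[:top_k]


# Made-up sample data; a real system would compute embeddings with a model.
CHUNKS = [
    Chunk("Q3 invoice totals", [0.9, 0.1], {"team": "finance"}, page=4),
    Chunk("Office floor plan", [0.8, 0.2], {"team": "facilities"}, page=1),
    Chunk("Travel policy", [0.1, 0.9], {"team": "finance"}, page=12),
]

for hit in search(CHUNKS, [1.0, 0.0], {"team": "finance"}, top_k=1):
    print(f"{hit.text} (page {hit.page})")
```

Because the filter runs first, the similarity ranking only touches chunks that carry the requested labels, which is one plausible reason pre-filtering can speed up search.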
-
Speed up your Gemma 4 workflows by up to 3x with Multi-Token Prediction (MTP) drafters: https://goo.gle/42beiDF Standard LLM inference is fundamentally memory-bandwidth bound, creating a latency bottleneck as billions of parameters travel from VRAM just to generate a single token. We're working to ease this bottleneck with MTP drafters for Gemma 4. A drafter is a tiny, hyper-efficient model that runs alongside your “target” (or main) Gemma 4 model. By using a specialized speculative decoding architecture to decouple token generation from verification, these drafters deliver up to a 3x speedup without any degradation in output quality or reasoning logic. By pairing the model with its drafter, developers are able to achieve: — Improved responsiveness — Supercharged local development — Faster on-device performance — Frontier-class reasoning without degradation MTP drafters for Gemma 4 are available today under the same open-source Apache 2.0 license. Download the weights today: Kaggle - https://goo.gle/42T5oe1 Hugging Face - https://goo.gle/3QLsmkN
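For readers new to speculative decoding, the draft-then-verify loop can be sketched with toy models. This is a generic greedy variant, not the Gemma 4 MTP architecture: `target_model`, `draft_model`, and the draft length `k` are made-up stand-ins, and each verification round is counted as one target pass on the assumption that a real system scores all drafted positions in a single batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]


def target_model(ctx: List[Token]) -> Token:
    """Toy stand-in for the large model: a deterministic next-token function."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_model(ctx: List[Token]) -> Token:
    """Toy stand-in for the drafter: agrees with the target most of the time."""
    guess = target_model(ctx)
    return (guess + 1) % 50 if len(ctx) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(target: Model, drafter: Model, prompt: List[Token],
                       n_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Draft k tokens cheaply, then let the target accept or correct them."""
    out = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0
    while len(out) < goal:
        # 1. The drafter proposes k tokens autoregressively.
        drafts: List[Token] = []
        for _ in range(k):
            drafts.append(drafter(out + drafts))
        # 2. The target checks every drafted position (counted as one pass).
        target_passes += 1
        for i in range(k + 1):
            expected = target(out + drafts[:i])
            if i < k and drafts[i] == expected:
                continue  # draft token i matches the target, keep checking
            # First mismatch (or all k accepted): keep the accepted prefix
            # and append the target's own token for this position.
            out.extend(drafts[:i])
            out.append(expected)
            break
    return out[:goal], target_passes


prompt = [1, 2, 3]
tokens, passes = speculative_decode(target_model, draft_model, prompt, n_new=20)
print(tokens == greedy_decode(target_model, prompt, 20), passes)
```

Every kept token is either a draft the target agreed with or the target's own choice, so the output matches plain greedy decoding with the target alone. The speedup comes from needing fewer target passes than generated tokens.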