Gemini 2.5 Flash-Lite

Gemini 2.5 Flash-Lite is our most balanced Gemini model, optimized for low latency use cases. It comes with the same capabilities that make other Gemini 2.5 models helpful, such as the ability to turn thinking on at different budgets, connecting to tools like Grounding with Google Search and code execution, multimodal input, and a 1 million-token context length.

2.5 Flash-Lite

Try in Agent Studio Deploy example app

Note: "Deploy example app" requires a Google Cloud project with billing and Agent Platform API enabled.

Model ID	`gemini-2.5-flash-lite`
Modalities	Text Input and output Image Input only Audio Input only Video Input only
Token limits	Context window	1,048,576
Token limits	Maximum output tokens	65,535 (default)
Capabilities	Supported Grounding with Google Search Code execution Supervised fine-tuning Continuous tuning Preference tuning Tuning checkpoints System instructions Function calling Count Tokens Structured output Thinking Implicit context caching Explicit context caching Not supported Gemini Live API Chat completions Content Credentials (C2PA)
Consumption options	Provisioned Throughput Supported Batch inference Supported Pay-as-you-go Priority PayGo Supported Fixed quota Not supported
Consumption options	See Consumption options for more information.
Input size limit	500 MB
Technical specifications	Image	Maximum images per prompt: 3,000 Maximum file size per file for inline data or direct uploads through the console: 7 MB Maximum file size per file from Google Cloud Storage: 30 MB Maximum number of output images per prompt: 10 Supported MIME types: `image/png`, `image/jpeg`, `image/webp`, `image/heic`, `image/heif`
	Text	Maximum number of files per prompt: 3,000 Maximum number of pages per file: 1,000 Maximum file size per file for the API or Cloud Storage imports: 50 MB(application/pdf) or 7 MB(text/plain) Maximum file size per file for direct uploads through the console: 7 MB Supported MIME types: `application/pdf`, `text/plain`
	Video	Maximum video length (with audio): Approximately 45 minutes Maximum video length (without audio): Approximately 1 hour Maximum number of videos per prompt: 10 Supported MIME types: `video/x-flv`, `video/quicktime`, `video/mpeg`, `video/mpegs`, `video/mpg`, `video/mp4`, `video/webm`, `video/wmv`, `video/3gpp`
	Audio	Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens Maximum number of audio files per prompt: 1 Supported MIME types: `audio/x-aac`, `audio/flac`, `audio/mp3`, `audio/m4a`, `audio/mpeg`, `audio/mpga`, `audio/mp4`, `audio/ogg`, `audio/pcm`, `audio/wav`, `audio/webm`
	Parameter defaults	Temperature: 0.0-2.0 (default 1.0) topP: 0.0-1.0 (default 0.95) topK: 64 (fixed) candidateCount: 1–8 (default 1)
Supported regions	Model availability	Global global United States us-central1 us-east1 us-east4 us-east5 us-south1 us-west1 us-west4 Europe europe-central2 europe-north1 europe-southwest1 europe-west1 europe-west4 europe-west8 europe-west9
Supported regions	See Deployments and endpoints for more information.
Knowledge cutoff date	January 2025
Versions	`gemini-2.5-flash-lite` Launch stage: GA Release date: July 22, 2025 Retirement date: October 16, 2026
Security controls	Online prediction	Data residency CMEK VPC-SC AXT
	Batch inference	Data residency CMEK VPC-SC AXT
	Tuning	Data residency CMEK VPC-SC AXT
	Context caching	Data residency CMEK VPC-SC AXT
	RAG Engine	Data residency CMEK VPC-SC AXT
	Grounding with Google Search and Grounding with Google Maps	Data residency CMEK VPC-SC AXT
	See Security controls for more information.
Pricing	See Pricing.

2.5 Flash-Lite

Caution: gemini-2.5-flash-lite-preview-09-2025 will be discontinued on July 9, 2026. Update your application to use gemini-2.5-flash-lite or other supported model.

Try in Agent Studio Deploy example app

Note: "Deploy example app" requires a Google Cloud project with billing and Agent Platform API enabled.

Model ID	`gemini-2.5-flash-lite-preview-09-2025`
Modalities	Text Input and output Image Input only Audio Input only Video Input only
Token limits	Context window	1,048,576
Token limits	Maximum output tokens	65,535 (default)
Capabilities	Supported Grounding with Google Search Code execution System instructions Function calling Count Tokens Structured output Thinking Implicit context caching Explicit context caching Not supported Supervised fine-tuning Continuous tuning Preference tuning Tuning checkpoints Gemini Live API Chat completions Content Credentials (C2PA)
Consumption options	Provisioned Throughput Supported Batch inference Not supported Pay-as-you-go Standard PayGo Supported Fixed quota Not supported
Consumption options	See Consumption options for more information.
Technical specifications	Image	Maximum images per prompt: 3,000 Maximum file size per file for inline data or direct uploads through the console: 7 MB Maximum file size per file from Google Cloud Storage: 30 MB Maximum number of output images per prompt: 10 Supported MIME types: `image/png`, `image/jpeg`, `image/webp`, `image/heic`, `image/heif`
	Text	Maximum number of files per prompt: 3,000 Maximum number of pages per file: 1,000 Maximum file size per file for the API or Cloud Storage imports: 50 MB(application/pdf) or 7 MB(text/plain) Maximum file size per file for direct uploads through the console: 7 MB Supported MIME types: `application/pdf`, `text/plain`
	Video	Maximum video length (with audio): Approximately 45 minutes Maximum video length (without audio): Approximately 1 hour Maximum number of videos per prompt: 10 Supported MIME types: `video/x-flv`, `video/quicktime`, `video/mpeg`, `video/mpegs`, `video/mpg`, `video/mp4`, `video/webm`, `video/wmv`, `video/3gpp`
	Audio	Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens Maximum number of audio files per prompt: 1 Supported MIME types: `audio/x-aac`, `audio/flac`, `audio/mp3`, `audio/m4a`, `audio/mpeg`, `audio/mpga`, `audio/mp4`, `audio/ogg`, `audio/pcm`, `audio/wav`, `audio/webm`
	Parameter defaults	Temperature: 0.0-2.0 (default 1.0) topP: 0.0-1.0 (default 0.95) topK: 64 (fixed) candidateCount: 1–8 (default 1)
Supported regions	Model availability	Global global
Supported regions	See Deployments and endpoints for more information.
Knowledge cutoff date	January 2025
Versions	`gemini-2.5-flash-lite-preview-09-2025` Launch stage: Public preview Release date: September 25, 2025 Retirement date: July 9, 2026
Pricing	See Pricing.

Gemini 2.5 Flash-Lite Stay organized with collections Save and categorize content based on your preferences.

2.5 Flash-Lite

2.5 Flash-Lite

Gemini 2.5 Flash-Lite