Gemini Embedding 2 Reaches General Availability
Google announced the general availability of Gemini Embedding 2 via the Gemini API and Gemini Enterprise Agent Platform on April 22, 2026. The model (Google's first natively multimodal embedding model, previewed in March 2026) can process text, images, video, audio, and PDF documents and map all modalities into a unified 3,072-dimensional vector space. The GA milestone moves the model from preview to production-ready status, enabling enterprises to commit to building search, retrieval, and classification applications that combine multiple data types in a single pipeline. Pricing is confirmed at $0.20 per million text input tokens, with the Batch API available at 50% off for offline workloads.
Gemini Embedding 2 Is Now Production-Ready
Google announced on April 22, 2026, that Gemini Embedding 2 has reached general availability, transitioning from the public preview launched in March 2026 to a stable, production-supported model available through the Gemini API and the Gemini Enterprise Agent Platform.
What Is Gemini Embedding 2?
Gemini Embedding 2 (gemini-embedding-2) is Google's first natively multimodal embedding model. Unlike traditional embedding models that handle only text, Gemini Embedding 2 accepts text, images, video, audio, and PDF documents as input and maps all modalities into a single unified numerical vector space. This means developers can run cross-modal semantic searches (for example, finding images that are conceptually similar to a text query, or ranking video clips against a written description) without building separate embedding pipelines for each modality.
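Because every modality lands in the same vector space, cross-modal search reduces to nearest-neighbor ranking, typically by cosine similarity. A minimal sketch of that ranking step, using toy 3-d vectors standing in for real 3,072-d embeddings (the function names are illustrative, not part of any API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec, items):
    """Sort (id, embedding) pairs by similarity to the query, best first."""
    return sorted(items, key=lambda it: cosine(query_vec, it[1]), reverse=True)

# A text-query embedding ranked against image embeddings (toy values).
query = [1.0, 0.0, 0.0]
catalog = [("img_a", [0.9, 0.1, 0.0]), ("img_b", [0.0, 1.0, 0.0])]
print([item_id for item_id, _ in rank(query, catalog)])  # ['img_a', 'img_b']
```

The same ranking works regardless of which modality produced each vector, which is the practical payoff of a unified embedding space.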
Model Specifications
The model produces 3,072-dimensional embeddings by default, with flexible Matryoshka Representation Learning (MRL) dimensions available at 768, 1,536, and 3,072, allowing developers to trade off storage cost against retrieval accuracy. Automatic normalization is handled by the model for non-default dimensions, eliminating the manual normalization step that older embedding models required.
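To illustrate what MRL truncation involves, here is a sketch of the truncate-then-renormalize step that the article says the model now performs automatically at non-default dimensions (the helper is illustrative, and a toy vector stands in for a real 3,072-component embedding):

```python
import math

def truncate_and_normalize(embedding, dim):
    """MRL-style reduction: keep the first `dim` components, then L2-normalize
    so that cosine similarity remains meaningful at the smaller dimension."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

vec = [0.5, 0.5, 0.5, 0.5]          # stand-in for a 3,072-d embedding
small = truncate_and_normalize(vec, 2)  # stand-in for requesting 768 dims
print(len(small), round(sum(x * x for x in small), 6))  # 2 1.0
```

With GA, this renormalization is the model's job; the sketch only shows why unit-length vectors matter when you shrink dimensions to save storage.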
Input limits per request:
- Text: up to 8,192 tokens
- Images: up to 6 per request
- Audio: up to 180 seconds
- Video: up to 120 seconds
- PDFs: up to 6 pages
The model supports over 100 languages for text inputs.
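A small pre-flight check against the limits listed above can fail fast before a request is sent. This is a hypothetical sketch using the figures from this article; the helper and its parameter names are not part of the API:

```python
# Per-request limits as stated in the article (illustrative constants).
LIMITS = {
    "text_tokens": 8192,
    "images": 6,
    "audio_seconds": 180,
    "video_seconds": 120,
    "pdf_pages": 6,
}

def check_request(**counts):
    """Return a list of human-readable limit violations (empty if OK)."""
    return [f"{key}={value} exceeds {LIMITS[key]}"
            for key, value in counts.items() if value > LIMITS[key]]

print(check_request(text_tokens=9000, images=4))
# ['text_tokens=9000 exceeds 8192']
```

Validating locally avoids burning quota on requests the service would reject anyway.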
Pricing
Standard API pricing is $0.20 per 1 million text input tokens and $0.012 per image. The Batch API is available at 50% of standard pricing ($0.10 per million text tokens) for workloads that do not require real-time responses, making large-scale offline indexing pipelines significantly more accessible.
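The standard-versus-batch trade-off is easy to estimate up front. A sketch of the arithmetic using the published text-token rates (the helper name is illustrative):

```python
STANDARD_PER_M_TOKENS = 0.20  # USD per 1M text input tokens (from this article)
BATCH_DISCOUNT = 0.5          # Batch API is 50% of standard pricing

def text_cost(tokens, batch=False):
    """Estimated USD cost of embedding `tokens` text tokens."""
    rate = STANDARD_PER_M_TOKENS * (BATCH_DISCOUNT if batch else 1.0)
    return tokens / 1_000_000 * rate

print(text_cost(50_000_000))              # 10.0  (standard)
print(text_cost(50_000_000, batch=True))  # 5.0   (batch)
```

For a 50-million-token corpus, an offline batch run halves the bill from $10 to $5, which is why the article highlights batch pricing for indexing pipelines.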
From Preview to Production
During the preview phase, developers built prototypes ranging from advanced e-commerce discovery engines that match product images to natural-language queries, to video analysis tools that index content across modalities. The GA release provides the stability and service-level commitments required to move those projects into production. Applications using the preview model ID (gemini-embedding-2-preview) should migrate to gemini-embedding-2 to use the GA version.
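The migration described above is a one-line model-ID swap; nothing else in the request needs to change. A trivial sketch (the helper name is illustrative, not part of any SDK):

```python
# Model IDs as given in the article; the helper is hypothetical.
PREVIEW_ID = "gemini-embedding-2-preview"
GA_ID = "gemini-embedding-2"

def migrate_model_id(model_id: str) -> str:
    """Map the preview model ID to its GA equivalent; pass others through."""
    return GA_ID if model_id == PREVIEW_ID else model_id

print(migrate_model_id("gemini-embedding-2-preview"))  # gemini-embedding-2
```

In practice this means updating the model string wherever your code constructs embedding requests, then re-running your evaluation suite against the GA model before cutting over.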
Relevance for Gemini CLI and API Users
Gemini Embedding 2 GA is directly relevant for developers who build RAG pipelines, semantic search systems, or agent memory backends powered by Gemini. The GA milestone removes the uncertainty of preview-phase pricing and behavior changes, making it safe to commit engineering resources to production deployments that incorporate multimodal retrieval.