Gemini Embedding 2 Reaches General Availability


Google announced the general availability of Gemini Embedding 2 via the Gemini API and Gemini Enterprise Agent Platform on April 22, 2026. The model, Google's first natively multimodal embedding model and previewed in March 2026, can process text, images, video, audio, and PDF documents and map all modalities into a unified 3,072-dimensional vector space. The GA milestone moves the model from preview to production-ready status, enabling enterprises to commit to building search, retrieval, and classification applications that combine multiple data types in a single pipeline. Pricing is confirmed at $0.20 per million text input tokens, with the Batch API available at 50% off for offline workloads.


Gemini Embedding 2 Is Now Production-Ready

Google announced on April 22, 2026, that Gemini Embedding 2 has reached general availability, transitioning from the public preview launched in March 2026 to a stable, production-supported model available through the Gemini API and the Gemini Enterprise Agent Platform.

What Is Gemini Embedding 2?

Gemini Embedding 2 (gemini-embedding-2) is Google's first natively multimodal embedding model. Unlike traditional embedding models that handle only text, Gemini Embedding 2 accepts text, images, video, audio, and PDF documents as input and maps all modalities into a single unified numerical vector space. This means developers can run cross-modal semantic searches (for example, finding images that are conceptually similar to a text query, or ranking video clips against a written description) without building separate embedding pipelines for each modality.
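Because every modality lands in the same vector space, cross-modal search reduces to comparing vectors. The sketch below illustrates that idea with cosine similarity. The three-dimensional vectors and file names are made-up stand-ins for real embeddings, and no API call is made; it only shows how a text query vector could be ranked against stored image, video, and audio vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Return item names sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Hypothetical toy vectors; real embeddings would have up to 3,072 dimensions.
text_query = [0.9, 0.1, 0.0]
media_index = {
    "beach_photo.jpg": [0.8, 0.2, 0.1],
    "city_clip.mp4": [0.1, 0.9, 0.2],
    "podcast.mp3": [0.0, 0.2, 0.9],
}

ranking = rank_by_similarity(text_query, media_index)
print(ranking)
```

In a production system the linear scan would typically be replaced by a vector database or approximate nearest-neighbor index, but the comparison itself stays the same.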

Model Specifications

The model produces 3,072-dimensional embeddings by default, with flexible Matryoshka Representation Learning (MRL) dimensions available at 768, 1,536, and 3,072, allowing developers to trade off storage cost against retrieval accuracy. Automatic normalization is handled by the model for non-default dimensions, eliminating the manual normalization step that older embedding models required.
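To make the storage side of that trade-off concrete, here is a rough sketch that estimates raw index size at each MRL dimension, assuming 4-byte float32 storage and no index overhead (both assumptions, not figures from the announcement). It also includes a hypothetical client-side helper that shortens an already-stored vector to its leading dimensions and rescales it to unit length, which is the general property MRL-trained embeddings are designed to support; when the smaller dimension is requested from the model directly, the text above says normalization is automatic.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_bytes(num_vectors, dims):
    """Raw storage for num_vectors embeddings of the given dimensionality."""
    return num_vectors * dims * BYTES_PER_FLOAT32


def truncate_and_normalize(vec, dims):
    """Keep the first `dims` values of a vector and rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


# Raw size of a one-million-vector index at each supported dimension.
sizes = {d: index_size_bytes(1_000_000, d) for d in (768, 1536, 3072)}
for dims, size in sizes.items():
    print(f"{dims} dims: {size / 1e9:.2f} GB")

# Tiny illustrative vector, shortened from 3 to 2 dimensions.
short = truncate_and_normalize([3.0, 4.0, 12.0], 2)
print(short)
```

Under these assumptions, dropping from 3,072 to 768 dimensions cuts raw vector storage by a factor of four; the accuracy cost depends on the workload and would need to be measured.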

Input limits per request:

  • Text: up to 8,192 tokens
  • Images: up to 6 per request
  • Audio: up to 180 seconds
  • Video: up to 120 seconds
  • PDFs: up to 6 pages

The model supports over 100 languages for text inputs.
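The per-request limits listed above can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check; the EmbedRequest class, its field names, and the limit_violations function are hypothetical and not part of any Google SDK, and counting tokens, seconds, and pages is left to the caller.

```python
from dataclasses import dataclass

# Per-request limits as stated in the list above.
LIMITS = {
    "text_tokens": 8192,
    "images": 6,
    "audio_seconds": 180,
    "video_seconds": 120,
    "pdf_pages": 6,
}


@dataclass
class EmbedRequest:
    """Hypothetical summary of what a single embedding request contains."""
    text_tokens: int = 0
    images: int = 0
    audio_seconds: float = 0.0
    video_seconds: float = 0.0
    pdf_pages: int = 0


def limit_violations(request):
    """Return the names of all fields that exceed their per-request limit."""
    return [
        name
        for name, limit in LIMITS.items()
        if getattr(request, name) > limit
    ]


ok_request = EmbedRequest(text_tokens=8192, images=6)
too_big = EmbedRequest(images=7, video_seconds=150)

print(limit_violations(ok_request))
print(limit_violations(too_big))
```

Inputs that exceed a limit would need to be split into several requests, for example by chunking long text or cutting video into shorter segments.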

Pricing

Standard API pricing is $0.20 per 1 million text input tokens and $0.012 per image. The Batch API is available at 50% of standard pricing ($0.10 per million text tokens) for workloads that do not require real-time responses, making large-scale offline indexing pipelines significantly more accessible.
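As a quick worked example of those rates, the sketch below estimates the cost of an indexing job. The function names are illustrative. Only the text rate is discounted in batch mode here, because the batch figure quoted above is given for text tokens; image cost is computed at the standard per-image rate.

```python
TEXT_PRICE_PER_MILLION_TOKENS = 0.20  # USD, standard API
BATCH_TEXT_PRICE_PER_MILLION_TOKENS = 0.10  # USD, Batch API
IMAGE_PRICE = 0.012  # USD per image, standard API


def text_cost(tokens, batch=False):
    """Estimated USD cost of embedding the given number of text tokens."""
    rate = BATCH_TEXT_PRICE_PER_MILLION_TOKENS if batch else TEXT_PRICE_PER_MILLION_TOKENS
    return tokens / 1_000_000 * rate


def image_cost(count):
    """Estimated USD cost of embedding `count` images at the standard rate."""
    return count * IMAGE_PRICE


# Example job: 50 million text tokens and 1,000 images.
standard_text = text_cost(50_000_000)
batch_text = text_cost(50_000_000, batch=True)
images = image_cost(1_000)

print(f"text, standard: ${standard_text:.2f}")
print(f"text, batch:    ${batch_text:.2f}")
print(f"images:         ${images:.2f}")
```

For this example the text portion comes to about $10 at the standard rate and about $5 through the Batch API, with the 1,000 images adding about $12.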

From Preview to Production

During the preview phase, developers built prototypes ranging from advanced e-commerce discovery engines that match product images to natural-language queries, to video analysis tools that index content across modalities. The GA release provides the stability and service-level commitments required to move those projects into production. Applications using the preview model ID (gemini-embedding-2-preview) should migrate to gemini-embedding-2 to use the GA version.
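The model ID change described above can be handled in one place rather than scattered across a codebase. This is a small sketch under the assumption that the application keeps its model name in a dictionary-style config under a "model" key; the helper names and the config shape are hypothetical, and only the two model IDs come from the text above.

```python
PREVIEW_MODEL_ID = "gemini-embedding-2-preview"
GA_MODEL_ID = "gemini-embedding-2"


def migrate_model_id(model_id):
    """Map the preview model ID to the GA ID; leave any other ID unchanged."""
    return GA_MODEL_ID if model_id == PREVIEW_MODEL_ID else model_id


def migrate_config(config):
    """Return a copy of a config dict with its "model" entry migrated."""
    return {**config, "model": migrate_model_id(config["model"])}


# Hypothetical application config still pointing at the preview model.
old_config = {"model": "gemini-embedding-2-preview", "output_dims": 768}
new_config = migrate_config(old_config)
print(new_config)
```

The original config is left untouched, so the change can be rolled out and compared side by side before the preview ID is removed.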

Relevance for Gemini CLI and API Users

Gemini Embedding 2 GA is directly relevant for developers who build RAG pipelines, semantic search systems, or agent memory backends powered by Gemini. The GA milestone removes the uncertainty of preview-phase pricing and behavior changes, making it safe to commit engineering resources to production deployments that incorporate multimodal retrieval.