Gemini CLI: Generalist Agent and Parallel Sub-agent Execution
Gemini CLI v0.32.0 introduces first-class support for a built-in generalist sub-agent and enables parallel execution of agent-type tools, allowing the orchestrator to delegate and run multiple sub-tasks concurrently. The release formalizes the Kind.Agent classification that differentiates agent tools from standard tools within the scheduler, enabling contiguous parallel admission specifically for agent calls.
Generalist Agent Now Active
With v0.32.0, Google has enabled the built-in generalist agent within Gemini CLI's core scheduling architecture. The generalist sub-agent is, by design, a full-capability peer of the main orchestrating agent — it inherits the same tools, model configuration, and context window. This makes it a flexible fallback for open-ended delegation when a specialized custom sub-agent (defined in ~/.gemini/agents/) does not exist for a given task domain.
Previously, sub-agents in Gemini CLI were exclusively user-defined: developers wrote Markdown files with YAML frontmatter to describe a persona, capability scope, and instructions. While powerful, this model required explicit authoring for every delegation target. The generalist agent fills the gap — the orchestrator can now hand off a task to a capable agent without needing a domain-specific agent definition in place.
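The fallback behavior described above can be sketched as a simple lookup. This is an illustrative model only, not Gemini CLI's actual code: the `AgentDefinition` type, the `pick_delegate` function, the `"generalist"` identifier, and the idea of keying custom agents by task domain are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical identifier for the built-in agent; the real name may differ.
GENERALIST = "generalist"


@dataclass(frozen=True)
class AgentDefinition:
    """Stand-in for a user-defined sub-agent loaded from a Markdown file."""
    name: str
    description: str


def pick_delegate(domain: str, custom_agents: dict[str, AgentDefinition]) -> str:
    """Return the custom agent registered for `domain`, else the generalist."""
    agent = custom_agents.get(domain)
    return agent.name if agent is not None else GENERALIST


# One custom agent exists; any other domain falls back to the generalist.
custom = {
    "security": AgentDefinition("security-reviewer", "Audits code for vulnerabilities"),
}
print(pick_delegate("security", custom))
print(pick_delegate("database-migration", custom))
```

In this model, authoring a specialized agent only changes which branch is taken; delegation itself never fails for lack of a definition.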
Kind.Agent Classification and Parallel Execution
A key architectural addition in this release is the introduction of Kind.Agent as a distinct tool classification within the scheduler. Prior to this change, all tool calls — whether invoking a shell command, reading a file, or spawning a sub-agent — were treated uniformly by the execution pipeline. The new classification allows the scheduler to apply different admission logic to agent calls.
Concretely, this release enables contiguous parallel admission for Kind.Agent tools. When the orchestrating agent emits multiple sub-agent calls in a single turn, the scheduler can admit and run them in parallel rather than queuing them sequentially. The practical effect is significant for complex, multi-step agentic workflows: a task that previously required sub-agents to execute serially (e.g., running a code analysis agent, then a security review agent, then a documentation agent) can now run all three at once, provided they do not depend on each other's output. Wall-clock time then approaches the duration of the slowest sub-agent rather than the sum of all three.
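A minimal sketch of contiguous parallel admission follows. It is not the Gemini CLI scheduler: the `ToolCall` type, the `admit` and `run_all` functions, and the simulated durations are hypothetical, and the sketch assumes that non-agent calls continue to run one at a time and that "contiguous" means adjacent in the emitted call sequence.

```python
import asyncio
import time
from dataclasses import dataclass
from enum import Enum
from itertools import groupby


class Kind(Enum):
    AGENT = "agent"
    OTHER = "other"  # stands in for every non-agent tool kind


@dataclass
class ToolCall:
    name: str
    kind: Kind
    seconds: float  # simulated execution time


def admit(calls):
    """Group calls into batches: each contiguous run of agent calls forms
    one batch; every other call is a batch of its own."""
    batches = []
    for kind, run in groupby(calls, key=lambda c: c.kind):
        run = list(run)
        if kind is Kind.AGENT:
            batches.append(run)
        else:
            batches.extend([c] for c in run)
    return batches


async def execute(call):
    await asyncio.sleep(call.seconds)  # placeholder for real work
    return call.name


async def run_all(calls):
    """Run batches in order; calls inside a batch run concurrently."""
    results = []
    for batch in admit(calls):
        results.extend(await asyncio.gather(*(execute(c) for c in batch)))
    return results


calls = [
    ToolCall("read_file", Kind.OTHER, 0.01),
    ToolCall("code_analysis", Kind.AGENT, 0.05),
    ToolCall("security_review", Kind.AGENT, 0.05),
    ToolCall("documentation", Kind.AGENT, 0.05),
    ToolCall("write_file", Kind.OTHER, 0.01),
]

start = time.perf_counter()
results = asyncio.run(run_all(calls))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s, serial would be about {sum(c.seconds for c in calls):.2f}s")
```

The three agent calls land in a single batch, so they cost roughly one 0.05-second slot instead of three, while result order still matches the order in which the calls were emitted.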
This directly addresses a long-standing limitation in Gemini CLI's multi-agent architecture. Community implementations like Maestro-Gemini had worked around the sequential constraint with custom orchestration layers, but native parallel admission removes the need for such workarounds in standard agent-to-agent workflows.
A2A Streaming and Task Tracker Foundation
The v0.32.0 release also ships improvements to A2A (Agent-to-Agent) streaming reassembly. Remote agent calls, where Gemini CLI delegates to an external A2A-compatible service over HTTP, previously mishandled some edge cases in streaming content extraction and in task continuity across reconnections. This release hardens both paths.
A task tracker foundation has been added as a service layer within the core. While not yet exposed as a user-facing feature, the tracker infrastructure is designed to maintain persistent state about ongoing agentic work — a prerequisite for features like mid-session task resumption and detailed audit logging of multi-agent operations.
Model Steering and Workspace Support
Model steering — the ability to instruct the CLI to use a different model mid-session — is now functional within workspace environments (Gemini Code Assist enterprise configurations). Combined with the generalist agent and parallel execution capabilities, this extends the v0.32.0 multi-agent improvements to users operating in managed workspace deployments.
Other Notable Changes
Beyond the agent architecture improvements, v0.32.0 includes several reliability fixes: HTTP 499 responses are now mapped to retryable quota errors, preventing spurious session failures under transient backend pressure. Quota error fallback logic has been extended across all authentication types, and code assist retry logic now applies to all user tiers rather than a subset.
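The 499 handling can be pictured as a status-to-error mapping that feeds a retry loop. The sketch below is an assumption-laden illustration, not the CLI's implementation: the exception classes, `classify`, `call_with_retry`, and the attempt limit are hypothetical names and values, and a real client would normally add a backoff delay between attempts.

```python
class RetryableQuotaError(Exception):
    """Transient failure; the caller may try again."""


class FatalApiError(Exception):
    """Failure that retrying will not fix."""


# Per the release notes, 499 is treated as a retryable quota error.
RETRYABLE_QUOTA_STATUSES = {499}


def classify(status: int) -> Exception:
    """Map an HTTP status code to the error type the retry loop acts on."""
    if status in RETRYABLE_QUOTA_STATUSES:
        return RetryableQuotaError(f"HTTP {status}")
    return FatalApiError(f"HTTP {status}")


def call_with_retry(send, max_attempts: int = 3):
    """Call `send()` until it returns status 200 or attempts run out.

    `send` is any zero-argument callable returning (status, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        error = classify(status)
        if isinstance(error, RetryableQuotaError) and attempt < max_attempts:
            continue  # transient: try again (backoff omitted for brevity)
        raise error


# Simulated backend: one transient 499, then success.
responses = iter([(499, ""), (200, "ok")])
print(call_with_retry(lambda: next(responses)))
```

Without the mapping, the first 499 would surface as a fatal error and end the session; with it, the same response costs only one extra attempt.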
The release comprises 81 merged pull requests from 37 contributors.