Gemini API: Computer Use Now Built Into Gemini 3.5 Flash


Google has launched a public preview of the Computer Use tool built natively into Gemini 3.5 Flash, enabling developers to build agents that can see, reason, and act across browser, mobile, and desktop environments without a separate model. Previously available only as the standalone Gemini 2.5 Computer Use model, the capability is now a built-in tool accessible via the Gemini API and the Gemini Enterprise Agent Platform. The release includes simplified intent-based actions, configurable enterprise safety policies, and advanced prompt injection detection. Gemini 3.5 Flash scores 78.4% on OSWorld-Verified, matching frontier models while running approximately 4x faster.


Computer Use Becomes a Native Gemini 3.5 Flash Capability

Google has launched a public preview of Computer Use built directly into Gemini 3.5 Flash. Developers can now build agents that see, reason, and act across browser, mobile, and desktop environments using a single model, without wiring up a separate specialized model.
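The see, reason, and act cycle can be pictured as a simple loop in which one model call handles both the reasoning and the choice of action. The following is only a minimal sketch under stated assumptions: Action, FakeEnvironment, run_agent, and scripted_model are hypothetical names invented for illustration, the environment returns a text summary instead of a screenshot, and nothing here calls or mirrors the actual Gemini API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A hypothetical UI action proposed by the model."""
    name: str            # e.g. "click", "type", or "done"
    argument: str = ""   # e.g. a target label or the text to type


class FakeEnvironment:
    """Stand-in for a browser, mobile, or desktop surface (hypothetical)."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def observe(self) -> str:
        # A real environment would return a screenshot; this returns a summary.
        return f"screen after {len(self.log)} action(s)"

    def execute(self, action: Action) -> None:
        self.log.append(f"{action.name}:{action.argument}")


def run_agent(
    goal: str,
    env: FakeEnvironment,
    propose_action: Callable[[str, str], Action],
    max_steps: int = 10,
) -> List[str]:
    """Observe, ask the single model for the next action, then execute it."""
    for _ in range(max_steps):
        observation = env.observe()                  # see
        action = propose_action(goal, observation)   # reason
        if action.name == "done":
            break
        env.execute(action)                          # act
    return env.log


def scripted_model(goal: str, observation: str) -> Action:
    """Placeholder for the model call; a real agent would query the model here."""
    if observation == "screen after 0 action(s)":
        return Action("click", "search box")
    if observation == "screen after 1 action(s)":
        return Action("type", goal)
    return Action("done")


env = FakeEnvironment()
history = run_agent("weather in Paris", env, scripted_model)
print(history)
```

The point of the sketch is structural: there is a single propose_action callable rather than one call to a reasoning model followed by a second call to a separate action model. The max_steps bound is an assumption added so the loop always terminates.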

From Standalone Model to Built-In Tool

Computer Use was previously available only as a standalone Gemini 2.5 Computer Use model. With this release, the capability becomes a built-in tool accessible directly through the Gemini API and the Gemini Enterprise Agent Platform. Folding it into the general-purpose Flash model means agents no longer have to route between a reasoning model and a separate action model.

Simpler Actions and Enterprise Safety

The release introduces simplified intent-based actions that make it easier to express what an agent should accomplish. For enterprise deployments, Google added configurable safety policies and advanced prompt injection detection, addressing two of the biggest concerns with agents that can control real interfaces: actions the operator never intended, and malicious instructions hidden in on-screen content.
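One way to picture a configurable safety policy is as a gate that inspects each intent-level action before it runs. The sketch below is an illustration only: IntentAction, SafetyPolicy, Decision, and the intent names are hypothetical and do not reflect the policy format or action schema Google ships. It also does not attempt prompt injection detection, which is a separate problem.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Decision(Enum):
    ALLOW = "allow"      # run the action without interruption
    CONFIRM = "confirm"  # pause and ask a human first
    BLOCK = "block"      # never run the action


@dataclass(frozen=True)
class IntentAction:
    """A hypothetical intent-level action: what to do, not which pixel to click."""
    intent: str   # e.g. "open_page" or "submit_payment"
    target: str   # e.g. a URL or a form name


@dataclass(frozen=True)
class SafetyPolicy:
    """A hypothetical policy that an operator configures per deployment."""
    blocked_intents: FrozenSet[str] = frozenset()
    confirm_intents: FrozenSet[str] = frozenset()

    def evaluate(self, action: IntentAction) -> Decision:
        # Blocking takes precedence over confirmation.
        if action.intent in self.blocked_intents:
            return Decision.BLOCK
        if action.intent in self.confirm_intents:
            return Decision.CONFIRM
        return Decision.ALLOW


policy = SafetyPolicy(
    blocked_intents=frozenset({"delete_account"}),
    confirm_intents=frozenset({"submit_payment"}),
)

for candidate in (
    IntentAction("open_page", "https://example.com"),
    IntentAction("submit_payment", "checkout form"),
    IntentAction("delete_account", "settings page"),
):
    print(candidate.intent, "->", policy.evaluate(candidate).value)
```

Expressing actions as intents rather than raw coordinates is what makes a gate like this straightforward to write, since the policy can match on the meaning of the action instead of guessing it from a click position.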

Frontier Accuracy at 4x the Speed

Gemini 3.5 Flash scores 78.4% on OSWorld-Verified, a benchmark for computer-use agents, matching frontier models while running approximately 4x faster. The combination of accuracy and speed makes real-time, interactive agent workflows far more practical.