Gemini 3.5 Flash Gets Computer Use as a Built-In Tool
On June 24, Google DeepMind announced that Gemini 3.5 Flash now supports Computer Use natively. The capability is no longer a separate model; it is built directly into the main Flash model.
Previously, using Gemini for screen control required calling the standalone Gemini 2.5 Computer Use model. Now 3.5 Flash ships with this ability alongside function calling, Search, and Maps Grounding in the same toolkit.
What It Can Do
In short, 3.5 Flash can now "see" the screen, understand what's happening, and operate it with mouse and keyboard actions. It covers browsers, desktop applications, and mobile.
This means developers can build agents that open web pages, fill out forms, click buttons, read results, and complete multi-step tasks without human intervention. Google highlights continuous software testing and cross-application knowledge work automation as typical use cases.
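The see-decide-act cycle described above can be sketched as a simple loop. This is a minimal illustration, not Google's API: `Action`, `run_agent`, `scripted_model`, and the three callables are hypothetical names, and a real agent would replace `propose_action` with a call to the model and `execute` with a browser or OS automation layer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical UI action; the real tool's action schema may differ."""
    kind: str          # e.g. "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    goal: str,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the model for one action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):          # hard cap so a confused agent stops
        screenshot = take_screenshot()  # "see" the current screen
        action = propose_action(goal, screenshot, history)
        history.append(action)
        if action.kind == "done":       # model signals the task is complete
            break
        execute(action)                 # mouse or keyboard step
    return history


def scripted_model(script: List[Action]):
    """Stand-in for the model: replays a fixed script, then says "done"."""
    steps = iter(script)

    def propose(goal: str, screenshot: bytes, history: List[Action]) -> Action:
        return next(steps, Action("done"))

    return propose
```

The step cap and the explicit "done" action are design choices for the sketch: together they keep a multi-step task from running indefinitely when the model never reports completion.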
Safety Measures
Running Computer Use in open environments carries significantly more risk than closed API calls. Google has implemented several safeguards in 3.5 Flash:
- Targeted adversarial training to help the model recognize prompt injection attacks
- Optional confirmation mechanisms for sensitive or irreversible actions
- Automatic detection of indirect prompt injection, followed by task termination
Google recommends combining these safety features with sandboxing, human-in-the-loop verification, and strict access controls.
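A human-in-the-loop check for sensitive or irreversible actions might look like the sketch below. Everything here is an assumption for illustration: the action names in `SENSITIVE_KINDS`, the `guarded_execute` helper, and the `confirm` callback are hypothetical and do not reflect how Google's confirmation mechanism is actually exposed.

```python
from typing import Callable, FrozenSet

# Hypothetical list of actions an operator wants a human to approve first.
SENSITIVE_KINDS: FrozenSet[str] = frozenset(
    {"submit_payment", "delete_file", "send_email"}
)


def guarded_execute(
    kind: str,
    execute: Callable[[str], None],
    confirm: Callable[[str], bool],
    sensitive: FrozenSet[str] = SENSITIVE_KINDS,
) -> bool:
    """Run an action, asking a human first when it is marked sensitive.

    Returns True if the action ran and False if the human declined it.
    """
    if kind in sensitive and not confirm(kind):
        return False        # declined: nothing is executed
    execute(kind)
    return True
```

In an interactive setup, `confirm` could be as simple as `lambda kind: input(f"Allow {kind}? [y/N] ").lower() == "y"`. A gate like this complements sandboxing and access controls rather than replacing them.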
Partner Feedback
Google shared early feedback from several partners:
- Browserbase provides an online demo environment for hands-on testing
- Browser Use's CEO praised the model's performance in browser scenarios
- UiPath's Senior Director highlighted value in enterprise automation
How to Access
Computer Use in 3.5 Flash is available through both the Gemini API and the Gemini Enterprise Agent Platform. Google provides reference implementations and documentation.
Context
Computer Use, which lets AI directly control computer screens, has been one of the hottest areas in the agent space over the past year. Anthropic's Claude was first to market with Computer Use in late 2024, and OpenAI followed with Operator. Google now embeds this capability into Flash, a lighter-weight model, lowering the barrier to entry.
Previously, Gemini 2.5's Computer Use was a separate model, meaning you had to sacrifice Flash's speed and cost advantages to get screen control. With this integration, developers no longer have to choose between features and efficiency.




