
Operator

OpenAI Operator is a computer-use agent that controls a cloud browser to book, shop, research, and fill in forms on the user's behalf.

Level 1

Operator launched in January 2025 as a research preview for ChatGPT Pro subscribers ($200/month). It runs in a sandboxed cloud browser, watches a virtual screen, and decides on clicks, scrolls, and keystrokes via a custom "CUA" (Computer-Using Agent) model. Typical tasks: booking a restaurant, ordering groceries, compiling a research doc.

Level 2

Operator's model, CUA, combines GPT-4o's vision capabilities with reasoning trained through reinforcement learning. It receives a screenshot, predicts a coordinate click or keystroke, and repeats until the task is done. Control passes back to the user for logins, CAPTCHAs, and payments; Operator never enters credit card details autonomously. Launch benchmarks: 38.1% on OSWorld, 58.1% on WebArena, and 87% on WebVoyager. Pricing is bundled into the Pro tier.
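
The screenshot, predict, act cycle with human handoff can be sketched roughly as follows. This is a minimal illustration, not OpenAI's implementation: `Step`, `run_task`, `scripted_policy`, the `max_steps` cap, and the callback signatures are all hypothetical names chosen for the example, and a scripted policy stands in for the model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Situations the text says are handed back to the human.
HANDOFF_TRIGGERS = {"login", "captcha", "payment"}


@dataclass
class Step:
    """One predicted step (hypothetical shape, for illustration only)."""
    kind: str                      # e.g. "click", "type", "done", or a handoff trigger
    payload: Optional[dict] = None


def run_task(
    policy: Callable[[bytes, List[Step]], Step],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Step], None],
    max_steps: int = 20,
) -> Tuple[str, List[Step]]:
    """Loop: screenshot -> predicted step -> execute, until done, handoff, or step cap."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        step = policy(screenshot, history)
        if step.kind in HANDOFF_TRIGGERS:
            # Stop before acting and return control to the user.
            return f"handoff:{step.kind}", history
        if step.kind == "done":
            return "done", history
        execute(step)
        history.append(step)
    return "step_limit", history


def scripted_policy(steps: List[Step]) -> Callable[[bytes, List[Step]], Step]:
    """Stand-in for the model: replays a fixed list of steps, then reports done."""
    remaining = iter(steps)

    def policy(screenshot: bytes, history: List[Step]) -> Step:
        return next(remaining, Step("done"))

    return policy


executed: List[Step] = []
status, history = run_task(
    policy=scripted_policy([
        Step("click", {"coordinate": [120, 340]}),
        Step("type", {"text": "table for two"}),
        Step("payment"),
    ]),
    take_screenshot=lambda: b"",   # placeholder for a real screen capture
    execute=executed.append,       # placeholder for a real browser driver
)
print(status, len(history))  # handoff:payment 2
```

The handoff check runs before `execute`, so a payment or login step is never performed by the loop itself; that ordering is the point of the sketch.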

Level 3

The CUA model outputs actions in a structured schema, illustratively {"action":"click","coordinate":[x,y]}, with analogous type and scroll actions. The inference loop takes screenshots at roughly 1 Hz, and every screenshot consumes tokens, so a single Operator session can cost an estimated $0.50-$3.00. Memory persists within a task but not across tasks. Guardrails include a blocked-domain list, halts before payment, and prompt-injection defenses that filter on-screen content. OpenAI exposed the CUA model to developers in March 2025 as the computer-use-preview model, available through the computer use tool in the Responses API.
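
To make the action schema and the blocked-domain guard concrete, here is a small sketch under stated assumptions. Only the click shape comes from the text above; the `text` and `delta` fields, the `parse_action` and `is_blocked` helpers, and the `blocked.example` domain are hypothetical and do not describe OpenAI's actual schema or blocklist.

```python
import json
from urllib.parse import urlparse

# Hypothetical blocklist; the real list is not public in this document.
BLOCKED_DOMAINS = {"blocked.example"}


def parse_action(raw: str) -> dict:
    """Validate one model-emitted action and normalize it to a flat dict."""
    action = json.loads(raw)
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")
    kind = action.get("action")
    if kind == "click":
        x, y = action["coordinate"]          # shape taken from the text
        return {"action": "click", "x": int(x), "y": int(y)}
    if kind == "type":
        # "text" is an assumed field name.
        return {"action": "type", "text": str(action["text"])}
    if kind == "scroll":
        # "delta" is an assumed field name (pixels to scroll).
        return {"action": "scroll", "delta": int(action["delta"])}
    raise ValueError(f"unsupported action: {kind!r}")


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


print(parse_action('{"action":"click","coordinate":[120,340]}'))
print(is_blocked("https://shop.blocked.example/cart"))
```

Matching on the parsed hostname rather than on a substring of the URL keeps a lookalike such as `notblocked.example` from being caught by accident.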

Why this matters now

Operator turned web automation into a first-party consumer AI product, joining a field that already included Anthropic Computer Use, Google Project Mariner, and Microsoft Copilot Vision, all announced in late 2024.

The takeaway for you
If you are a
Researcher
  • Screen-pixel input → action output · generalist web agent
  • 58.1% WebArena, 87% WebVoyager at launch
  • Coordinate-based action prediction · not DOM-aware
If you are a
Builder
  • ChatGPT Pro $200/month · no per-task billing (for now)
  • API access via Responses API · enterprise-oriented
  • Best for one-shot web tasks · reliability drops on complex flows
If you are a
Investor
  • Among the first mainstream computer-use agents · category leader
  • Drives Pro-tier upgrades for ChatGPT
  • Competes with Anthropic Computer Use and Google Project Mariner
If you are a
Curious · Normie
  • An AI that uses websites for you · like a remote assistant
  • Inside ChatGPT · you describe the task, it does it in a cloud browser
  • Stops for you on payments and logins
Gecko's take

Operator is the template every major lab will ship. Computer-use is the next agentic frontier after coding.

ChatGPT answers in text. Operator clicks, types, and navigates real websites in a cloud browser, which makes it far more autonomous.