Codex Computer Use: What Desktop Control Means for Teams

TL;DR: OpenAI Codex can now control your desktop autonomously. What it does, the security surface it creates, and what CTOs need to decide before deploying.

On April 17, 2026, OpenAI updated Codex with background computer use on macOS. Codex can now see your screen, move its own cursor, click buttons, and type text, operating apps just like a human, but autonomously in the background.

This is not screen sharing or remote assistance. It is an AI agent with independent desktop control. For developers, it opens up workflows that were previously impossible to automate, interacting with apps that have no API, pasting data between applications, and navigating multi-step UI workflows. For CTOs, it creates a security and governance surface that most organisations have never had to manage before.

What Computer Use Actually Does

Codex's computer use capability works by interpreting your screen visually and executing mouse and keyboard actions through its own cursor. It operates in the background, so your own mouse and keyboard remain active while Codex works alongside you.

What it can do today:

Navigate desktop apps. Open applications, click through menus, fill in forms, and interact with any UI element on screen.
Move data between apps. Copy a value from a spreadsheet, paste it into a web form, take a screenshot of the result, and log it, all without an API.
Interact with internal tools. Admin panels, CRM systems, internal dashboards, and enterprise apps that have no API integration are now accessible to the agent.
Execute multi-step workflows. A sequence like "open Figma, export the latest design as PNG, open Slack, upload it to the #design channel, and post a status update" can run as a single instruction.

What it cannot do (yet):

Access apps that require authentication it does not have
Operate on Windows or Linux (macOS only at launch)
Run without the Codex desktop app open on the machine
Bypass system-level permission prompts (accessibility permissions required)

The Security Surface This Creates

Computer use introduces a category of risk that AI coding tools have never created before: ambient desktop access. An agent with coding capabilities can read and write code. An agent with desktop control can read and interact with everything on your screen.

Five Questions Every CTO Should Answer Before Enabling

1. What can the agent see?

When computer use is active, Codex can see the contents of any application window on the user's screen. If a developer has a password manager, internal document, or customer database open in another window, the agent can potentially read it.

OpenAI states that UI interpretation uses local processing where possible, but "where possible" is not a guarantee. Until the detailed permission model is published, assume that anything on screen is in scope.

2. What can the agent click?

Codex operates with its own cursor. It can click any button, link, or UI element that a human could click. This includes "Delete", "Deploy", "Approve", and "Send" buttons. The human-in-the-loop verification triggers for actions that "impact system stability or data privacy," but the criteria for what triggers verification are not yet documented.

3. Who is accountable for the agent's actions?

If Codex clicks "Approve" on a PR, sends a Slack message, or submits a form in an internal tool, who approved that action? The developer who set up the automation? The CTO who enabled computer use? The agent itself? Accountability chains for autonomous desktop actions are not established in most organisations.

4. How do you audit what happened?

Traditional audit trails assume human actions. When Codex fills in a form or clicks through a workflow, is that logged? Where? In what format? Can your compliance team reconstruct what the agent did on a specific screen at a specific time?

5. Does this comply with your data handling policies?

In European jurisdictions, GDPR and the EU AI Act impose obligations on how AI systems process personal data and interact with users. Desktop control that can see customer records, employee data, or financial information may trigger compliance requirements that your current AI governance does not cover.

What Developers Can Do With It Right Now

Despite the governance questions, computer use is genuinely useful for workflows that previously required manual UI interaction:

Developer-Adjacent Tasks

Cross-app data movement. Export test results from one tool, import into a reporting dashboard, without writing an integration.
UI testing assistance. Navigate a staging environment, click through user flows, screenshot results for QA documentation.
Design-to-code feedback. Open Figma, see the design, open your code editor, make adjustments, screenshot the rendered result for comparison.

Where It Breaks Down

Authentication boundaries. Apps behind SSO or MFA will block the agent unless credentials are pre-loaded, which creates its own security issue.
Rate and context limits. Complex multi-step workflows with many screen transitions can exceed the agent's visual context window.
Unpredictable UI. Dynamic interfaces, modals, loading states, and non-standard UI components can confuse the visual interpretation layer.

How This Compares to Claude Code

Claude Code does not have computer use. The comparison:

Capability	Claude Code	Codex
Code editing	Terminal + file editor	Terminal + file editor
Desktop control	No	Yes (macOS)
Scheduled automation	Routines (cloud)	Automations (local + cloud)
Plugin ecosystem	MCP servers (3000+)	90+ plugins + computer use
Where it runs	Local + cloud (Routines)	Local (computer use) + cloud
Security model	Repo-scoped, explicit permissions	Desktop-scoped, visual access

Claude Code's approach is narrower but more governable. Codex's approach is broader but harder to audit. For teams with strict AI security posture requirements, Claude Code's repo-scoped model is easier to approve. For teams that need cross-app automation, Codex's computer use is the only option that does not require building custom integrations.

Frequently Asked Questions

Can Codex computer use access my passwords?

If a password is visible on screen (e.g., in a password manager window), the agent can potentially see it. Keep sensitive applications closed or minimised while computer use is active. Use a dedicated desktop user or virtual desktop for agent sessions if your organisation requires strict separation.

Does computer use work with all macOS apps?

It works with any app that renders standard UI elements. Apps with heavy custom rendering (games, some creative tools), DRM-protected content, and apps that block accessibility APIs may not work reliably.

Can I limit what Codex can see or click?

Not yet at a granular level. The current model is all-or-nothing: when computer use is enabled, the agent can see and interact with everything on the active desktop. Finer-grained permission controls are expected but not yet available.

Should I enable computer use for my team?

Only after your organisation has answered the five questions above (what can it see, click, who is accountable, how do you audit, does it comply). If you cannot answer all five, do not enable it yet. If you can, start with a pilot: one developer, one workflow, documented results.

Decide Whether Computer Use Is Right for Your Team

Desktop control is a powerful capability with a governance cost. If you are evaluating whether to enable Codex computer use for your engineering team, the decision should be informed by your current security posture, not just the feature's potential.

Our AI Readiness Assessment evaluates whether your governance framework is ready for desktop-level agent capabilities, and identifies the gaps to close first.

If you need help designing the approval and audit process for computer use, our AI Consulting services can help.

Codex Computer Use: What Desktop Control Means for Developers and Why Your CTO Should Care

What Computer Use Actually Does

The Security Surface This Creates

Five Questions Every CTO Should Answer Before Enabling