AITE M1.2-Art63 v1.0 Reviewed 2026-04-06 Open Access
M1.2 The COMPEL Six-Stage Lifecycle
AITF · Foundations

Case Study — Anthropic Computer Use as a Controlled-Rollout Architecture



COMPEL Specialization — AITE-ATS: Agentic AI Systems Architect Expert Case Study 3 of 3


Why this case

When a model provider ships a capability whose blast radius is, by design, much larger than text generation, the provider’s disclosure choices and the deployer’s architectural choices interact. Anthropic’s 2024 release of Claude Computer Use — a capability allowing Claude to observe a screen, move a cursor, click, type, and navigate through applications on behalf of the user — is the clearest recent example.

The deliberateness of the rollout is itself the teaching material. Anthropic did not ship Computer Use as a default capability of Claude; it shipped it as a beta capability with explicit safety discussion, a recommended sandboxed deployment pattern, and framing as a public preview rather than a production-ready feature. Reading the disclosure and the recommended-deployment guidance together gives the architect a template for rolling out any similar capability inside the organisation: the AI-assisted browser agent, the RPA-replacing desktop agent, the cross-application coordinator.

This case study is not a product review. It is an analysis of the controlled-rollout posture and what that posture requires from the downstream architect.


The capability in brief

Claude Computer Use, at the provider level, gives a model the ability to observe the pixels of a screen and emit actions that a controlling harness translates into cursor movements, key presses, and clicks. The model does not directly drive hardware; a separate runtime receives the model’s action descriptions and executes them against a virtual screen (typically a container or VM). The feedback loop is: the runtime captures the screen, sends it to the model, the model reasons about what to do next, the model emits an action, the runtime executes it, repeat.

Architecturally this is a canonical agentic loop with a very wide tool surface — effectively the entire graphical interface of whatever applications run in the environment. This width is the capability’s power and its risk.
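The feedback loop described above can be sketched as code. This is an illustrative sketch only: the `Action`, `StubModel`, and `StubRuntime` names are hypothetical stand-ins for the real harness, not Anthropic's actual API surface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Action:
    kind: str                  # e.g. "click", "type", "done"
    payload: Optional[str] = None

def run_loop(model, runtime, task: str, max_steps: int = 50):
    """Observe the screen, let the model choose an action, execute, repeat."""
    for _ in range(max_steps):
        screenshot = runtime.capture_screen()         # runtime renders the VM display
        action = model.next_action(task, screenshot)  # model reasons over the pixels
        if action.kind == "done":
            return action.payload
        runtime.execute(action)                       # cursor move, keypress, click
    raise TimeoutError("step budget exhausted before task completion")

class StubRuntime:
    """Minimal fake runtime: serves numbered screens, records executed actions."""
    def __init__(self):
        self.frame, self.log = 0, []
    def capture_screen(self):
        return f"screen-{self.frame}"
    def execute(self, action):
        self.log.append(action.kind)
        self.frame += 1

class StubModel:
    """Clicks once, then declares the task complete."""
    def next_action(self, task, screenshot):
        return Action("click") if screenshot == "screen-0" else Action("done", "ok")
```

Note the `max_steps` cap: even in a sketch, the loop is bounded, because an unbounded observe-act loop is the runaway failure mode the later controls exist to catch.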

What Anthropic disclosed and what the disclosure implies

Three features of the disclosure are architecturally load-bearing.

1 — The beta framing

Anthropic released Computer Use as a beta, not as a general-availability feature. The framing carries architectural weight: deployers were explicitly told this capability is experimental, the model’s performance is variable, and mis-clicks, wrong actions, or goal drift are expected. The architect receiving this signal treats the capability as research-preview for internal use, not as a building block for customer-facing features.

2 — The sandboxing recommendation

Anthropic recommended running Computer Use in a sandboxed virtual environment — a container or VM with no access to the host filesystem, no outbound network egress beyond what the task required, and no persistent state. The recommendation is architecturally specific: the sandbox is not a suggestion for experienced deployers; it is the default posture.
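The default posture can be made explicit in configuration. A minimal, hypothetical encoding of that default-deny stance follows; the field names are illustrative, not a real Anthropic configuration schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxProfile:
    host_filesystem_access: bool = False   # no host filesystem by default
    persistent_state: bool = False         # environment is disposable between sessions
    allowed_egress: tuple = ()             # empty tuple: no outbound network

    def permit_egress(self, host: str) -> "SandboxProfile":
        """Widen egress explicitly, one named destination at a time."""
        return SandboxProfile(self.host_filesystem_access,
                              self.persistent_state,
                              self.allowed_egress + (host,))

DEFAULT = SandboxProfile()   # the recommended starting posture: everything closed
```

The frozen dataclass is the point of the sketch: the locked-down profile is immutable, and any widening produces a new, explicitly named profile rather than mutating the default.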

3 — Explicit acknowledgement of indirect injection

The disclosure acknowledged that an agent operating by reading screens is vulnerable to indirect prompt injection through content on those screens: a malicious webpage, a crafted document, or an attacker-controlled input field can steer the agent’s behaviour. The vector is the same one Lab 4 rehearses, applied to a capability where the attack surface is every pixel the agent sees.

What the disclosure does not absolve the deployer from

The provider’s disclosure sets a ceiling on liability for the provider and a floor on responsibility for the deployer. Three responsibilities the disclosure does not shift.

Autonomy classification

The deployer is responsible for classifying the autonomy of whatever they build on top of Computer Use. A Computer-Use-driven agent that operates unattended for hours is Level 4. One that proceeds per-action with user approval is Level 2. The capability does not dictate the level; the deployment does. Article 2 of this credential remains the architect’s reference.
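The point that the deployment, not the capability, dictates the level can be stated as a tiny classifier. This is a hypothetical sketch of the article's Level 2 vs. Level 4 distinction; the intermediate Level 3 branch is an assumption added for completeness, not drawn from the source.

```python
def autonomy_level(runs_unattended: bool, per_action_approval: bool) -> int:
    """Classify a Computer-Use deployment on the article's autonomy scale."""
    if per_action_approval:
        return 2   # user approves each action before execution
    if runs_unattended:
        return 4   # operates for hours without a human in the loop
    return 3       # assumed intermediate: supervised, but not per-action gated
```

The same underlying capability passes through this function twice with different arguments and comes out at different levels; that is the classification responsibility the deployer owns.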

Tool-scope design

Computer Use’s “tool” is the entire UI of the sandboxed environment. Within that sandbox the deployer still decides which applications are installed, which accounts are logged in where, which files are present, and which network destinations are reachable. A Computer-Use agent with a sandboxed browser and a single logged-in test account has a very different blast radius from one with a corporate SSO session, access to a production environment, and writeable shared drives.
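One way to operationalise the environment inventory is a pre-session audit that fails fast when the sandbox exceeds the stage's approved scope. A sketch, with hypothetical approved-scope entries:

```python
# Approved scope for the current stage; items here are illustrative examples.
APPROVED = {
    "applications": {"firefox"},
    "accounts": {"test-account@example.com"},
    "network_destinations": {"staging.example.com"},
}

def audit_inventory(inventory: dict) -> list:
    """Return a list of violations; an empty list means the session may start."""
    violations = []
    for key, approved in APPROVED.items():
        extras = set(inventory.get(key, [])) - approved  # anything beyond scope
        violations += sorted(f"{key}: {item} not approved" for item in extras)
    return violations
```

A corporate SSO session showing up in `accounts` is exactly the kind of finding this check exists to surface before the agent ever sees a screen.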

Observability and kill-switch

The disclosure does not provide the deployer’s observability stack. The architect must still ship the six SLIs from Lab 3, the replay tool that reconstructs a Computer-Use session from screenshots and action logs, and the kill-switch from Lab 5 that halts the session when triggers fire. The capability’s controlled-rollout posture at the provider level does not substitute for the deployer’s controlled-rollout posture at the application level.
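The trigger side of the kill-switch can be sketched as a threshold check over observed metrics. The SLI names below are placeholders for illustration, not the actual six SLIs from Lab 3.

```python
# Hypothetical trigger thresholds; real values come from the Lab 5 spec.
THRESHOLDS = {
    "actions_per_minute": 120,     # runaway-loop detector
    "blocked_egress_attempts": 3,  # repeated attempts to leave the sandbox
    "hitl_rejections": 2,          # user keeps refusing proposed actions
}

def should_halt(metrics: dict) -> bool:
    """True when any observed SLI meets or exceeds its trigger threshold."""
    return any(metrics.get(name, 0) >= limit
               for name, limit in THRESHOLDS.items())
```

The check is deliberately any-of, not all-of: a single tripped trigger halts the session, and the replay tool then reconstructs what happened from the screenshots and action logs.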

Reading the rollout as a staged-deployment template

The architect’s take-away is a template for rolling out any high-blast-radius capability, with Computer Use as the worked example. Five stages.

Stage 0 — Internal preview, tightest sandbox

The capability is enabled inside the organisation, in a hermetically sealed environment, for a small set of expert users. The sandbox is a VM with no network, no credentials, no data, and no ability to write outside a disposable area. The goal is not to do useful work but to learn the capability’s failure modes. The observability is full-fidelity recording — every screenshot, every action, every reasoning step. The rollout deliverable is a failure-mode catalogue.

Stage 1 — Bounded internal use

The sandbox opens slightly: one test account for one external service, a restricted set of websites, still no access to production or to sensitive internal systems. A small group of internal power users runs real tasks and records outcomes. The rollout deliverable is a measured task-completion rate and a measured incidence of the failure modes from Stage 0’s catalogue.

Stage 2 — Internal production for low-consequence tasks

The sandbox integrates with internal test environments only; the agent performs tasks whose worst-case outcome is a rollback of test-data state. HITL gates are present on any action the system cannot reverse. The rollout deliverable is an incident rate per task-hour, a red-team record from Lab 4 applied to this specific capability, and an updated kill-switch specification.

Stage 3 — Constrained customer-facing pilot

A narrow, explicitly-scoped customer-facing feature ships to a small cohort under an opt-in agreement. Every action that affects customer data is gated; every action that commits externally (payment, publication, submission) is pre-authorised. The rollout deliverable is a customer-feedback channel, a disclosure surface under EU AI Act Article 50 if applicable, and an SLO for HITL-gate latency (because the customer is watching the latency of approvals now).

Stage 4 — Broad availability within defined boundaries

The capability is available to named user populations for named task classes, with the full stack of controls in place: sandbox, guardrails, HITL where required, observability, replay, kill-switch, rehearsal cadence. The rollout deliverable is the ongoing operational posture — quarterly rehearsals, monthly incident reviews, continuous observability.

At no stage does the capability reach “ubiquitous internal use on sensitive systems without gating.” That destination is, for high-blast-radius capabilities, not a rollout stage; it is a category error.

Architectural decisions the architect owns

The template above implies a set of named decisions the architect should document per capability, per deployment stage.

| Decision | Stage where committed | Artefact |
| --- | --- | --- |
| Sandbox profile (VM vs. container, network, filesystem, capabilities dropped) | Stage 0 | Sandbox spec |
| Installed applications and logged-in accounts | Stage 1 | Environment inventory |
| Allowed websites or tool destinations | Stage 1 | Egress policy |
| HITL gates and their triggers | Stage 2 | Escalation matrix (Lab 1 pattern) |
| Observability SLIs and retention | Stage 2 | Observability plan (Lab 3 pattern) |
| Red-team evidence and residual risk | Stage 2 → Stage 3 | Red-team evidence pack (Lab 4 pattern) |
| Disclosure wording and placement | Stage 3 | User-facing disclosure spec |
| Kill-switch scope, mechanism, latency targets, rehearsal cadence | Stage 3 | Kill-switch spec (Lab 5 pattern) |
| Agent governance charter | Stage 3 | Template 1 of this credential |
| Ongoing operational posture | Stage 4 | Runbook, incident playbook, rehearsal schedule |

The decisions are cumulative. A stage cannot be entered until the prior stage’s decisions are documented and the prior stage’s deliverables pass.
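The cumulative gate can be expressed directly: Stage N opens only when every prior stage's deliverables have passed. A sketch, with deliverable names taken from the stages above:

```python
# Deliverables per stage, as named in the staged-deployment template.
STAGE_DELIVERABLES = {
    0: {"failure-mode catalogue"},
    1: {"task-completion rate", "failure-mode incidence"},
    2: {"incident rate per task-hour", "red-team record", "kill-switch spec"},
    3: {"customer-feedback channel", "disclosure surface", "HITL-gate latency SLO"},
}

def may_enter(stage: int, signed_off: set) -> bool:
    """A stage cannot be entered until all prior stages' deliverables pass."""
    required = set()
    for s in range(stage):           # accumulate everything owed so far
        required |= STAGE_DELIVERABLES.get(s, set())
    return required <= signed_off    # subset check: nothing outstanding
```

Skipping a stage is visible in this model as calling `may_enter` with deliverables missing from `signed_off`; the function says no, which is the protocol doing its job.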

Contrasting posture — deployer vs. provider

Anthropic’s controlled-rollout posture applies at the provider level: which customers get the capability, what the documentation says, how fast to iterate the capability. The deployer’s posture applies at the application level: which users get the capability, what the application promises them, how fast to broaden the scope.

The architect owns the deployer’s posture, not the provider’s. The temptation to treat the provider’s beta framing as a reason to delay governance work — “we’ll figure it out when it goes GA” — is the exact failure mode the Moffatt and Replit cases warn against. Deploy a beta capability with production-grade governance, or do not deploy it.

Lessons for the specialist

Lesson 1 — capability framing is architectural information

When a provider ships a capability as a beta with explicit safety warnings, the architect should read the framing as an explicit contract: the provider will iterate; the deployer will sandbox. Adopting the capability without the sandbox is accepting the downside of the beta framing without its offsetting posture.

Lesson 2 — the sandbox scales the risk down; the HITL matrix scales the consequence down

These are different controls and must both be present. A perfect sandbox does not help if the sandbox contains a production credential; a perfect HITL matrix does not help if the sandbox contains the user’s home directory. Computer-use-class capabilities make this duality visible because the sandbox is so explicit.

Lesson 3 — indirect injection is the class of attack to rehearse

Any capability that involves the agent observing external content — screens, webpages, emails, documents — is indirect-injection-exposed. The architect’s evidence pack from Lab 4 applies directly. Its absence is the deployment’s first exploitable gap.

Lesson 4 — staged rollout is a protocol, not a ceremony

The five stages above are not process theatre; each stage produces a deliverable the next stage depends on. An organisation that skips stages arrives at broad availability without the failure-mode catalogue, the red-team evidence, or the rehearsed runbook. It will, predictably, regenerate those artefacts through incidents.

Lesson 5 — the provider’s controlled-rollout posture is a template, not a ceiling

If the capability were to ship as GA tomorrow, the deployer’s staged-rollout posture would not change. The architect’s discipline is anchored to the capability’s blast radius, not to the provider’s marketing stage. The controlled-rollout template applies to any capability whose misuse is consequential — Computer Use, autonomous web browsing, tool-using agents with production credentials, RPA-replacing agents. The worked example happens to be the one with the clearest provider-side disclosure.

Sources

All characterisations of the capability and its rollout are drawn from the cited public materials. Where the record is silent, this case study says so and does not fill the silence with inference.