Claude Cowork Is Having a Moment. Your Attack Surface Is Too.

Anthropic just shipped Claude Cowork as a research preview – a “Claude Code-like” experience for general productivity work, delivered through the Claude macOS app and aimed at making Claude feel less like a chatbot and more like a real coworker.

That framing is…accurate. And it’s the security problem.

When an AI crosses the line from “talking about work” to “doing work” – reading folders, editing files, and taking actions inside third‑party tools – the threat model changes fast. Anthropic even says the quiet part out loud: Cowork can take “destructive actions” (including deleting files) if you aren’t specific, and it can be affected by prompt injection.

So let’s treat this the way we’d treat any new “power tool” in the enterprise – because “capabilities first, controls second” is how you end up with incident response playbooks in your Slack bookmarks.

At Chiri, our AI security stance is simple: if you can’t answer the “must haves” with evidence, you’re carrying too much risk.

Claude Cowork is the perfect case study for why.


Cowork isn’t “another AI feature.” It’s a permission model.

From the public descriptions, Cowork is built around three big capabilities:

  1. Local access: you can grant Claude access to folders on your Mac so it can work with your files (organize, extract, draft, create).
  2. Tool access: you can connect Claude to external services (connectors) so it can read and write in tools like productivity, business, automation, and developer platforms.
  3. Agentic execution: it can run multi‑step tasks with less hand‑holding, i.e., “do the thing,” not “tell me how to do the thing.”

That’s not a UI upgrade. That’s a new operator sitting in your environment.

And operators need controls.


The real risk: your AI coworker is a “confused deputy” with your credentials

The UK’s National Cyber Security Centre (NCSC) has been blunt about where this is heading: prompt injection should be treated less like “SQL injection” and more like a confused deputy problem – where a privileged system can be coerced into doing something on an attacker’s behalf.

That matters because Cowork (by design) becomes a privileged system:

  • It can see things you can see (folders, docs, screenshots).
  • It can act where you can act (connectors with your account permissions).
  • It can be exposed to content you should not trust (web pages, documents, messages, tickets). And Anthropic explicitly calls out prompt injection as a live risk.

Simon Willison’s “lethal trifecta” for agents is the cleanest way to remember the danger: private data + untrusted content + external communication. Combine all three and you get exploitation opportunities, even if the model is “pretty good” at resisting attacks.

Cowork’s entire product value is moving you closer to that trifecta.
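
If you want to make the trifecta concrete in your own agent tooling, a minimal policy gate might look like the sketch below. Everything here – the `SessionState` fields and the `allow_external_send` check – is hypothetical and not Cowork’s internals; the point is that “all three at once” should be an explicit decision, not a default.

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    """Hypothetical per-session flags an agent harness could track."""
    touched_private_data: bool = False        # read a granted folder, connector doc, etc.
    ingested_untrusted_content: bool = False  # web page, inbound ticket, random PDF
    sources: list = field(default_factory=list)

def allow_external_send(state: SessionState) -> bool:
    """Block the lethal trifecta: private data + untrusted content + external communication.

    If the session has both read private data and ingested untrusted content, any
    externally visible action (web request, message, share) escalates to a human.
    """
    if state.touched_private_data and state.ingested_untrusted_content:
        return False  # require human approval instead of sending
    return True

# Example: a session that summarized a granted folder and then browsed a web page.
state = SessionState(touched_private_data=True, ingested_untrusted_content=True,
                     sources=["~/Cowork-Workspace/report.docx", "https://example.com/page"])
assert allow_external_send(state) is False
```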


The five security risks that matter (and why they’re easy to miss)

Risk #1: Over‑permissioning becomes the default “setup step”

Cowork works when you give it access. That’s the point.

But here’s how this goes in real life:

  • A user selects “Documents” instead of “/Cowork‑Workspace”
  • Or connects a tool with broad org permissions (because that’s what their account already has)
  • Or adds “just one more connector” because it’s convenient

Anthropic’s own connectors guidance is clear: when you connect a tool, you’re granting Claude permission to access and potentially modify data in that service based on your account permissions.

That’s the same mistake companies made for a decade with OAuth apps and browser extensions – except now the “app” is an agent that can chain actions together.

Translation: the “blast radius” is not Cowork. The blast radius is your identity.
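
One way to make “grant the workspace, not the home directory” enforceable rather than aspirational is to validate any requested folder grant against an allowlist before it ever reaches the agent. The helper below is a hypothetical illustration (Cowork doesn’t expose such a hook); the pattern is what matters.

```python
from pathlib import Path

# Hypothetical policy: the only roots an agent may ever be granted.
ALLOWED_ROOTS = [Path.home() / "Cowork-Workspace"]

# Broad locations that should never be granted wholesale.
DENIED_ROOTS = [Path.home() / "Documents", Path.home() / "Desktop", Path.home() / "Downloads"]

def grant_is_acceptable(requested: str) -> bool:
    """Return True only if the requested folder sits inside an approved workspace root."""
    path = Path(requested).expanduser().resolve()
    if any(path.is_relative_to(d) or d.is_relative_to(path) for d in DENIED_ROOTS):
        return False  # granting Documents/Desktop/Downloads (or a parent of them) is over-permissioning
    return any(path.is_relative_to(root) for root in ALLOWED_ROOTS)

print(grant_is_acceptable("~/Cowork-Workspace/q3-report"))  # True
print(grant_is_acceptable("~/Documents"))                   # False
print(grant_is_acceptable("~"))                             # False: parent of everything
```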


Risk #2: Prompt injection turns into real‑world data loss and exfil

OWASP ranks prompt injection as the #1 risk category for LLM systems for a reason: it’s not just “bad answers.” It’s manipulated behavior.

Anthropic itself says prompt injection remains a major security challenge for agents operating on untrusted content, and explicitly notes it’s “far from a solved problem,” even with improved robustness and safeguards.

Now connect that to Cowork:

  • Cowork can read untrusted content (web, tickets, docs)
  • Cowork can access private data (your folder + connectors)
  • Cowork can take actions (create/edit/delete/share)

Anthropic warns that unclear instructions can lead Claude to delete files, and also warns about prompt injection risk.

And this is not theoretical. SafeBreach researchers demonstrated “promptware” attacks against Gemini for Workspace – using something as simple as a malicious calendar invite to trigger harmful behavior and sensitive data exposure.

Different vendor, same pattern: an assistant that reads untrusted content and is wired into privileged workflows becomes a new exploitation layer.
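
A common mitigation for this class of attack is treating every outbound tool call as an egress point and filtering it, rather than trusting the model to refuse. The sketch below is deliberately simplified and entirely hypothetical – a real deployment would pair destination allowlists with human review – but it shows the shape of “output handling” as a control.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of domains the agent is permitted to contact.
EGRESS_ALLOWLIST = {"api.example-corp.com", "calendar.example-corp.com"}

# Crude signals that a URL or body is smuggling data out (long base64 or hex blobs).
SUSPICIOUS_PAYLOAD = re.compile(r"[A-Za-z0-9+/=]{200,}|[0-9a-f]{64,}")

def egress_allowed(tool_name: str, url: str, body: str) -> bool:
    """Gate outbound tool calls: unknown destinations and data-shaped payloads get blocked."""
    host = urlparse(url).hostname or ""
    if host not in EGRESS_ALLOWLIST:
        return False  # destination is not on the approved list
    if SUSPICIOUS_PAYLOAD.search(url) or SUSPICIOUS_PAYLOAD.search(body):
        return False  # looks like bulk data leaving via a "harmless" call
    return True

print(egress_allowed("web_fetch", "https://attacker.example/collect?d=" + "A" * 300, ""))          # False
print(egress_allowed("web_fetch", "https://api.example-corp.com/v1/tasks", '{"title": "Q3 plan"}'))  # True
```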


Risk #3: “Consumer plan” data handling + enterprise files is a dangerous mix

Cowork is positioned (right now) as a Max‑subscriber feature on the macOS app.

That matters because Anthropic’s consumer plans (Free/Pro/Max) have a data‑use decision point and different retention behavior depending on whether you allow data usage for model improvement.

From Anthropic’s own update:

  • If you allow data to be used for model training, retention can extend (e.g., up to five years for new/resumed interactions in that policy update).
  • If you do not allow it, retention is described as continuing under their existing 30‑day period for those interactions.

There are legitimate reasons vendors do this. The security point is simpler:

If an employee points a consumer‑tier tool at sensitive corp folders, you’ve just created a shadow data pipeline that most orgs aren’t tracking, classifying, or governing.

Which brings us to…


Risk #4: Auditability lags behind autonomy

When something goes wrong with an agent, the first question is not “why did the model do that?” It’s:

  • What exactly did it touch?
  • What did it send out?
  • What did it change?
  • Can we prove it?
  • Can we roll it back?

Most agent experiences are still catching up on:

  • Durable, queryable activity logs
  • Per‑action approvals for risky operations
  • Forensic retention of prompts, tool calls, retrieval snapshots, versions

And that’s not a knock. It’s a maturity curve.

But if you’re deploying this into regulated workflows, “it asked before doing something significant” is not an audit trail.
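
If the product doesn’t give you that audit trail yet, you can still build one around the agent by wrapping every tool invocation. This is a generic pattern, not a Cowork API – the tool names and log format below are made up – but the idea is simple: log every call durably, and force a human decision on anything tagged destructive.

```python
import hashlib, json, time
from pathlib import Path

AUDIT_LOG = Path("agent_audit.jsonl")  # durable, append-only, queryable later
DESTRUCTIVE = {"delete_file", "send_email", "share_externally"}  # hypothetical risky tool names

def _log(record: dict) -> None:
    with AUDIT_LOG.open("a") as f:
        f.write(json.dumps(record, default=str) + "\n")

def run_tool(tool_name: str, args: dict, execute) -> dict:
    """Wrap a tool call with durable logging plus per-action approval for risky operations."""
    record = {
        "ts": time.time(),
        "tool": tool_name,
        "args": args,
        "args_hash": hashlib.sha256(json.dumps(args, sort_keys=True).encode()).hexdigest(),
    }
    if tool_name in DESTRUCTIVE:
        approved = input(f"Approve {tool_name}({args})? [y/N] ").strip().lower() == "y"
        record["approved"] = approved
        if not approved:
            record["outcome"] = "blocked"
            _log(record)
            return {"status": "blocked"}
    record["outcome"] = execute(**args)  # call the real tool implementation
    _log(record)
    return {"status": "ok", "result": record["outcome"]}

# Example: run_tool("delete_file", {"path": "old-draft.md"}, execute=lambda path: f"deleted {path}")
```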


Risk #5: Connectors create a supply chain – and custom connectors raise the stakes

The connectors directory is explicitly designed to extend Claude’s capabilities with both local and remote connectors, including automation and business tools, plus custom connectors.

Two key security realities:

  • A connector is effectively code + permissions + data path
  • Each connector expands your attack surface and dependency chain

OWASP’s LLM risk work keeps making the same point: injection-style attacks succeed because instructions and data get mixed together, and weak output/tool handling becomes the downstream exploit path.

Anthropic’s help center explicitly warns that custom connectors may connect to services not verified by Anthropic and should only be connected to trusted organizations with carefully reviewed authentication permissions.

Again: not a flaw. Just a reminder that connectors are a supply chain, and supply chains need governance.
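
Treating connectors as a supply chain means keeping an approved inventory the same way you would for packages or browser extensions. The manifest below is a made-up format (there’s no official schema for this); the point is that “who approved it, what scopes it gets, and when it’s re-reviewed” should live somewhere reviewable.

```python
from datetime import date

# Hypothetical connector governance inventory: who approved what, with which scopes.
APPROVED_CONNECTORS = {
    "example-project-tracker": {
        "scopes": ["issues:read", "issues:write"],
        "owner": "it-security@yourcorp.example",
        "reviewed": date(2025, 6, 1),
        "custom": False,
    },
    # Custom connectors get a higher bar: named owner, review date, and an expiry.
    "internal-billing-bridge": {
        "scopes": ["invoices:read"],
        "owner": "platform-team@yourcorp.example",
        "reviewed": date(2025, 5, 12),
        "custom": True,
        "review_expires": date(2025, 11, 12),
    },
}

def connector_allowed(name: str, requested_scopes: list, today: date) -> bool:
    """A connector is usable only if it is inventoried, in scope, and (if custom) not past review."""
    entry = APPROVED_CONNECTORS.get(name)
    if entry is None:
        return False
    if not set(requested_scopes) <= set(entry["scopes"]):
        return False  # requesting more than it was approved for
    if entry.get("custom") and today > entry.get("review_expires", today):
        return False  # custom connectors must be periodically re-reviewed
    return True
```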


The Chiri lens: “AI security must‑haves” applied to Cowork

Our CTO Mark Aklian’s “critical must haves” checklist is the right frame here because it forces the uncomfortable questions early – before the pilot becomes production.

If you’re evaluating Cowork (or any “all‑access agent”), you should be able to answer, with evidence:

  • Architecture transparency: Where does the model run? Which agents/tools are active? Any third‑party AI dependencies?
  • Data flow & retention: Are prompts/outputs logged, for how long, and are they used for training/evals? What controls exist?
  • Guardrails against abuse: Prompt injection defenses, retrieval allow‑lists, output handling, and abstention behaviors.
  • Tool/agent safety: Sandboxing, controlled egress, scoped credentials, least‑privilege function calling (a sketch follows this list).
  • Governance & change management: Version control, approvals, rollback, audit trails on safety rules.
  • Testing & red‑teaming: Not once. Continuous. With real playbooks.
  • AI incident response: Playbooks + forensics retention designed for injection and exfil scenarios.
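
For the tool/agent safety item in particular, least‑privilege function calling is something you can actually check: for a given task, the agent should only ever see the narrowest tool set – and therefore the narrowest credentials – that task needs. A minimal sketch, with hypothetical tool and task names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tool:
    name: str
    scopes: frozenset  # credential scopes this tool needs in order to run

# Hypothetical registry of everything an agent *could* do...
ALL_TOOLS = {
    "read_workspace_file": Tool("read_workspace_file", frozenset({"files:read"})),
    "write_workspace_file": Tool("write_workspace_file", frozenset({"files:write"})),
    "send_message": Tool("send_message", frozenset({"chat:write"})),
}

# ...and per-task profiles defining what it *may* do. Summarization never needs write or send.
TASK_PROFILES = {
    "summarize_folder": {"read_workspace_file"},
    "draft_and_file_report": {"read_workspace_file", "write_workspace_file"},
}

def tools_for_task(task: str) -> list:
    """Expose only the tools (and therefore scopes/credentials) a given task actually needs."""
    return [ALL_TOOLS[name] for name in TASK_PROFILES.get(task, set())]

print([t.name for t in tools_for_task("summarize_folder")])  # ['read_workspace_file']
```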

This is why Chiri keeps hammering the same message: you don’t “bolt on” AI security later. It’s circuit breakers, not seatbelts.


If you’re going to use Cowork, here’s how not to become a case study

This is not vendor‑specific advice. It’s “agent hygiene.”

For individuals and small teams

  • Create a dedicated, non‑sensitive workspace folder and only grant that folder (not “Documents,” not “Desktop,” not “Downloads”).
  • Start read‑only whenever possible (or at least do a dry run: ask it to propose changes before applying; see the sketch after this list).
  • Connect the minimum set of tools you need. Remember: connectors inherit your account permissions.
  • Treat untrusted content as hostile (random PDFs, tickets, web pages). Prompt injection is specifically a risk for agents operating on content they can’t trust.
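
The “dry run” advice above can be made mechanical: have the agent emit its proposed changes as a plan, and apply nothing until a human has looked at it. A minimal, hypothetical version (not a Cowork feature):

```python
from dataclasses import dataclass

@dataclass
class ProposedChange:
    """One file operation the agent wants to make, held until a human approves the plan."""
    action: str   # "create" | "edit" | "delete"
    path: str
    summary: str

def review_and_apply(proposals: list, apply_fn) -> None:
    """Print the full plan first, flag destructive operations, and only then apply."""
    for i, p in enumerate(proposals, 1):
        flag = "  <-- destructive" if p.action == "delete" else ""
        print(f"{i}. {p.action.upper():6} {p.path}: {p.summary}{flag}")
    if input("Apply all of the above? [y/N] ").strip().lower() == "y":
        for p in proposals:
            apply_fn(p)

# Example: the agent proposes edits; nothing touches disk until you say so.
plan = [
    ProposedChange("edit", "Cowork-Workspace/q3-report.md", "tighten executive summary"),
    ProposedChange("delete", "Cowork-Workspace/old-draft.md", "remove superseded draft"),
]
# review_and_apply(plan, apply_fn=lambda p: print("applying", p.path))
```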

For enterprises

  • Do not let “Max on a personal Mac” become the default deployment model.
  • Apply standard identity and device protections (Zero Trust basics): MFA, conditional access, device management, and data protection – because AI tools become “just another app” with privileged access.
  • Require a governance layer: approved connectors, scoped accounts, least privilege, logging, and review gates for high-impact actions.
  • Run a red-team playbook that explicitly targets: prompt injection, data exfil paths, destructive actions, and connector abuse.

If you want an outside, non-vendor baseline: CISA/NSA/FBI guidance emphasizes that data security is foundational to trustworthy AI outcomes and should be secured across the AI lifecycle. And NIST’s AI RMF + GenAI profile are solid reference points for mapping governance to controls.


Bottom line

We’re in the early innings of “all‑access AI agents.”

Claude Cowork is a glimpse of where productivity is going. It’s also a glimpse of where incidents are going.

The question isn’t “Is Cowork secure?”

The question is:

Can your organization prove it has the controls to make an agent with file access + connector access + autonomy safe enough for your workflows?

If the answer is “not sure,” start with the must‑haves checklist – and demand evidence, not vibes.

This is just the start of making sure you’re thinking about security with agents in the moment.

For the long term, we’re building Chiri Brain, which solves not only the agentic problem but much more. Check it out at https://chiri.ai/chiri-brain
