Anthropic’s Claude Sonnet 5: Powerful Agentic AI, Poor Cybersecurity

I was halfway through a call when the client dropped the question: “Are they trying to sneak a risky model past regulators?” It hung over the room, and nobody seemed sure whether to unmute. I told them to hold that thought, because Sonnet 5 is simpler, cheaper, and deliberately less curious about digital locks than the models that came before it.

You and I have watched these product announcements enough to sense the choreography: performance, price, and a carefully worded list of what the model won’t do. Anthropic’s Claude Sonnet 5 fits that pattern, and it arrives at a moment when every token you burn costs real money and real scrutiny.

At my desk last week, I opened Anthropic’s blog and started tallying the claims.

Sonnet 5 promises a lot without demanding a king’s ransom. Accessed through Claude Code, it costs $2 per million input tokens (€2) and $10 per million output tokens (€9). That’s cheaper than Anthropic’s more capable Opus 4.8, and far less than what many agent workloads would have cost three months ago.

In practice, Sonnet 5 can plan, call tools like browsers and terminals, and run with a degree of autonomy that would have required larger models not long ago. That makes it a pragmatic option for teams that want agents that can do the heavy lifting without bankrupting engineering budgets.

How much does Claude Sonnet 5 cost?

Short answer: pennies compared to prior-generation agents once you scale. Long answer: if your workflows are agentic (chained API calls, long histories, lots of tool use), Sonnet 5’s per-token savings add up. You’ll see the math in every invoice, and product leaders at companies using Claude’s free, Pro, Max, Team, or Enterprise tiers will be doing that math now.
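If you want to run that math yourself, here is a minimal sketch. The two prices are the ones quoted above; the workload figures (tokens per run, runs per month) and the function name are hypothetical placeholders, so swap in your own numbers.

```python
# Prices quoted in this article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 2.00
OUTPUT_USD_PER_MILLION = 10.00


def run_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one agent run at the quoted per-token prices."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )


# Hypothetical workload (an assumption, not a measurement): each run
# re-reads a long history and tool output, then writes a short result.
per_run = run_cost_usd(input_tokens=400_000, output_tokens=20_000)
monthly = per_run * 5_000  # assumed 5,000 runs per month

print(f"per run: ${per_run:.2f}, per month: ${monthly:,.2f}")
```

In this made-up workload, input tokens account for most of the bill even though they are the cheaper kind, simply because an agent that keeps re-reading its own history consumes far more input than it produces output.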

On the day the government asked Anthropic to take models offline, the hallway conversations went quiet.

The public line from the administration centered on cybersecurity. Anthropic had earlier flagged two models—Mythos 5 and Fable 5—as powerful enough that the company limited their release; the government then requested both be pulled. Mythos, in particular, reportedly highlighted vulnerabilities in systems like the NSA’s defensive stack—not by breaking into them, but by pointing out gaps. That wasn’t theoretical showmanship; it was a spotlight on fragile software.

Anthropic’s new messaging about Sonnet 5 is explicit: it “shows substantially poorer performance” on cybersecurity tasks than Opus 4.8 or Mythos 5. They even say they didn’t deliberately train Sonnet 5 for cyber offense, and that the model’s limited exploit-writing success was likely a side effect of general improvements rather than focused instruction.

The political theater matters. After Amazon CEO Andy Jassy suggested a jailbreak was possible for Fable, regulators began to view certain capabilities not as product features but as national risks. Anthropic appears to be steering Sonnet 5 away from that collision course—both by architecture and by message.

Is Sonnet 5 safe for businesses?

If you ask CTOs, the answer will split by appetite for risk. For most enterprise workflows—automation, summaries, tool orchestration—Sonnet 5 offers a tidy cost-performance ratio. If you want a model to abuse as a vulnerability scanner, it’s not your friend. Anthropic priced and positioned Sonnet 5 so it’s useful, not weaponized.

I walked into a security team meeting this month and watched a developer shrug at the idea of handing certain scans to an LLM.

That shrug is telling. Cybersecurity tasks are binary in a way: either a tool finds a flaw that can be exploited, or it doesn’t. Anthropic’s public position is that Sonnet 5 is less likely, by design, to find dangerous flaws. Think of it as a Swiss Army knife with one blade left blunt: plenty of tools, but one they don’t want you cutting with.

For customers, that trade-off is explicit. You get cheaper, more capable agentic behavior for product tasks. You lose the kind of aggressive vulnerability-detection that made Mythos both feared and regulated. Companies that need the latter will have to keep tighter, narrower testbeds or stick with partners who can responsibly run offensive security tooling under controlled conditions.

At a conference panel last month, the audience asked whether price cuts change the power balance between Anthropic and OpenAI.

Both companies have been nudged toward lower prices as agents proliferate. OpenAI reportedly considered dramatic cuts to retain users during this surge. Anthropic’s Sonnet 5 is a strategic play: deliver the agentic capabilities people want at a cost customers can sustain, while signaling to regulators, the NSA, and large customers that it’s not shipping a cyber Swiss Army saw.

This strategy is political as much as commercial. Anthropic wants to be the safety-first voice in the industry; that reputation is its shield. By making Sonnet 5 intentionally weak at cyber offense, the company buys breathing room with federal actors and keeps product momentum with customers.

Sonnet 5 behaves like a public park with a fenced playground: useful and visible, but with fixed boundaries. It is also like a smoke alarm that only chirps when the kitchen’s on fire, not when someone leaves a window open.

You should care because these choices shape who builds your tools and who audits them. When a model is deliberately less capable in one domain, it shifts responsibility elsewhere: to vendors, security teams, and regulators. I’ll tell you how to think about that trade-off if you’re buying or embedding agents.

Anthropic wants you to take two things away: Sonnet 5 is cheaper and competent for agentic work, and it’s been neutered on cyber offense by design and perhaps by politics. Those are fair selling points for busy product teams. They’re also a signal to the market about what kinds of AI the company is willing to risk shipping.

If Anthropic can sell an agent that’s cheaper and intentionally weak at cybersecurity, are you safer—or just outsourcing the risk?