I was staring at a $200 invoice the other night and felt my throat tighten: a single long-running build had chewed through months of credits. You might have had the same small panic—one token here, another charge there—and suddenly the numbers stop making sense. That moment is where most of us stop scrolling and start worrying.
In a product Slack channel this week, developers were arguing about exploding bills.
How Much Does It (Really) Cost to Use Claude Fable, GPT-5.5, and Gemini 3.5 Flash?
I’ll walk you through the prices, the tricks, and the traps. You’ll learn where a million tokens actually lands you, which plans hide costs, and which services are worth using for simple chores versus long-running automation. Read this like a short field guide: grab what you need and decide.
At my desk I watched an Anthropic blog post land and a dozen DMs light up.
Fable 5
You’ve probably seen Anthropic’s new Fable 5, the streamlined sibling to the secretive Mythos, and noticed the sticker shock. Anthropic is shifting Fable 5 toward pay-as-you-go on June 23, which means subscription token allowances will pause for most users until capacity improves.
How much does Claude Fable 5 cost per token?
Short answer: Fable 5 charges $10 per million input tokens (€9) and $50 per million output tokens (€47). If you’re on a Claude Max plan—say the Max 5x at $100/month (€93)—you’ll keep paying the same monthly fee but you’ll burn your token allotment faster because Fable consumes more tokens than older models like Opus 4.8.
Concrete math: feed Fable 10 million input tokens and 5 million output tokens and you’re looking at $100 for input plus $250 for output, or $350 (€326) in total. If you’re only using it for emails or recipes, the bill looks small; using Fable for long-running code generation or autonomous agents will cost far more. Using Fable to answer simple chats is like driving a McLaren to the corner shop: you’re paying for performance you won’t use.
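If you want to rerun that math with your own numbers, here is a minimal Python sketch. The rates are the ones quoted above; the function name `token_cost` is mine, not anything from Anthropic's tooling.

```python
# Fable 5 rates quoted in this article, in USD per million tokens.
FABLE_INPUT_PER_M = 10.00
FABLE_OUTPUT_PER_M = 50.00


def token_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Return the USD cost of a job from its token counts and per-million rates."""
    input_cost = input_tokens / 1_000_000 * input_per_m
    output_cost = output_tokens / 1_000_000 * output_per_m
    return input_cost + output_cost


# The worked example: 10M input tokens and 5M output tokens.
cost = token_cost(10_000_000, 5_000_000, FABLE_INPUT_PER_M, FABLE_OUTPUT_PER_M)
print(f"${cost:,.2f}")  # $350.00
```

Notice that output tokens drive the bill: half as many tokens as the input, but two and a half times the cost.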
Anthropic promises to restore subscription allowances “when sufficient capacity allows us to do so.” What they haven’t clarified is whether unused tokens in paid plans will survive the change on June 23. I asked Anthropic and I’ll update when they reply.
On April launch day the office lit up with chatter about raw speed and token efficiency.
GPT-5.5 Pro
GPT-5.5 Pro powers ChatGPT’s higher tiers and is available on OpenAI’s Pro plan and business/enterprise offerings. For many workloads it’s a cheaper option than Fable 5, particularly when developers use the API.
How expensive is GPT-5.5 Pro for developers?
Via the API, OpenAI charges $5 per million input tokens (€5) and $30 per million output tokens (€28). OpenAI’s Pro consumer tier is $200/month (€186); Business is $30/user/month (€28) and Enterprise is negotiated. There’s also a batch option priced 50% lower that groups similar requests to reduce compute. It’s useful for bulk tasks, but responses can be slower.
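To see what the batch discount does to a bill, here is a rough Python sketch using the API rates above. One assumption to flag: I apply the 50% discount equally to input and output tokens, which the pricing summary here doesn't spell out. The name `api_cost` is illustrative only.

```python
# GPT-5.5 Pro API rates quoted in this article, in USD per million tokens.
GPT_INPUT_PER_M = 5.00
GPT_OUTPUT_PER_M = 30.00
# Assumption: the "50% cheaper" batch rate applies to input and output alike.
BATCH_DISCOUNT = 0.50


def api_cost(input_tokens, output_tokens, batch=False):
    """Return the USD cost of an API job, optionally at the batch rate."""
    cost = (input_tokens / 1_000_000 * GPT_INPUT_PER_M
            + output_tokens / 1_000_000 * GPT_OUTPUT_PER_M)
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


standard = api_cost(10_000_000, 5_000_000)
batched = api_cost(10_000_000, 5_000_000, batch=True)
print(f"standard: ${standard:,.2f}")  # standard: $200.00
print(f"batch:    ${batched:,.2f}")   # batch:    $100.00
```

On that 10M-in, 5M-out workload, the standard API bill happens to match one month of the $200 Pro tier, so the batch rate is where the savings show up.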
Reports in the Wall Street Journal say OpenAI has considered significant token-price cuts to hold market share against Anthropic. If OpenAI follows through, developers who send a high volume of small requests could see much lower bills.
At a hackathon I watched a team swap to Google’s free tier when costs spiked.
Gemini 3.5 Flash
Google positioned Gemini 3.5 Flash as a blend of speed and agent capability. It ships with a usable free tier (limits apply) and a very competitive API price for developers.
Is Gemini 3.5 Flash the cheapest option?
Yes, at current published rates. Gemini 3.5 Flash costs $1.50 per million input tokens (€1) and $9 per million output tokens (€8). That makes it by far the least expensive among these three models for API-heavy development.
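For a side-by-side view, this small Python sketch prices the same workload (10 million input tokens, 5 million output tokens) at the three API rates quoted in this article. The `RATES` table and `workload_cost` helper are my own names, and the figures ignore free tiers, batch discounts, and subscriptions.

```python
# (input, output) rates in USD per million tokens, as quoted in this article.
RATES = {
    "Claude Fable 5": (10.00, 50.00),
    "GPT-5.5 Pro": (5.00, 30.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload on the given model."""
    input_rate, output_rate = RATES[model]
    return (input_tokens / 1_000_000 * input_rate
            + output_tokens / 1_000_000 * output_rate)


costs = {model: workload_cost(model, 10_000_000, 5_000_000) for model in RATES}

# Print cheapest first.
for model, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{model:<18} ${cost:>8,.2f}")
```

That works out to $60 on Gemini 3.5 Flash, $200 on GPT-5.5 Pro, and $350 on Fable 5 for the identical job.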
If your workload is batch inference, statistical analysis, or lots of short prompts, Gemini’s price will matter a lot. If you need the particular behavior of Claude or the integrations of OpenAI’s ecosystem, a cheaper token price alone may not be enough to tip the decision.
Scrolling a forum thread last night reminded me: price is only half the story.
The bottom line
If you use AI as a glorified search assistant, the free tiers of Claude, ChatGPT, or Gemini will usually be fine. If you’re running agents, long code builds, or heavy research jobs, expect bills to climb under pay-as-you-go models. Subscriptions smooth the month-to-month pain but don’t always cover high-token tasks—watch the fine print on “usage limits” and “pay as you go.”
Here are quick heuristics I use and tell teams I mentor: choose Gemini for low-cost, high-volume API runs; pick GPT-5.5 Pro if you want OpenAI’s ecosystem and developer tools; use Fable 5 when your work needs long, autonomous chains even if that means higher token spend. Tools like OpenAI’s API dashboard, Anthropic’s console, and Google Cloud Billing can help you monitor consumption—and I recommend setting hard alerts before you hit expensive thresholds.
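The provider dashboards handle alerts on their side, but you can also keep a running tally inside your own pipeline. Here is a minimal sketch of that idea in Python. `BudgetGuard` is a hypothetical helper of mine, not part of any vendor SDK, and the 50%, 80%, and 100% thresholds are arbitrary defaults.

```python
class BudgetGuard:
    """Track spend against a monthly budget and report each threshold once."""

    def __init__(self, monthly_budget_usd, alert_fractions=(0.5, 0.8, 1.0)):
        self.budget = monthly_budget_usd
        self.spent = 0.0
        # Thresholds that have not fired yet, as fractions of the budget.
        self.pending = sorted(alert_fractions)

    def record(self, cost_usd):
        """Add a charge and return the thresholds this charge crossed."""
        self.spent += cost_usd
        crossed = [f for f in self.pending if self.spent >= f * self.budget]
        self.pending = [f for f in self.pending if f not in crossed]
        return crossed


guard = BudgetGuard(monthly_budget_usd=200)
print(guard.record(90))   # []
print(guard.record(20))   # [0.5]
print(guard.record(100))  # [0.8, 1.0]
```

In a real pipeline you would feed `record` the cost of each call and page someone, or stop the job, when a threshold comes back.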
Two practical notes: convert high-volume plans into project-level budgets, and instrument your pipelines to reuse context windows instead of resending entire histories. Pay-as-you-go tokens are a slow leak in your wallet if you don’t patch the pipeline.
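To put a number on that leak, here is a back-of-the-envelope Python sketch. It compares resending the full chat history on every turn against sending only the most recent turns. The sliding window is my stand-in for "reusing context" (provider-side caching is another route, not modeled here), and the turn counts and token sizes are made up for illustration.

```python
def resend_input_tokens(turns, tokens_per_turn):
    """Total input tokens when every request resends the whole history."""
    # Turn k sends k turns' worth of tokens: 1 + 2 + ... + turns.
    return tokens_per_turn * turns * (turns + 1) // 2


def windowed_input_tokens(turns, tokens_per_turn, window_turns):
    """Total input tokens when each request sends at most the last window_turns turns."""
    return sum(min(k, window_turns) * tokens_per_turn for k in range(1, turns + 1))


# Hypothetical agent run: 100 turns of roughly 1,000 tokens each.
full = resend_input_tokens(turns=100, tokens_per_turn=1_000)
windowed = windowed_input_tokens(turns=100, tokens_per_turn=1_000, window_turns=10)

print(f"{full:,}")      # 5,050,000
print(f"{windowed:,}")  # 955,000
```

Resending everything grows with the square of the conversation length, so the gap widens quickly on long agent runs. The trade-off is that a short window can drop context the model still needs.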
This market is volatile—Anthropic is moving toward an IPO, OpenAI is testing price strategy, Google is pushing aggressive pricing—and the smartest move you can make is to match the model to the task, not the brand. So what are you going to change in your stack tomorrow?