Claude Fable 5 is the most powerful model Anthropic has ever released to the public. And most people have no idea what to actually do with it.
It isn't just a smarter chatbot. It's built for enormous, long-running work, the kind of projects that used to need a whole team and weeks on the calendar. Give it a big enough goal and it will plan, build, test itself, and keep going for hours or even days until it's done.
To see what that means in practice: Stripe recently handed it a migration across 50 million lines of code, a job estimated at two months by hand. Fable 5 finished it in a single day. That's the ceiling this model is reaching for.
This guide breaks down everything you need to actually use it: what Fable 5 is, how to turn it on, when to reach for it (and when not to), how to put it to work on real builds, and the catches nobody mentions. It's back as of July 1 after almost three weeks offline, so now's the time to learn it.
What Fable 5 actually is
Fable 5 is Anthropic's most capable public model, the first of a new tier called Mythos that sits above Opus.
Fable 5 and Mythos 5 are the same model with the same raw power. The difference is access: Mythos 5 is the less-restricted version, locked to government and vetted cybersecurity orgs. Fable 5 is what everyone else gets, the same engine with stronger safety layers that can decline or reroute risky requests. Even the names tell the story: Fable comes from the Latin fabula, a deliberate nod to the Greek mythos. Same word, different language.
What sets it apart isn't chat. It's built for long-horizon agentic work: huge, multi-stage tasks that run for hours or days on their own. It plans across stages, delegates to sub-agents, writes its own tests, and uses vision to check its output against the goal. It tops SWE-bench Verified at 95% (worth knowing: most scores there are self-reported) and leads Cognition's FrontierCode eval. It ships with a 1M-token context window and up to 128k output tokens per request. GitHub, testing it early, reported the strongest results of any Claude model they'd tried.
The launch was rocky: it dropped June 9, got pulled worldwide three days later by an export-control order, and came back July 1, unchanged.
Right now, Fable 5 is included for up to 50% of your weekly usage limits on Pro, Max, and Team plans, but only until July 7. After that it moves to paid credits. Heavy users can burn through that 50% cap days early, so the real window may be even shorter. The free plan doesn't get Fable at all.
How to turn it on
Three ways, depending on where you work:
- On Claude.ai, pick Fable 5 in the model selector.
- In Claude Code, type /model and switch to Fable 5.
- On the API, use the model string claude-fable-5. It's also live on AWS Bedrock, Google Cloud, and Microsoft Foundry.
When to use it (and when you shouldn't)
Fable 5 costs exactly double Opus 4.8: $10 per million input tokens, $50 per million output. To make that concrete: one planning pass that reads 200k tokens and writes 40k costs about $4 on Fable, and $0.80 on Sonnet 5. Same tokens, five times the price. Using Fable for everything is how you burn money fast.
Reach for it when the task is genuinely big: a large migration, a multi-stage build, a refactor across a huge codebase, deep research over hundreds of documents. When you want to hand off a whole project and review the finished result. When other models keep failing on something hard.
Skip it when the task is small or quick, a fast script, a simple edit, everyday chat. Sonnet or Opus handles those for a fraction of the cost. Skip it when you need speed: Fable thinks deeply, which makes it slower. And be careful with anything security-adjacent, its safety filter flags more borderline requests than other models (more on that below).
The rule: Fable is a specialist for your hardest, longest work, not a daily driver.
How to put it to work
The core workflow that gets the most out of Fable without wasting it:
Step 1. Plan with a cheaper model first. Don't hand Fable the raw idea, planning burns tokens you don't need to spend at $50 per million. Open Opus 4.8 (or Sonnet) and have it write the spec:
Research this task and write a detailed implementation plan for another AI agent to execute: [your task]. Include: the goal, constraints, tech stack, file structure, edge cases, and a definition of done. Be specific enough that the agent never has to guess.
Step 2. Hand the plan to Fable 5. Switch models (/model in Claude Code) and give it the whole thing up front, the plan, the codebase, the docs. The 1M window is the point: Fable performs best when it sees everything at once instead of discovering context mid-task.
Here's the full plan and codebase. Execute it end to end: plan your stages, delegate to sub-agents where it helps, write tests for each stage, and don't stop until the definition of done is met. Flag anything ambiguous instead of guessing.
Step 3. Make it run until it's actually done. For long tasks, use /goal so Fable keeps working until a verifiable condition is true, not until it feels finished:
/goal all tests in tests/ pass, lint is clean, and the app builds with no errors
A separate model checks the condition, so the agent that wrote the code isn't the one grading it.
Step 4. Review the output, not the process. Come back to finished work: read its test results, check the diff, spot-check against the goal. If you find yourself steering every step, the task was either too small for Fable or the plan was too vague.
What this looks like in practice: say you're migrating an old codebase to a new framework. Opus writes the migration spec. You load the repo into Claude Code, switch to Fable, paste the spec, and set a /goal on passing tests. Fable breaks the job into stages, spins up sub-agents for separate modules, writes tests as it goes, and checks its own work with vision where UI is involved. You come back to a finished migration and review it, instead of babysitting every file.
Set it up to burn fewer tokens
Fable's price punishes sloppy setups. Three settings make the difference:
Cache your context. Prompt caching cuts cached input costs by 90%, and on long sessions input is most of your bill. On the API, mark your big stable blocks (system prompt, codebase, docs) as cacheable:
1{"type": "text", "text": big_context, "cache_control": {"type": "ephemeral"}}
Structure prompts so the stable part comes first and the changing part last, that's what makes the cache hit.
Dial the effort down where you can. Fable has an effort parameter that controls how deeply it thinks. Max effort for the hard reasoning stages, lower effort for mechanical ones (renames, formatting, boilerplate). You're paying for thinking, so spend it where thinking matters.
Handle refusals so they don't cost you. Fable's safety filter flags more borderline requests than other models. A refusal comes back as a normal response with stop_reason: "refusal", not an error, and it tells you which classifier fired. Wire in a fallback so your pipeline retries on Opus instead of dying:
1response = client.messages.create(model="claude-fable-5", ...)2if response.stop_reason == "refusal":3 response = client.messages.create(model="claude-opus-4-8", ...)
Rerouted and refused requests aren't billed at Fable prices, so the fallback costs you nothing extra.
And one habit that compounds: put your project's conventions in CLAUDE.md so Fable doesn't re-learn them every session. Every instruction you stop repeating is tokens you stop paying for.
The honest catches
It refuses more than before. After the jailbreak that got it pulled, Anthropic tightened the safety classifier. It blocks the original exploit in over 99% of cases, but it also flags more innocent requests: debugging, security research, anything that looks like vulnerability work can get declined or quietly rerouted to Opus 4.8. Refusals aren't errors, the API returns them as a normal response and tells you which classifier fired, and rerouted requests aren't billed at Fable prices. Set up a fallback so a refusal doesn't stop your work.
The free window is smaller than it looks. Up to 50% of weekly limits until July 7, and heavy users can hit that cap early.
One for teams with strict data rules: Fable runs with 30-day data retention, and zero-retention isn't available on it at all. For some companies, that alone decides whether it can be used.
The bottom line
Fable 5 isn't a better chatbot. It's a specialist built for the work that used to need a team and a calendar, huge migrations, multi-day builds, deep research. Stripe turned two months into one day with it. That's the ceiling it's reaching for.
But it's not a daily driver. It's expensive, it's slower, and its tighter safety filter gets in the way of some coding work. Use it for your hardest problems, lean on lighter models for the rest, and it becomes one of the most powerful tools you have.
The free window closes July 7. If you want to see what it can do, now's the time.
If this was useful, head to my profile and follow. I write about AI, Claude, and systems that actually run.





