Claude Fable 5 Pay-Per-Use Countdown: How to Use the Most Powerful Model Without Running Up Your Bill
Since Claude Fable 5 reopened, how to keep spending down at its high token prices has become the main topic of discussion among users. This flagship model, which Anthropic calls the “most capable widely released model,” is designed for high-intensity reasoning and long-horizon agentic tasks. It supports a 1-million-token context window and up to 128,000 tokens of output. The direct downside of that capability is that in Claude Code, Managed Agents, or long sessions, users may end up letting the model think continuously, call tools, and recheck its work repeatedly, which magnifies billing pressure.
According to Anthropic’s official page, access to Claude Fable 5 resumed on July 1, 2026, for Pro, Max, Team, and Enterprise users, and through channels including the Claude Platform, AWS, Google Cloud, and Microsoft Foundry. Official pricing is $10 per million input tokens and $50 per million output tokens. Prompt cache reads are discounted by up to 90% relative to the input price.
In its “Redeploying Fable 5” announcement, Anthropic said that Pro, Max, Team, and some Enterprise users can use the model for up to 50% of their weekly usage limits before July 7. After that, continued usage will be billed via usage credits.
Fable 5 is therefore not a model to leave on as a casual default for chat. It is more like an expensive architect and reviewer: useful for setting direction at the start of a task and enforcing quality before the task ends, while the bulk of execution work should be handled by cheaper models.
The most expensive part isn’t a single response—it’s long tasks running automatically
Fable 5’s cost pressure starts with its unit price.
At $10 per million input tokens and $50 per million output tokens, it is already a high-priced model. With short questions and brief back-and-forth answers, users may not notice much. But once you enter long-chain scenarios—such as code changes, organizing reference materials, product proposals, research tasks, and automated agent workflows—the output tokens, context, tool calls, and multiple rounds of revisions all stack up.
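The arithmetic behind this can be sketched in a few lines. The example below is a rough estimator that uses only the prices quoted in this article. It assumes cache reads receive the full 90% discount (the article says "up to"), and the function name and the token counts in the usage lines are illustrative, not taken from any official tool.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0
# Assumption: cache reads get the full "up to 90%" discount.
CACHE_READ_DISCOUNT = 0.90


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of one request.

    cached_input_tokens is the part of input_tokens served from the prompt cache.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    cached_price = INPUT_PRICE_PER_M * (1 - CACHE_READ_DISCOUNT)
    cost = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * cached_price
        + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000
    return round(cost, 6)


if __name__ == "__main__":
    # A short chat turn: 2,000 tokens in, 500 tokens out.
    print(estimate_cost(2_000, 500))
    # One agent turn carrying a 200,000-token context, 8,000 tokens out.
    print(estimate_cost(200_000, 8_000))
```

Under these assumptions the short chat turn costs about $0.045 and the single long-context agent turn about $2.40, which is why a session that repeats such turns dozens of times adds up quickly.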
Fable 5’s strengths are exactly what make consumption easy to amplify.
The official documentation positions it as suitable for long-horizon agentic work. It can break tasks into multiple phases, proactively check for gaps, and when necessary continue calling tools or subtasks to move things forward. For complex tasks, this is valuable: users don’t have to prompt every step manually, and the model can iterate toward the goal on its own.
However, if the goal is unclear, the boundaries are too broad, or the time horizon is too long, the model may keep running to make the task more complete. The original author said that during their first few hours of testing, they nearly exhausted the usage limits, even though they didn’t carry out particularly extravagant tasks. Experiences like this are user feedback rather than an official cost estimate, but they point to a real risk: after July 7, long sessions, automatic loops, and accidental use of Fable as the default model will translate directly into credit consumption.
“10-80-10”: Use Fable only at the critical ends
The core method proposed in the original article is to change Fable 5 from an “end-to-end executor” into a “front-and-back reviewer.”
The so-called “10-80-10” roughly corresponds to three stages of an AI project.
Use Fable for the first 10% to plan. Have it define the task structure, execution path, success criteria, constraints, and the delivery format. Its best use is not mechanical execution, but establishing a clear plan before a complex task begins.
Hand the middle 80% to cheaper models for execution. A large number of tokens is typically consumed in repeated edits, format adjustments, small code fixes, organizing reference materials, routine generation, and back-and-forth iterations. This portion doesn’t necessarily require Fable 5 to be involved end-to-end; it can be delegated to Opus, Sonnet, Haiku, or other lower-cost models.
Bring Fable back for the final 10% to perform a review. After the cheaper model completes the main execution, have Fable compare the results against the original plan: whether the work has deviated from the goal, whether anything is missing, what needs patching, and whether it meets release standards. Since at that point it’s reviewing existing deliverables rather than generating everything from scratch, token consumption is usually much lower.
This method is not an official money-saving formula. The original author mentioned that in some scenarios, replacing the execution layer with a cheaper model can reduce token spend by more than 50%, but that should be understood as usage experience. The truly replicable idea is that high-end models don’t need to handle all the token-intensive labor—they are best placed in judgment, architecture, and error detection.
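To make the split concrete, here is a minimal sketch that compares running a whole project on Fable 5 against a 10-80-10 split. The Fable prices are the ones quoted in this article. The cheaper model's price is a hypothetical placeholder, and the sketch makes the simplifying assumption that token volume divides in the same 10/80/10 proportions as the stages, so the result is illustrative rather than a forecast.

```python
# USD per million tokens, as (input, output).
FABLE_PRICE = (10.0, 50.0)  # quoted in the article
CHEAP_PRICE = (1.0, 5.0)    # hypothetical placeholder, not a real price


def cost(tokens_in, tokens_out, price):
    """USD cost of a token volume at an (input, output) price pair."""
    return (tokens_in * price[0] + tokens_out * price[1]) / 1_000_000


def compare(total_in, total_out, plan_share=0.10, review_share=0.10):
    """Return (all_fable_cost, split_cost) for one project.

    Simplifying assumption: token volume is divided in the same
    proportions as the planning, execution, and review stages.
    """
    fable_share = plan_share + review_share
    cheap_share = 1 - fable_share
    all_fable = cost(total_in, total_out, FABLE_PRICE)
    split = (
        cost(total_in * fable_share, total_out * fable_share, FABLE_PRICE)
        + cost(total_in * cheap_share, total_out * cheap_share, CHEAP_PRICE)
    )
    return all_fable, split


if __name__ == "__main__":
    all_fable, split = compare(total_in=5_000_000, total_out=1_000_000)
    saving = 1 - split / all_fable
    print(f"all Fable: ${all_fable:.2f}, 10-80-10: ${split:.2f}, saving: {saving:.0%}")
```

The size of the saving depends almost entirely on the placeholder price and on how many tokens the planning and review stages really consume, so it is worth rerunning with your own numbers before drawing conclusions.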
/goal and /loop make agents more usable—and also make costs harder to notice
Another change with Fable 5 is that it is better suited to agent-style workflows.
In traditional prompting, users ask questions and the model responds. After users check, they ask follow-ups, and the loop is driven by people. Whether to continue, revise, or stop at each step is decided by the user.
In the Claude Code environment, /goal and /loop turn this kind of process into a more automated execution workflow.
According to Anthropic’s documentation, /goal keeps running until its conditions are met or the user clears it, and it can display token spend. The official guidance also recommends adding time or round limits such as “stop after 20 rounds.” A good goal should not just say “help me fix code”; it should specify what must be accomplished, how to verify the results, which limits must not be exceeded, and when to stop.
/loop repeats a prompt at intervals (for example, checking deployment status every 5 minutes), and Claude can also choose the interval dynamically. According to the official documentation, loop tasks follow a 7-day expiration rule. These features are well suited for monitoring, iteration, checking, long-term fixes, and agent tasks, allowing the model to keep moving forward without the user repeatedly prompting it.
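The idea of pairing a goal with hard stop conditions can be illustrated outside of any specific tool. The sketch below is not how Claude Code implements /goal; it is a generic loop with a round limit and a token budget. The `step` callback, `RunResult`, and the default budget are hypothetical names and values chosen for the example; only the 20-round limit echoes the guidance quoted above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RunResult:
    reason: str       # "goal_met", "max_rounds", or "budget_exhausted"
    rounds: int
    tokens_used: int


def run_with_limits(
    step: Callable[[int], Tuple[bool, int]],
    max_rounds: int = 20,
    token_budget: int = 500_000,
) -> RunResult:
    """Call step(round_number) until the goal is met or a hard limit is hit.

    step is a hypothetical callback that performs one round of work and
    returns (goal_met, tokens_spent_this_round).
    """
    tokens_used = 0
    for round_number in range(1, max_rounds + 1):
        goal_met, spent = step(round_number)
        tokens_used += spent
        if goal_met:
            return RunResult("goal_met", round_number, tokens_used)
        if tokens_used >= token_budget:
            return RunResult("budget_exhausted", round_number, tokens_used)
    return RunResult("max_rounds", max_rounds, tokens_used)


if __name__ == "__main__":
    # A fake task that never reaches its goal and spends 40,000 tokens per round.
    result = run_with_limits(lambda n: (False, 40_000))
    print(result)
```

In this run the budget check fires before the round limit does, which is the point of having both: whichever boundary is reached first ends the work, regardless of whether the goal was met.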
There’s also a cost risk here.
Automatic loops change “humans manually confirm the next step” into “the model continues running according to the plan.” If the goal is too broad, the end conditions are vague, the interval is too short, or the duration is too long, Fable 5 may continue consuming tokens even after the user leaves. The better the model is at finding problems, adding steps, and doing self-checks, the more users need to set hard boundaries in advance.
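A rough projection shows how interval and duration compound. The sketch below uses the prices quoted in this article together with the 5-minute interval and 7-day expiration mentioned above. The per-run token counts are invented for illustration, and the function name is hypothetical.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def project_loop_cost(interval_minutes, duration_hours,
                      input_tokens_per_run, output_tokens_per_run):
    """Return (number_of_runs, estimated_usd) for a fixed-interval loop."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    runs = int(duration_hours * 60 // interval_minutes)
    total_in = runs * input_tokens_per_run
    total_out = runs * output_tokens_per_run
    usd = (total_in * INPUT_PRICE_PER_M + total_out * OUTPUT_PRICE_PER_M) / 1_000_000
    return runs, usd


if __name__ == "__main__":
    # Hypothetical check: 20,000 tokens of context and 1,000 tokens of output
    # per run, every 5 minutes, left running for the full 7 days.
    runs, usd = project_loop_cost(5, 7 * 24, 20_000, 1_000)
    print(f"{runs} runs, about ${usd:.2f}")
```

With these invented token counts, a check every 5 minutes for 7 days comes to 2,016 runs and roughly $504, while stretching the interval to 30 minutes cuts both figures to one sixth.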
Therefore, 10-80-10 and loop engineering are best used together: Fable 5 designs the loop, sets goals, and defines acceptance criteria; the execution layer is delegated to cheaper models as much as possible; only when the loop is closing, results need to be judged, or quality must be enforced at critical points should Fable 5 step in.
After July 7, you need to recheck both model selection and spending caps
For ordinary users, the most direct risk isn’t complex workflows—it’s misuse.
The original article warns that when users open Claude Code or the Claude app, the model may default to Fable. This reads more like one user’s experience, and official materials do not present it as a universal rule. However, while the new model has just reopened and the platform is encouraging users to test it, some users may indeed accidentally use the most expensive model for routine chats, simple sorting, or low-value tasks.
Once credit billing begins, this kind of misuse becomes more costly. Simple conversations, lightweight rewrites, formatting cleanup, and ordinary summaries don’t necessarily require Fable 5. Checking the model selector before each session may become a basic habit for frequent users.
Another practical reminder is to set a spending cap.
According to Anthropic’s support documentation, usage credits must be enabled in Settings > Usage. Users can set payment methods and purchase or pre-load credits, and also configure a monthly spending cap, auto-reload, and usage alerts. Claude Code also supports usage credits.
If there’s no monthly limit, long tasks, automatic loops, and agent-style execution can accumulate noticeable costs in a short period of time. For high-frequency users, setting monthly spending limits, enabling reminders, and clearly writing stop conditions in /goal or /loop is no longer just a financial setting—it’s part of using an agent model.
The new habit that models like Fable 5 call for is allocating models by task value and difficulty. Planning, complex judgment, and final review are worth using Fable; repetitive execution, ordinary generation, and light edits are better suited to cheaper models. High-end models are moving from “smarter chatbots” to “automatically working agents.” The stronger the capability, the more users need to set goals, boundaries, time, and budgets in advance. Otherwise, a runaway bill may arrive before the task itself fails.
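The allocation rule described here can be written down as a small lookup table. This is only a sketch of the idea: the tier labels and task-type names are hypothetical, and which concrete model sits behind each tier is left to the reader.

```python
PREMIUM = "premium"  # e.g. Fable 5
ECONOMY = "economy"  # e.g. a cheaper model

# Task types follow the categories named in the article.
ROUTES = {
    "planning": PREMIUM,
    "complex_judgment": PREMIUM,
    "final_review": PREMIUM,
    "repetitive_execution": ECONOMY,
    "ordinary_generation": ECONOMY,
    "light_edit": ECONOMY,
}


def pick_tier(task_type: str) -> str:
    """Return the model tier for a task type.

    Unknown task types fall back to the economy tier, so using the
    expensive model is always an explicit choice.
    """
    return ROUTES.get(task_type, ECONOMY)


if __name__ == "__main__":
    for task in ("planning", "light_edit", "unlabeled_task"):
        print(task, "->", pick_tier(task))
```

Defaulting unknown work to the cheaper tier mirrors the advice about checking the model selector: the premium model has to be chosen on purpose.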