Claude Token Counter, now with model comparisons

Simon Willison's tool reveals that Claude Opus 4.7's new tokenizer inflates token counts by ~46% for text and up to 3x for images compared to its predecessor, leading to higher real-world costs despite unchanged official pricing.

Large Language Models Developer Tools 成本优化分词器模型对比

KEY POINTS

Claude Opus 4.7 is the first model to change its tokenizer, resulting in significantly higher token counts for identical text.
Officially stated token inflation is 1.0-1.35x, but a real-world system prompt test showed 1.46x inflation.
Image processing token inflation is even more severe, with a high-res image costing 3.01x more tokens than before.
Despite unchanged API pricing, the token inflation means real-world usage costs could increase by over 40%.

ANALYSIS

The Spark: A Simple Tool Reveals the Hidden Cost of Model Upgrades On the surface, this is just a developer (Simon Willison) adding a "model comparison" feature to his handy tool. But the reason it's worth discussing is that it perfectly captures a subtle yet wallet-impacting change in major model iterations: the tokenizer update. When Anthropic launched Claude Opus 4.7, they casually mentioned an "updated tokenizer," estimating a token increase of 1.0 to 1.35x. Simon's tool turned that vague range into a specific, somewhat alarming number with a simple paste test: 1.46x. It's like your favorite café switching to new coffee beans, claiming "better flavor," but not telling you that the grounds for an espresso increased by nearly half—while the price stayed the same. Unpacked: What Does Token Inflation Really Mean? First, it's crucial to understand that tokens are the "basic currency" for large models. The text you input, images, and the model's output are all broken down into tokens for billing and computation. The tokenizer is the "translator" responsible for this breakdown. Opus 4.7 has a new "translator" that segments the same content (like a system prompt) into more tokens. The key point is: Anthropic kept the same API pricing for Opus 4.7 and 4.6 ($5 per million input tokens, $25 per million output). But because the token count increased, you end up paying more for processing the same content. Simon's test revealed that for text, costs could rise by about 40%; for high-resolution images, the cost increase could be as high as 200%! This unveils a strategy of "implicit price hikes" in model upgrades: changing the unit of measurement without changing the unit price. Trend Insight: The "B-Side" of Model Optimization and the Warning Value of Dev Tools This incident reveals a deeper trend: model optimization is multi-dimensional, and sometimes improvements in one area (e.g., image understanding, long-context handling) come at a "cost" in another (e.g., computational efficiency, token economics). Opus 4.7 raised the image resolution limit (supporting up to 2576 pixels), which is great, but the trade-off is that processing a standard photo now consumes 3x the tokens. For developers building AI applications, this means re-evaluating cost structures while enjoying new model capabilities. It also highlights the unique value of lightweight, focused developer tools like Simon's Token Counter. It's not a complex framework, yet it provides insights based on real data that official documentation cannot offer. In an era of rapid model iteration, such tools help us quickly validate vendor claims and make more informed technical and cost decisions. Practical Value: What Should You Do as a Developer/User?

Re-evaluate Costs: If you're using the Claude Opus series and plan to upgrade to 4.7, don't assume costs remain unchanged. Use similar tools to test your typical prompts and content to quantify the actual token inflation rate. 2. Watch for Tokenizer Changes: When other model vendors (like OpenAI, Google) update their models in the future, pay attention to whether they've changed the tokenizer. This could be a critical yet understated variable affecting cost and performance. 3. Leverage Comparison Tools: When evaluating new models, don't just look at benchmarks. "Engineering metrics" like token counts, latency, and real-world task costs are equally, if not more, important. Counterintuitive/Unexpected Most attention is drawn to improvements in a model's "intelligence" (like reasoning or creative writing). But this event reminds us that changes at the infrastructure level (like the tokenizer) can have a more direct and quantifiable impact on real-world applications. You might think upgrading a model means "more for the same price," but in reality, it might quietly change the definition of "more." For applications handling large volumes of images or multilingual content, this token inflation could have an exponential impact and must be taken seriously.

Analysis by BitByAI · Read original

Originally from Simon Willison · Analyzed by BitByAI