Claude's Chinese tokenization: the same content costs 65% more tokens than the English baseline, while OpenAI costs only 15% more


According to media reports, AI researcher Aran Komatsuzaki translated Rich Sutton's well-known essay "The Bitter Lesson" into nine languages and fed each version into the tokenizers of six models: OpenAI, Gemini, Qwen, DeepSeek, Kimi, and Claude. Using the token count of the original English text on OpenAI's tokenizer as a 1x baseline, he compared how many times more tokens each language consumed on each model. The results: the same content in Chinese cost 1.65 times the baseline on Claude, but only 1.15 times on OpenAI. Hindi was even more extreme on Claude, at more than 3 times the baseline. Across the six models tested, Anthropic ranked last.

Translation changes the length of a text, so the multiples relative to English are not perfectly precise. More telling is how the identical Chinese paragraph fares across models (against the same baseline): Kimi used only 0.81 times (fewer tokens than the English original), Qwen used 0.85 times, while Claude rose to 1.65 times. The text is identical; the gap is purely tokenizer efficiency. That the Chinese models tokenize Chinese even more compactly than English shows the problem is not Chinese itself, but whether the tokenizer has been optimized for the language.
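The effect is easy to reproduce with a toy model. The sketch below (illustrative only; real tokenizers learn byte-level BPE merges, and the vocabularies here are hypothetical) uses greedy longest-match against a vocabulary: when the vocabulary contains long merges for a language, that language compresses into few tokens; when it doesn't, the text falls back to single characters and the token count balloons.

```python
# Toy greedy longest-match tokenizer: vocabulary coverage, not the language
# itself, determines how many tokens the same text costs.

def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry; fall back to single chars."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(len(text) - i, 8), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                tokens.append(piece)
                i += length
                break
    return tokens

# A vocabulary trained mostly on English has long English merges...
english_heavy = {"the ", "bitter ", "lesson", "苦涩"}   # hypothetical vocab
# ...while one optimized for Chinese also has multi-character Chinese merges.
chinese_heavy = english_heavy | {"苦涩的", "教训"}

en = "the bitter lesson"
zh = "苦涩的教训"   # "The Bitter Lesson" in Chinese

print(len(tokenize(en, english_heavy)))  # 3 tokens: long merges cover English
print(len(tokenize(zh, english_heavy)))  # 4 tokens: Chinese falls back to chars
print(len(tokenize(zh, chinese_heavy)))  # 2 tokens once Chinese merges exist
```

Note that the Chinese string is shorter than the English one, yet under the English-heavy vocabulary it costs more tokens; adding Chinese merges reverses the gap, mirroring the Kimi/Qwen results.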

For users, more tokens mean API calls cost more, responses take longer to start, and the context window fills up faster. Tokenizer efficiency tracks each language's share of the training data: more English data means English can be compressed into longer, more efficient tokens; scarcer non-English data means text gets split into smaller, more fragmented pieces. Aran's conclusion: the bigger a language's market, the more tokens it saves.
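The practical impact is straightforward arithmetic. Using the multipliers reported above (and a hypothetical 200k-token context window purely for illustration), per-request cost scales with the multiplier and the effective context shrinks by the same factor:

```python
# Back-of-envelope impact of tokenizer efficiency on cost and context,
# using the article's reported multipliers (English on OpenAI = 1.0x).
multipliers = {
    "Claude / Chinese": 1.65,
    "OpenAI / Chinese": 1.15,
    "Qwen / Chinese":   0.85,
    "Kimi / Chinese":   0.81,
}

context_window = 200_000  # hypothetical context size, in baseline tokens

for name, m in multipliers.items():
    # The same text costs m times the tokens, so cost scales with m and
    # the context effectively holds 1/m as much content.
    effective = int(context_window / m)
    print(f"{name}: {m:.2f}x tokens, effective context ~ {effective:,}")

# Same Chinese text, Claude vs Kimi:
print(round(1.65 / 0.81, 2))  # -> 2.04, i.e. roughly twice the tokens
```

By this estimate, the identical Chinese document consumes about twice as many tokens on Claude as on Kimi, with cost and context headroom shifting accordingly.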
