Third-Party Evaluation Released: Thinking Machines' New Model Ties with GPT-Realtime-2, Tops Audio Rankings

According to monitoring by Dongcha Beating, data platform Scale Labs has published its latest Audio MC S2S rankings. Thinking Machines' newly released TML-Interaction-Small model scored 43.4% on APR, tying for first place with OpenAI's GPT-Realtime-2 (xHigh). On raw scores, GPT-Realtime-2 (xHigh) ranks highest at 48.45 points, with TML-Interaction-Small next at 43.36 points. Because the gap between them falls within the statistical margin of error, the two models are officially rated as tied for first place. The second tier consists of the standard version of GPT-Realtime-2 (37.61 points), Gemini 3.1 Flash Live with thinking mode enabled (36.06 points), and the older GPT-Realtime-1.5. Scale Labs noted that TML-Interaction-Small shows a long-context awareness that is rare among existing full-duplex models, while maintaining fast response speed in conversation.
