Zhipu GLM-5V-Turbo Technical Report: Beats Claude Opus 4.6 on Design2Code, Generating Code Directly from Screenshots


CryptoWorld News reports that Zhipu AI has released the GLM-5V-Turbo technical report; the model launched on the z.ai API and OpenRouter in early April. The report fills in the methodology, while the model itself has not been open-sourced. GLM-5V-Turbo is Zhipu's first multimodal programming foundation model: it supports a context length of around 200k tokens and can connect to agent frameworks such as Claude Code and OpenClaw. From the pre-training stage onward, visual perception is integrated into the full loop of reasoning, planning, tool invocation, and execution. The architecture has three key design elements: a new visual encoder, CogVit, pre-trained via dual-teacher distillation from SigLIP2 and DINOv3; multimodal multi-token prediction (MMTP) aligned through contrastive learning on 8 billion Chinese-English bilingual image-text pairs; and a shared learnable special token that replaces direct transmission of visual embeddings, reducing communication complexity across pipeline stages and yielding more stable joint reinforcement learning across three levels: perception, reasoning, and agent execution. On benchmarks, GLM-5V-Turbo scores 94.8 on Design2Code, surpassing Claude Opus 4.6.
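The report summary mentions dual-teacher distillation from SigLIP2 and DINOv3 but gives no loss details. A minimal stdlib-only sketch of one plausible form of such an objective: the student embedding is pulled toward both teacher embeddings with a weighted regression loss. The MSE objective, the 0.5 weighting, and all vector values here are illustrative assumptions, not details from the report.

```python
def mse(a, b):
    """Mean squared error between two equal-length embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def dual_teacher_loss(student, teacher_a, teacher_b, alpha=0.5):
    """Weighted sum of distillation losses against two teachers.

    alpha balances the two teachers; the actual weighting and loss
    form used for CogVit are not disclosed in the report.
    """
    return alpha * mse(student, teacher_a) + (1 - alpha) * mse(student, teacher_b)

# Toy 4-dim embeddings standing in for CogVit (student) and the
# SigLIP2 / DINOv3 teachers (hypothetical values).
student = [0.1, 0.2, 0.3, 0.4]
siglip  = [0.0, 0.2, 0.4, 0.4]
dino    = [0.2, 0.2, 0.2, 0.4]

loss = dual_teacher_loss(student, siglip, dino)
```

In a real training setup the student would be optimized to minimize this loss over large batches of image-text data; the sketch only shows the shape of the objective.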
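The claimed benefit of the shared learnable special token is reduced communication across pipeline stages. A back-of-the-envelope sketch of why, with assumed (not reported) sizes: passing raw visual embeddings costs one vector per image patch, while a shared token only requires each stage to look up its own learned embedding, so essentially one token's worth of state crosses the stage boundary.

```python
def payload_floats(num_patches, hidden_dim, use_shared_token):
    """Floats transferred between pipeline stages per image.

    Direct embedding transmission sends num_patches * hidden_dim floats;
    with a shared learnable special token, each stage holds the embedding
    locally, so at most hidden_dim floats (one token) cross the boundary.
    num_patches and hidden_dim are illustrative assumptions.
    """
    return hidden_dim if use_shared_token else num_patches * hidden_dim

dense  = payload_floats(num_patches=1024, hidden_dim=4096, use_shared_token=False)
shared = payload_floats(num_patches=1024, hidden_dim=4096, use_shared_token=True)
```

Under these assumed sizes the per-image payload shrinks by a factor equal to the patch count, which is the kind of reduction that would make pipeline-parallel reinforcement learning cheaper to synchronize.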
