OpenAI technical staffer challenges V4's hardware recommendations point by point: the chapter that stunned the industry in V3 is “unexpected” this time

ME News: On April 24 (UTC+8), according to monitoring by Beating, Clive Chan, a member of OpenAI's technical staff, said that DeepSeek's V4 technical report is still top-tier overall, but that its chapter of hardware recommendations for chip manufacturers is “surprisingly mediocre, even containing errors,” in sharp contrast with V3. The Q&A on V3's hardware section was one of the most popular discussion sessions at the academic conference ISCA, and its recommendations were specific down to interconnect standards that the industry is still drafting, whereas V4's are much vaguer. Chan challenged the chapter point by point.

On power consumption, the report states that software optimization lets the chip's computation, storage, and communication all run at full load simultaneously, and it recommends that chip manufacturers leave more power headroom. Chan calls this “exactly backfiring”: the chip's total power is capped by the physical limits of the process, so reserving more power headroom means lowering the operating frequency, which yields less compute.
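Chan's argument can be illustrated with the standard first-order model of dynamic CMOS power, P ≈ C·V²·f. The sketch below is a toy calculation, not anything taken from the report or from Chan: the function name, the 700 W cap, the 2.0 GHz base frequency, and the headroom fractions are all illustrative assumptions, and voltage is held fixed so that frequency scales linearly with the power actually budgeted.

```python
def max_frequency_ghz(power_cap_w, headroom_fraction, base_freq_ghz=2.0):
    """Frequency sustainable when part of the power cap is held in reserve.

    Toy model (assumption): dynamic power P = C * V^2 * f with C and V fixed,
    so f scales linearly with the budgeted power. All numbers are hypothetical.
    """
    if not 0.0 <= headroom_fraction < 1.0:
        raise ValueError("headroom_fraction must be in [0, 1)")
    budget_w = power_cap_w * (1.0 - headroom_fraction)
    return base_freq_ghz * budget_w / power_cap_w


# Compare the sustainable frequency for a few hypothetical headroom settings.
for headroom in (0.0, 0.1, 0.2):
    freq = max_frequency_ghz(700.0, headroom)
    print(f"headroom {headroom:.0%}: {freq:.2f} GHz")
```

Under these assumptions, every watt held in reserve comes directly out of the frequency budget. In practice voltage usually has to rise with frequency, which changes the shape of the curve but not the direction of the trade-off.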

On data transfer between GPUs, the report says it has the receiving GPU actively read the data (pull) rather than having the sender push it, because the notification overhead of push is too high. Chan disputes this conclusion, arguing that pull is actually slower and that the fix is to improve the network interface card's data-handling capability. The two may, however, be addressing different levels of the problem: the report is concerned with the overhead of the notification mechanism, while Chan is concerned with the latency of the transfer itself.
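The two positions can be separated with a toy latency model. This is a sketch under stated assumptions, not a measurement or a description of either party's system: every function name, parameter, and number is hypothetical. It assumes a pull costs a request trip plus the data trip back, while a push costs one data trip plus whatever the receiver pays to learn that the data has arrived.

```python
def pull_latency_us(one_way_us, transfer_us):
    # Reader sends a request, then the data travels back: two network trips.
    return 2 * one_way_us + transfer_us


def push_latency_us(one_way_us, transfer_us, notify_us):
    # Writer sends the data in one trip; the receiver then pays a
    # notification cost to find out that the data has landed.
    return one_way_us + transfer_us + notify_us


ONE_WAY_US = 2.0    # hypothetical one-way network latency
TRANSFER_US = 10.0  # hypothetical time to move the payload

pull = pull_latency_us(ONE_WAY_US, TRANSFER_US)
for notify_us in (0.5, 2.0, 5.0):
    push = push_latency_us(ONE_WAY_US, TRANSFER_US, notify_us)
    faster = "push" if push < pull else "pull" if pull < push else "tie"
    print(f"notify {notify_us} us: pull={pull} us, push={push} us -> {faster}")
```

In this model push is faster only while the notification cost stays below one network trip, which is consistent with the report and Chan each emphasizing a different term.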

On activation functions, the report suggests replacing SwiGLU with a simpler function to reduce the computational burden. Chan considers this unnecessary, because Sonic MoE has already shown that optimal performance is achievable with SwiGLU. Chan suspects that DeepSeek may have “intentionally weakened this chapter.” (Source: BlockBeats)
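For readers unfamiliar with the function under discussion, here is a minimal element-wise sketch of SwiGLU next to a cheaper gated alternative. The article does not name the simpler function the report has in mind, so the ReLU-gated variant below is purely a stand-in chosen for illustration, and the function names are hypothetical.

```python
import math


def silu(x):
    # SiLU / Swish: x * sigmoid(x). Needs an exponential and a division.
    return x / (1.0 + math.exp(-x))


def swiglu(gate, up):
    # SwiGLU on one element: the SiLU-activated gate scales the "up" value.
    return silu(gate) * up


def relu_glu(gate, up):
    # Stand-in for "a simpler function" (assumption): a ReLU gate,
    # which needs only a comparison instead of an exponential.
    return max(gate, 0.0) * up


# Compare the two gates on a few sample (gate, up) pairs.
for gate, up in [(-1.0, 2.0), (0.0, 2.0), (1.0, 2.0)]:
    print(f"gate={gate}: swiglu={swiglu(gate, up):.4f}, relu_glu={relu_glu(gate, up):.4f}")
```

The per-element saving is the exponential and division inside `silu`. Whether that saving matters in a full MoE layer is exactly what Chan disputes.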
