Open source! Nous has moved the filtering logic outside the kernel, so there is no need to modify the underlying CUDA code or add training objectives: it is plug-and-play. This addresses a key pain point of long-context infrastructure.

MeNews
Nous Open-Sources Lighthouse Attention: 17x Speedup at 512K on a Single B200
According to AIMPACT, Nous Research has open-sourced Lighthouse Attention, a mechanism for long-context pretraining. On a single B200 GPU, processing a 512K-token context is approximately 17 times faster, and at 98K tokens the end-to-end speedup is 1.4–1.7 times. The mechanism works in two stages, coarse screening followed by precise computation: it uses multi-level summaries to select the most relevant segments, stitches them into a shorter sequence, and passes that sequence to FlashAttention. Because the selection logic sits outside the attention kernel, no low-level code changes or additional training objectives are required. To keep the model from losing its ability to read text word by word as a result of skipping text, training runs mostly in the accelerated mode and switches back to full attention briefly at the end. In experiments with a 530-million-parameter model and 50 billion tokens, training time was significantly reduced, and final performance matched or exceeded that of conventional baselines.
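The two-stage idea described above (coarse selection, then dense attention on a shortened sequence) can be illustrated with a toy sketch. This is not the Nous implementation: it assumes a single query vector, a single level of mean-pooled block summaries instead of multi-level ones, top-k block selection, and a plain softmax attention function standing in for FlashAttention. All function names and the `block_size`, `top_k`, and `full_fraction` parameters are hypothetical, and the report does not say what share of training uses full attention, so `0.1` is only a placeholder.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dense_attention(q, keys, values):
    """Plain softmax attention for one query (stand-in for a FlashAttention call)."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [dot(q, k) * scale for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

def select_blocks(q, keys, block_size, top_k):
    """Coarse pass: score each block by its mean key, keep the top_k blocks."""
    n_blocks = math.ceil(len(keys) / block_size)
    scored = []
    for b in range(n_blocks):
        block = keys[b * block_size:(b + 1) * block_size]
        summary = [sum(col) / len(block) for col in zip(*block)]
        scored.append((dot(q, summary), b))
    # Return the kept block indices in their original order.
    return sorted(b for _, b in sorted(scored, reverse=True)[:top_k])

def coarse_to_fine_attention(q, keys, values, block_size=4, top_k=2):
    """Fine pass: gather the kept blocks into a short sequence, attend densely."""
    keep = select_blocks(q, keys, block_size, top_k)
    idx = [i for b in keep
           for i in range(b * block_size, min((b + 1) * block_size, len(keys)))]
    short_k = [keys[i] for i in idx]
    short_v = [values[i] for i in idx]
    return dense_attention(q, short_k, short_v), idx

def attention_mode(step, total_steps, full_fraction=0.1):
    """Hypothetical schedule: sparse for most of training, full attention at the end."""
    switch_step = int(total_steps * (1.0 - full_fraction))
    return "full" if step >= switch_step else "sparse"

# Toy data: 16 tokens in 4 blocks; blocks 1 and 3 point the same way as the query.
q = [1.0, 0.0]
keys = [[-1.0, 0.0]] * 4 + [[2.0, 0.0]] * 4 + [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4
values = [[float(i)] for i in range(16)]
out, idx = coarse_to_fine_attention(q, keys, values, block_size=4, top_k=2)
```

Because the selection happens before the attention call, the dense routine only ever sees an ordinary, shorter sequence, which is why a mechanism of this shape would not need kernel changes. Setting `top_k` to the total number of blocks recovers full attention in this sketch.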