Nous Open-Sources Lighthouse Attention: 17x Speedup at 512K Context on a Single B200


Nous Research has open-sourced Lighthouse Attention, a long-context pretraining mechanism. On a single B200 GPU, it processes 512K-token sequences approximately 17 times faster than traditional attention mechanisms, and it achieves a 1.4x to 1.7x end-to-end training speedup at 98K length.
Traditional attention mechanisms compute pairwise relationships between every pair of tokens, so the computational cost grows quadratically as the text lengthens. Lighthouse Attention takes a coarse-to-fine approach: it quickly scans compressed summaries of the text at different levels, scores and selects the core segments to form a short sequence, then hands that sequence directly to the efficient FlashAttention kernel. Because the selection logic is completely separated from the core kernel, developers are freed from writing low-level code and no extra training objective is needed. A rough sketch of this idea appears below.
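The following is a minimal sketch of the coarse-to-fine selection pattern described above, not the actual Lighthouse Attention implementation: it compresses key/value blocks into summaries (mean pooling here, as a stand-in for the multi-level summaries the article describes), scores queries against those summaries, gathers only the top-scoring blocks, and hands the resulting short sequence to a fused attention kernel. All names (`coarse_to_fine_attention`, `block_size`, `top_k`) are illustrative assumptions, and PyTorch's `scaled_dot_product_attention` stands in for FlashAttention.

```python
import torch
import torch.nn.functional as F

def coarse_to_fine_attention(q, k, v, block_size=64, top_k=8):
    """q, k, v: (heads, seq_len, dim). Returns (heads, q_len, dim).

    Illustrative sketch only; not the Lighthouse Attention API.
    """
    h, q_len, d = q.shape
    n_blocks = k.shape[1] // block_size

    # Coarse pass: compress each key/value block into one summary vector.
    # Mean pooling is an assumption; the real mechanism uses multi-level
    # compressed summaries of the text.
    k_blocks = k[:, : n_blocks * block_size].reshape(h, n_blocks, block_size, d)
    v_blocks = v[:, : n_blocks * block_size].reshape(h, n_blocks, block_size, d)
    summaries = k_blocks.mean(dim=2)                          # (h, n_blocks, d)

    # Score the queries against the block summaries and keep the
    # highest-scoring blocks per head.
    scores = torch.einsum("hqd,hbd->hqb", q, summaries).mean(dim=1) / d**0.5
    sel = scores.topk(min(top_k, n_blocks), dim=-1).indices   # (h, top_k)

    # Fine pass: gather only the selected blocks into a short sequence...
    rows = torch.arange(h).unsqueeze(1)
    k_sel = k_blocks[rows, sel].reshape(h, -1, d)
    v_sel = v_blocks[rows, sel].reshape(h, -1, d)

    # ...and hand it to a fused kernel (stand-in for FlashAttention).
    return F.scaled_dot_product_attention(q, k_sel, v_sel)

# Usage: a 4096-token context is reduced to 8 blocks of 64 tokens each,
# so the kernel attends over 512 tokens instead of 4096.
q = torch.randn(8, 128, 64)
k = torch.randn(8, 4096, 64)
v = torch.randn(8, 4096, 64)
out = coarse_to_fine_attention(q, k, v)   # (8, 128, 64)
```

Note the separation the article highlights: the scoring and gathering happen in ordinary framework code before the kernel call, so swapping in a different selection heuristic requires no low-level kernel changes.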