Hugging Face is directly available for download; clone it tonight with git to test the latency.

View Original
MeNews
PrismML launches 1.58-bit model Ternary Bonsai, with parameters reduced by 9 times, surpassing peers in intelligence
PrismML releases the Ternary Bonsai series, using 1.58-bit weights {-1, 0, +1}, with VRAM only one-ninth of a 16-bit model. The 8B/4B/1.7B sizes are open-sourced on Hugging Face and natively run on Apple devices. The 8B weights are approximately 1.75 GB, with a benchmark score of 75.5, leading among peers. On the iPhone 17 Pro Max, the 8B model runs at 27 tokens/sec, with a 3–4 times improvement in energy efficiency. The weights are distributed under Apache 2.0 and run natively on Apple devices via the MLX framework.
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
  • Reward
  • Comment
  • Repost
  • Share
Comment
Add a comment
Add a comment
No comments
  • Pinned