PrismML has released the Ternary Bonsai series of language models, which use 1.58-bit ternary weights {-1, 0, +1} and need about one-ninth the memory of a 16-bit model. The 8B, 4B, and 1.7B models are open-sourced on Hugging Face. The 8B weights are approximately 1.75 GB and reach an average benchmark score of 75.5, leading among peers. On the iPhone 17 Pro Max, the 8B model runs at 27 tokens/sec, with a 3 to 4 times improvement in energy efficiency. The weights are distributed under Apache 2.0 and run natively on Apple devices via the MLX framework.

MeNews

2026-05-21 05:44:03

ME News, April 17 (UTC+8): According to monitoring by Dongcha Beating, PrismML has released the Ternary Bonsai series of language models. The models use 1.58-bit ternary weights to cut memory usage to about one-ninth of a 16-bit model while maintaining high performance. The series comes in 8B, 4B, and 1.7B parameter sizes, is open-sourced on Hugging Face, and runs natively on Apple devices.
A 1.58-bit model restricts every weight in the neural network to one of three values: {-1, 0, +1}. The name comes from the information content of a three-valued weight, log2(3) ≈ 1.58 bits. Compared with earlier ultra-compressed 1-bit models, whose weights are only {-1, +1}, the added "0" value lets the model switch off redundant connections, so it can retain complex reasoning capabilities at a very small size.
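To make the three-value idea concrete, here is a minimal sketch of one common way to ternarize weights: absmean rounding with a single shared scale. This is an illustration only. The report does not describe how PrismML actually trains or quantizes Ternary Bonsai, and the function names `ternarize` and `dequantize` are hypothetical.

```python
import math

def ternarize(weights):
    """Map each weight to -1, 0, or +1 and return the codes plus one shared scale.

    Assumption: absmean scaling (scale = mean of |w|). The article does not say
    which scheme PrismML uses; this is just one widely used approach.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round to the nearest integer, then clamp into the ternary range.
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [c * scale for c in codes]

# Information content of a three-valued symbol, the origin of "1.58-bit".
bits_per_weight = math.log2(3)

weights = [0.9, -0.05, -1.3, 0.02, 0.4, -0.6]  # made-up example values
codes, scale = ternarize(weights)
print(codes)                      # small weights collapse to 0
print(dequantize(codes, scale))
print(round(bits_per_weight, 3))
```

In this toy example the two near-zero weights become 0, which is the "switching off" of weak connections that a {-1, +1} scheme cannot express.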
The released Ternary Bonsai 8B weight file is only 1.75 GB and has an average benchmark score of 75.5. That is 5 points higher than PrismML's own 1-bit version, and it significantly surpasses comparable dense models such as Qwen3 in "intelligence density" (performance per GB of memory).
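The size figures can be sanity-checked with some quick arithmetic. The sketch below assumes a nominal 8 billion parameters and decimal gigabytes, neither of which the report specifies, and the helper `weight_size_gb` is hypothetical.

```python
GB = 1e9  # assumption: decimal gigabytes; the report does not say GB vs GiB

def weight_size_gb(n_params, bits_per_weight):
    """Storage needed for n_params weights at the given bit width, in GB."""
    return n_params * bits_per_weight / 8 / GB

n_params = 8e9       # assumption: nominal "8B"; the exact count is not given
reported_gb = 1.75   # weight file size from the report
avg_score = 75.5     # average benchmark score from the report

fp16_gb = weight_size_gb(n_params, 16)              # 16-bit baseline
ideal_ternary_gb = weight_size_gb(n_params, 1.585)  # pure log2(3) bits per weight
effective_bits = reported_gb * GB * 8 / n_params    # bits per weight implied by the file
compression = fp16_gb / reported_gb                 # ratio versus 16-bit
density = avg_score / reported_gb                   # benchmark points per GB

print(f"16-bit baseline:     {fp16_gb:.2f} GB")
print(f"ideal ternary:       {ideal_ternary_gb:.2f} GB")
print(f"effective bits/wt:   {effective_bits:.2f}")
print(f"compression vs fp16: {compression:.2f}x")
print(f"score per GB:        {density:.1f}")
```

Under these assumptions the reported file works out to about 1.75 bits per weight and a compression ratio slightly above 9, consistent with the "one-ninth" claim. The report does not explain the gap between 1.58 and 1.75 bits; per-group scale factors or layers kept at higher precision are plausible contributors, but that is a guess.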
Energy efficiency and speed are another core advantage of this series. On the iPhone 17 Pro Max, the 8B version runs at 27 tokens/sec, with an energy efficiency improvement of about 3 to 4 times.
For developers deploying high-performance AI on phones, laptops, and other edge devices, this means near-full-precision model intelligence at a minimal memory cost.
Currently, the Ternary Bonsai models run natively on Apple devices through the MLX framework. The model weights are distributed under the Apache 2.0 license.
(Source: BlockBeats)

Comment

OldKeyboardTraitor

· 5h ago

The three-value weighting is actually much more difficult than binarization; the presence of 0 allows for more flexible information retention, and PrismML's choice at this step is precise.

BoredInBlockspace

· 5h ago

1.75GB fits 8B parameters; in the future, local LLMs will truly become the norm.

0xLateDiner

· 5h ago

1.58-bit weights are too aggressive; the VRAM is directly reduced to one-ninth, and this compression ratio is quite impressive.

GateUser-0f33f9ef

· 5h ago

{-1,0,+1} three-value quantization, mathematical elegance in engineering has also been realized.

ProofOfSnack

· 6h ago

The name Ternary Bonsai is clever; the three values are like pruning a bonsai, simplifying by removing the unnecessary.

BerryColdWallet

· 6h ago

Running the 8B model at 27 tokens/sec on iPhone? Apple users are ecstatic

GateUser-e1cfc287

· 6h ago

The energy efficiency ratio increases by 3-4 times, and the power consumption anxiety of edge AI has been solved.

L2Mailman

· 6h ago

MLX native support, adding another piece to the Apple ecosystem closed loop

FoldedCosmosCat

· 6h ago

Open source + Apache 2.0, PrismML has opened up this pattern.
