PrismML has released the Ternary Bonsai series of language models, which use 1.58-bit weights restricted to {-1, 0, +1} and need only about one-ninth the memory of a 16-bit model. The 8B, 4B, and 1.7B sizes are open-sourced on Hugging Face, and the weights are distributed under Apache 2.0. The 8B weights are approximately 1.75 GB, with an average benchmark score of 75.5, leading among peers. On the iPhone 17 Pro Max, the 8B model runs at up to 27 tokens/sec, with a 3 to 4 times improvement in energy efficiency. The models run natively on Apple devices via the MLX framework.

MeNews

2026-05-21 00:45:33

ME News reports that on April 17 (UTC+8), according to Dongcha Beating monitoring, PrismML released the Ternary Bonsai series of language models. Using 1.58-bit ternary weights, the models cut VRAM usage to one-ninth that of a 16-bit model while maintaining high performance.

The series includes three parameter sizes: 8B, 4B, and 1.7B. They have now been open-sourced on Hugging Face and support native execution on Apple devices.

A 1.58-bit model restricts each neural network weight to one of three values: {-1, 0, +1}. The name comes from the information content of a three-valued weight, log2(3) ≈ 1.58 bits. Earlier 1-bit models pursued extreme compression with weights limited to {-1, +1}. Adding the “0” value lets the model drop redundant connections, so it can retain complex reasoning capabilities in an extremely small footprint.
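
To make the ternary idea concrete, here is a minimal sketch of how real-valued weights could be mapped to {-1, 0, +1}. It assumes absmean scaling (divide by the mean absolute weight, round, then clip), a scheme known from published ternary-weight work. The article does not say which quantization method PrismML uses, so the function names `ternarize` and `dequantize` and the scaling rule are illustrative assumptions, not the Ternary Bonsai recipe.

```python
import math

# Information content of one three-valued weight, in bits (about 1.585).
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ternarize(weights, eps=1e-8):
    """Map real-valued weights to {-1, 0, +1} plus one shared scale.

    Assumption: absmean scaling. This is NOT confirmed to be
    PrismML's method; it is only one common way to get ternary weights.
    """
    # Shared scale: mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def dequantize(ternary, scale):
    """Approximate the original weights from ternary values and the scale."""
    return [t * scale for t in ternary]


if __name__ == "__main__":
    ternary, scale = ternarize([0.9, -0.05, 0.4, -1.2])
    print(ternary)  # [1, 0, 1, -1]
    print(dequantize(ternary, scale))
```

Note how the small weight -0.05 becomes 0: that is the "removed connection" a pure {-1, +1} scheme cannot express.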

The released Ternary Bonsai 8B weight file is only 1.75 GB, and its average benchmark score reaches 75.5. That is 5 points higher than its own 1-bit version, and the model also significantly outperforms comparable dense models such as Qwen3 in “intelligence density” (performance contributed per GB of VRAM).
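
The reported figures can be checked with simple arithmetic. The sketch below assumes that "8B" means a nominal 8 billion parameters, that 1 GB is 10^9 bytes, and that "intelligence density" is the average benchmark score divided by weight size in GB. None of these definitions is spelled out in the source, and the helper `weight_memory_gb` is a hypothetical name.

```python
import math


def weight_memory_gb(num_params, bits_per_weight):
    """Weight storage in GB (10^9 bytes) for a given bit width."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS_8B = 8e9            # assumption: nominal parameter count
REPORTED_GB = 1.75         # weight file size reported in the article
REPORTED_SCORE = 75.5      # average benchmark score reported in the article

fp16_gb = weight_memory_gb(PARAMS_8B, 16)                    # 16-bit baseline
ideal_ternary_gb = weight_memory_gb(PARAMS_8B, math.log2(3)) # lower bound

# Ratio of the 16-bit baseline to the reported file size (about 9.1x).
compression_ratio = fp16_gb / REPORTED_GB

# Effective bits per weight implied by the reported file size.
effective_bits = REPORTED_GB * 1e9 * 8 / PARAMS_8B

# Assumed definition: benchmark points per GB of weights.
intelligence_density = REPORTED_SCORE / REPORTED_GB

if __name__ == "__main__":
    print(f"16-bit baseline:      {fp16_gb:.2f} GB")
    print(f"ideal ternary size:   {ideal_ternary_gb:.2f} GB")
    print(f"compression ratio:    {compression_ratio:.2f}x")
    print(f"effective bits/weight: {effective_bits:.2f}")
    print(f"points per GB:        {intelligence_density:.1f}")
```

Under these assumptions the 16-bit baseline is 16 GB, so 1.75 GB is consistent with the "one-ninth" claim. The gap between the ideal 1.58 bits and the implied 1.75 bits per weight would be expected if some tensors or scale factors are stored at higher precision, though the source does not say so.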

Energy efficiency and runtime speed are further core advantages of this series. On the iPhone 17 Pro Max, the 8B version reaches speeds of up to 27 tok/s, with an energy-efficiency improvement of about 3 to 4 times. For developers deploying high-performance AI on edge devices such as phones and laptops, this means obtaining performance close to that of full-precision models while using very little memory.

The Ternary Bonsai models already run natively on Apple devices through the MLX framework. Model weights are distributed under the Apache 2.0 license.

(Source: BlockBeats)
