PrismML just released something quite interesting: the Ternary Bonsai series of language models. What caught my attention is the drastic cut in GPU memory consumption, down to roughly one-ninth of a comparable 16-bit model. The trick is ternary weights: each weight can take only three values, -1, 0, or +1, which costs about 1.58 bits (log2 of 3) instead of 16. It sounds technical, but the idea is to strip redundant connections out of the network so reasoning improves without sacrificing performance.
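PrismML hasn't published Bonsai's exact quantization scheme, but the general 1.58-bit idea can be sketched with the well-known "absmean" ternary recipe (as used in BitNet b1.58): snap each weight to -1, 0, or +1 and keep one shared full-precision scale per tensor. The function names here are illustrative, not PrismML's API.

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Map full-precision weights to {-1, 0, +1} plus a shared scale.

    This is the generic absmean recipe, not necessarily what
    Bonsai does internally.
    """
    scale = np.abs(w).mean() + 1e-8            # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)    # snap to the three levels
    return q.astype(np.int8), float(scale)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale

# Demo: every quantized weight really is ternary.
w = np.random.randn(4, 4).astype(np.float32)
q, s = ternary_quantize(w)
print(sorted(np.unique(q)))  # subset of [-1, 0, 1]
```

Storing an int8 per weight is wasteful, of course; real deployments pack several ternary values per byte, which is where the ~1.58-bit figure comes from.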
On price and accessibility, the interesting part is that the Bonsai 8B model needs only 1.75 GB of weight storage, which makes it very practical for edge devices. Compared with heavier alternatives, the cost-benefit is clearly favorable: it averages 75.5 across benchmarks, surpassing both its 1-bit predecessor and dense models of similar size. Best of all, it runs natively on Apple devices, no awkward workarounds required.
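The headline numbers check out with back-of-the-envelope arithmetic. An 8B-parameter model at 16 bits per weight is 16 GB; pure ternary weights at log2(3) bits would be about 1.58 GB, and the reported 1.75 GB (presumably including higher-precision embeddings or metadata, which is my assumption, not a published breakdown) lands right at the "one-ninth" claim:

```python
import math

params = 8e9  # 8B parameters

# 16-bit baseline: 2 bytes per weight.
fp16_gb = params * 16 / 8 / 1e9            # 16.0 GB

# Ideal ternary packing: log2(3) ~= 1.585 bits per weight.
ternary_gb = params * math.log2(3) / 8 / 1e9   # ~1.58 GB

# Ratio against the 1.75 GB figure actually reported for Bonsai 8B.
reported_ratio = fp16_gb / 1.75            # ~9.1x, i.e. "one-ninth"

print(f"{fp16_gb:.2f} GB fp16, {ternary_gb:.2f} GB ideal ternary, "
      f"{reported_ratio:.1f}x smaller as shipped")
```

So the 1.75 GB figure sits slightly above the information-theoretic floor, which is what you'd expect once a few tensors stay in higher precision.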

In terms of speed, it reaches 27 tokens per second on an iPhone 17 Pro Max, with 3 to 4 times better energy efficiency. That's a significant leap for on-device inference. Models are available at 8B, 4B, and 1.7B parameters, all open source on Hugging Face under Apache 2.0. For developers who want capable AI without spending a fortune on infrastructure, the Bonsai models look like a pretty solid option.