Caltech Releases Open Source True 1-Bit Model Bonsai: 8B Parameters at Only 1.15GB, Achieving 44 Tokens/s on iPhone

According to 1M AI News, PrismML, an AI lab co-founded by Caltech mathematician Babak Hassibi, has emerged from stealth and released the open-source 1-bit Bonsai series of large language models. The flagship, 1-bit Bonsai 8B, has 8.2 billion parameters yet occupies only 1.15 GB of memory, roughly 14 times smaller than a comparable 16-bit model (around 16 GB). The weights are available for download on HuggingFace under the Apache 2.0 license, along with two smaller models: 4B (0.5 GB) and 1.7B (0.24 GB).

Bonsai 8B is a true end-to-end 1-bit model: the embedding layer, attention layers, MLP layers, and output head all represent weights using only +1 or -1, with no high-precision patches. PrismML claims its reasoning and language-understanding performance on standard benchmarks is comparable to that of 16-bit full-precision models. The core compression mathematics was developed by the team over several years at Caltech; the intellectual property is owned by Caltech, and PrismML is the sole exclusive licensee. The model was trained on Google TPU v4.

Measured speeds are 136 tokens/s on an M4 Pro Mac, 440 tokens/s on an RTX 4090, and roughly 44 tokens/s on an iPhone 17 Pro Max, whereas a standard 16-bit 8B model cannot be loaded on any iPhone at all. Energy consumption is about 4-5 times lower than that of 16-bit models. PrismML notes that existing hardware is not designed for 1-bit inference, so the current speed and energy gains come mainly from reduced memory usage; if hardware built specifically for 1-bit operations (which require only addition and subtraction, no multiplication) emerges in the future, efficiency could improve by another order of magnitude.

PrismML has raised $16.25 million in SAFE and seed-round financing from investors including Khosla Ventures, Cerberus Capital, and Caltech.
Vinod Khosla, founder of Khosla Ventures, stated that this is ‘not a minor iteration, but a significant technological breakthrough, a mathematical breakthrough, not just another small model.’
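The memory and "addition/subtraction only" claims above follow from simple arithmetic, sketched below. This is an illustrative back-of-the-envelope calculation, not PrismML's code; the packing scheme and the 1.15 GB figure include details (scales, metadata, activations) the sketch ignores.

```python
import numpy as np

# Why 1-bit weights shrink an 8.2B-parameter model ~16x in raw weight storage.
params = 8.2e9
fp16_gib = params * 2 / 2**30      # 16-bit: 2 bytes per weight
one_bit_gib = params / 8 / 2**30   # 1-bit: 8 weights packed per byte

print(f"16-bit weights: {fp16_gib:.2f} GiB")    # on the order of 15 GiB
print(f"1-bit weights:  {one_bit_gib:.2f} GiB") # on the order of 1 GiB

# With weights restricted to +1/-1, y = W @ x needs no multiplications:
# each output element is (sum of x where w = +1) - (sum of x where w = -1).
rng = np.random.default_rng(0)
W = rng.choice([-1, 1], size=(4, 8)).astype(np.int8)
x = rng.standard_normal(8)

y_addsub = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])
assert np.allclose(y_addsub, W @ x)
```

The add/sub identity is what PrismML's hardware remark refers to: a chip that skips the multiplier entirely could exploit ±1 weights far more directly than today's GPUs and NPUs do.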
