Google fits large models into under 1GB of memory, letting AI run smoothly on mobile devices without lag, and developers are ecstatic.

CoinNetwork
CryptoWorld News reports that Google has released Gemma 4, an ultra-lightweight model whose local runtime memory on mobile devices drops below 1GB for the first time. The model relies on quantization, which lowers the numerical precision of its weights to shrink the model while maintaining a high level of capability. Google has also optimized it for mobile chips to keep performance smooth. The new model weights have been open-sourced on Hugging Face; individual users can download and run them with Ollama or LM Studio, and mobile and web developers can deploy them quickly through supported engines.
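The report does not say which quantization scheme the model uses, so the following is only a minimal sketch of the general idea, assuming simple symmetric 8-bit quantization with a single shared scale. The function names `quantize_int8` and `dequantize_int8` and the random stand-in weights are hypothetical and are not taken from Google's release.

```python
import random
import struct


def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale (symmetric)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


# Hypothetical stand-in for a layer's weights (not real model data).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1000)]

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage cost: 4 bytes per 32-bit float versus 1 byte per 8-bit integer.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))

# Rounding to the nearest step bounds the per-weight error by half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max error {max_error:.6f}, half step {scale / 2:.6f}")
```

In this sketch storage falls by a factor of four, and each weight moves by at most half a quantization step. Production schemes typically go further, for example with lower bit widths or per-group scales, but the trade of precision for size is the same.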