Google Cloud Console now lists gemini-3.2-flash-lite-live-preview, suggesting that a lite/live variant specialized for ultra-low latency is on the way. Abacus.AI CEO Bindu Reddy has said that Gemini 3.2 Flash reaches 92% of GPT-5.5's coding and reasoning capability and that, after distillation and sparsification, its inference cost is only one-twentieth of GPT-5.5's, with most queries completing in under 200 milliseconds. The cloud API preview has surfaced ahead of schedule, and an official release is expected at Google I/O on May 20.

MarsBitNews

2026-05-17 02:55:10

According to Beating Monitoring, a base model option named gemini-3.2-flash-lite-live-preview has appeared in the model filter list in the Google Cloud Console. This is the latest sighting of the model series on an official platform, following traces found earlier this month in iOS app build packages and in AI Studio. The new option carries the suffixes lite and live, which suggests that Google is splitting out specialized versions for ultra-low-latency real-time interaction. Abacus.AI CEO Bindu Reddy previously disclosed that Gemini 3.2 Flash's coding and reasoning capabilities reach 92% of GPT-5.5's, while distillation and sparsification bring its inference cost down to one-twentieth of GPT-5.5's and keep latency below 200 milliseconds for most queries. With the cloud API surfacing early, industry insiders expect this lightweight, highly cost-effective model to be officially released at Google I/O on May 20.

Inference cost is only one-twentieth of GPT-5.5's: Gemini 3.2 real-time model appears on Google Cloud
