Google’s announcement this week of the new Gemini 3.1 Flash TTS caught my interest. Basically, they’ve turned text-to-speech into something much more sophisticated than what we’ve seen before.
The main point is that developers now have fine-grained control over how the AI speaks. It’s no longer a monotone robot reading text aloud. You can adjust the tone, speed, accent, and even the emotional expression of the voice. And the coolest part? All of this is done with natural language instructions, through so-called "audio tags." You can even change the style of delivery in the middle of a sentence.
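To make the mid-sentence style switch concrete, here is a minimal sketch of how a script with inline audio tags could be assembled before it is sent to a TTS model. The square-bracket syntax, the tag names ("whispering", "excited"), and the `Segment` and `render_script` helpers are all assumptions made for illustration; the announcement as summarized here does not spell out the tag format, and the sketch makes no API call.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """One piece of the script, optionally preceded by a style tag."""
    text: str
    tag: Optional[str] = None  # hypothetical tag name, e.g. "whispering"


def render_script(segments: List[Segment]) -> str:
    """Join segments into one string, putting each tag inline before its text.

    The "[tag] text" layout is an assumed format, not a confirmed one.
    """
    parts = []
    for seg in segments:
        text = seg.text.strip()
        if not text:
            continue  # skip empty segments so no stray tags are emitted
        parts.append(f"[{seg.tag}] {text}" if seg.tag else text)
    return " ".join(parts)


script = render_script([
    Segment("Welcome back to the show."),
    Segment("I have a secret to share.", tag="whispering"),
    Segment("We just hit one million listeners!", tag="excited"),
])
print(script)
```

The resulting string would be passed as the text input of a speech request. Keeping the tags in structured segments rather than typing them by hand makes it easier to apply the same delivery style across many scripts.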
Google has made it available in several places: the Gemini API, AI Studio (with an intuitive "director’s chair" style interface), Vertex AI for businesses, and Google Vids for Workspace users. There are three levels of control, which makes the workflow much easier.
What caught my attention was the ranking. According to Artificial Analysis, this model ranked first among TTS models with an Elo score of 1,211 and landed in the "most attractive quadrant." It supports over 70 languages and native multi-voice conversations, which opens up a lot of possibilities.
And there’s an important detail: all generated audio carries an embedded SynthID watermark that identifies it as AI-generated. That’s very relevant given the whole debate about authentic content.
For anyone working in content creation, this changes the game quite a bit. Gemini text-to-speech stops being just a conversion tool and becomes more of a programmable vocal performance engine. You can reuse vocal styles consistently across an entire product line, which used to be complicated. This evolution is worth keeping an eye on.