Like the sound of Tesla's voice? xAI officially opens the Grok voice API: TTS at $4.20 per million characters, with recognition accuracy surpassing ElevenLabs
This week xAI officially launched standalone Grok speech-to-text (STT) and text-to-speech (TTS) APIs. The same tech stack already runs in Grok Voice, Tesla vehicles, and Starlink customer service systems. STT is priced at $0.10 per hour for batch processing and $0.20 per hour for streaming, and supports more than 25 languages.
The same set of voice technologies that makes Tesla vehicles speak and Starlink customer service respond to users is now available via API. xAI announced on the 17th the launch of independent Grok speech-to-text (STT) and text-to-speech (TTS) APIs, allowing external developers to directly call this speech infrastructure already in use within xAI products.
STT: word-level timestamps + speaker diarization, batch transcription for only $0.10 per hour
According to official details, the Grok STT API offers two access modes: batch processing via a REST API and low-latency real-time streaming via a WebSocket API. Batch processing costs $0.10 per hour of audio and streaming costs $0.20 per hour. xAI claims this pricing significantly undercuts mainstream competitors such as ElevenLabs and Deepgram.
Functionally, Grok STT supports more than 25 languages and provides word-level timestamps, speaker diarization, multi-channel audio, and intelligent inverse text normalization. It targets enterprise scenarios that require high accuracy, such as meeting transcription, legal and medical records, and customer call logs.
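To make the quoted rates concrete, the following is a minimal cost-estimation sketch. It assumes billing is strictly proportional to audio duration, with no minimum charge or rounding increment; the article does not state how xAI actually meters usage, and the function name `stt_cost` is purely illustrative.

```python
from decimal import Decimal

# Per-hour prices quoted in the article (USD per hour of audio).
STT_PRICE_PER_HOUR = {
    "batch": Decimal("0.10"),      # REST API
    "streaming": Decimal("0.20"),  # WebSocket API
}


def stt_cost(audio_seconds: int, mode: str = "batch") -> Decimal:
    """Estimate the USD cost of transcribing `audio_seconds` of audio.

    Assumption: cost scales linearly with duration (no minimums).
    """
    if mode not in STT_PRICE_PER_HOUR:
        raise ValueError(f"unknown mode: {mode!r}")
    hours = Decimal(audio_seconds) / Decimal(3600)
    # Round to whole cents for display.
    return (hours * STT_PRICE_PER_HOUR[mode]).quantize(Decimal("0.01"))


thousand_hours = 1000 * 3600
batch_total = stt_cost(thousand_hours, "batch")
streaming_total = stt_cost(thousand_hours, "streaming")
```

Under these assumptions, 1,000 hours of audio would cost about $100 in batch mode and about $200 in streaming mode.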
In entity recognition benchmarks, Grok STT shows an advantage. When identifying key entities like names, accounts, and dates in phone calls, Grok STT’s error rate is 5.0%, compared to 12.0% for ElevenLabs, 13.5% for Deepgram, and 21.3% for AssemblyAI.
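As a rough way to read these figures, the sketch below expresses each competitor's error rate as a multiple of Grok STT's. The numbers are copied from the paragraph above; the helper name `relative_to` is illustrative only, and the calculation says nothing about how the benchmark itself was run.

```python
# Entity-recognition error rates (percent) as quoted in the article.
error_rates = {
    "Grok STT": 5.0,
    "ElevenLabs": 12.0,
    "Deepgram": 13.5,
    "AssemblyAI": 21.3,
}


def relative_to(baseline: str, rates: dict[str, float]) -> dict[str, float]:
    """Return each other system's error rate as a multiple of the baseline's."""
    base = rates[baseline]
    return {
        name: round(rate / base, 2)
        for name, rate in rates.items()
        if name != baseline
    }


ratios = relative_to("Grok STT", error_rates)
```

On the quoted numbers, the competitors make roughly 2.4, 2.7, and 4.3 times as many entity errors as Grok STT.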
TTS: five voice personalities + voice tags, $4.20 per million characters
Grok TTS API offers five distinct voice styles: Ara (female, warm and friendly), Eve (female, lively and proactive), Leo (male, authoritative and powerful), Rex (male, confident and clear), Sal (neutral, smooth and balanced).
The API detects the input language automatically, natively supports more than 20 languages, and accepts BCP-47 language codes to control pronunciation.
Audio output formats include MP3, WAV, PCM (Linear16), G.711 μ-law, and G.711 A-law. The latter two are common telephony codecs, which suggests xAI is positioning the API for telecom integration.
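The telephony angle is easier to see in terms of payload size. G.711 (both μ-law and A-law) carries 8 bits per sample at 8 kHz, i.e. 64 kbit/s, while Linear16 uses 16 bits per sample. The sketch below compares the raw size of one minute of mono audio; the 8 kHz rate for Linear16 is an assumption chosen for a like-for-like comparison, since the article does not state which sample rates the API emits.

```python
def raw_audio_bytes(seconds: int, sample_rate_hz: int,
                    bytes_per_sample: int, channels: int = 1) -> int:
    """Size of uncompressed or companded audio, excluding container headers."""
    return seconds * sample_rate_hz * bytes_per_sample * channels


# G.711 mu-law / A-law: 8 kHz, 1 byte per sample (standard for the codec).
g711_minute = raw_audio_bytes(60, 8000, 1)

# Linear16 PCM: 2 bytes per sample; 8 kHz is assumed here for comparison.
linear16_minute = raw_audio_bytes(60, 8000, 2)
```

At the same sample rate, G.711 halves the payload relative to Linear16, which is one reason it remains the default on phone networks.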
A key feature of the TTS API is “voice tags,” allowing developers to embed commands within text to finely control pauses, laughter, whispers, intonation emphasis, speech rate, and pitch, making synthesized speech more natural and human-like. Pricing is $4.20 per million characters.
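For budgeting purposes, the per-character price can be turned into a simple estimate. This is a sketch under a stated assumption: that every character submitted is billed at the flat rate quoted above. Whether embedded voice tags or whitespace count toward the total is not specified in the article, and `tts_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Price quoted in the article: USD per one million input characters.
TTS_PRICE_PER_MILLION_CHARS = Decimal("4.20")


def tts_cost(num_chars: int) -> Decimal:
    """Estimate the USD cost of synthesizing `num_chars` characters.

    Assumption: flat per-character billing with no minimum charge.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return Decimal(num_chars) * TTS_PRICE_PER_MILLION_CHARS / Decimal(1_000_000)


script = "Welcome aboard. Your vehicle is ready."
script_cost = tts_cost(len(script))
quarter_million_cost = tts_cost(250_000)
```

Under this assumption, 250,000 characters of text would cost about $1.05 to synthesize.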
The same tech stack powers Tesla and Starlink
xAI emphasizes that these two APIs are not entirely new technologies but are based on the same infrastructure already deployed in Grok Voice, Tesla vehicle voice interactions, and Starlink customer support systems.
This infrastructure first appeared at the end of 2025 as the Grok Voice Agent API, which provides real-time voice dialogue. It ranked first on the Big Bench Audio benchmark, with a time to first audio response under 1 second, about five times faster than competing systems.
The release of these independent STT and TTS endpoints effectively splits the integrated voice pipeline into modular components, allowing developers to assemble them as needed.
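The modular idea can be sketched as three interchangeable stages composed into one pipeline. Everything in this block is hypothetical: the type aliases, `build_voice_pipeline`, and the stand-in stages are illustrative placeholders, not xAI's API. In a real integration, the STT and TTS stages would wrap calls to the respective endpoints.

```python
from typing import Callable

Transcriber = Callable[[bytes], str]   # audio in, text out (STT stage)
Responder = Callable[[str], str]       # text in, text out (application logic)
Synthesizer = Callable[[str], bytes]   # text in, audio out (TTS stage)


def build_voice_pipeline(stt: Transcriber, respond: Responder,
                         tts: Synthesizer) -> Callable[[bytes], bytes]:
    """Compose independent stages into a single audio-to-audio function."""
    def run(audio: bytes) -> bytes:
        return tts(respond(stt(audio)))
    return run


# Stand-in stages so the sketch runs without any network access.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return text.upper()


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = build_voice_pipeline(fake_stt, fake_respond, fake_tts)
reply_audio = pipeline(b"hello")
```

Because each stage is just a callable, a developer could swap in a different transcriber or synthesizer, or use only one of the two endpoints, without touching the rest of the pipeline.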