Runway integrates voice into video agents; times are getting harder for independent TTS vendors.
Voice Embedded Directly into Video Agent, Accelerating Productization
RunwayML has quietly added custom voice capabilities to the Characters API, integrating TTS directly into real-time video Agents. Developers no longer need to wire up a separate voice service themselves.
This is a clear bundling strategy: Runway’s GWM-1 world model links “text-to-speech” with facial expression synthesis, enabling faster mass production of brand virtual avatars for customer service and game NPCs. The underlying technology uses ElevenLabs’ eleven_ttv_v3, which allows tone design via prompts and voice cloning with 10-second samples, with lip-sync and gestures automatically aligned.
An important signal to note: almost no one is discussing this on Twitter, yet the team says it was the “highest-demand” feature. An API-first release is low-key by nature: it targets people who are actively building rather than a mass audience.
Independent Voice Services Face Structural Pressure
This update positions TTS as an “infrastructure layer” rather than a standalone product. ElevenLabs provides the backend, but the bundling accelerates the trend of pure TTS being absorbed into larger platforms.
ElevenLabs v3 excels in emotional expression and technical metrics, but Runway’s “video-first” approach is the watershed: enterprises want complete Agents, not parts. Developers will naturally migrate toward full-stack multimodal platforms.
Don’t be misled by claims like “revolutionary cloning”: audio quality among mainstream vendors isn’t vastly different, and the real edge lies in integration capabilities across multimodal scenarios.
| Role | Phenomenon | Implication | Judgment |
|---|---|---|---|
| Bundling platform | Runway documentation shows ElevenLabs-driven clones with GWM-1 avatars can run real-time video | Developer focus shifts from standalone TTS to full-stack Agents, squeezing voice-only vendors | Integrated platforms have an advantage; the lock-in effect from bundling is underestimated |
| TTS specialist | ElevenLabs v3 quality is good but can’t be tied to video; market response to launch is lukewarm | Enterprises prefer one-stop API solutions, revenue from standalone TTS is being eroded | Without solving integration, the moat remains shallow |
| Enterprise procurement | 2026 TTS evaluations still cite latency and prosody as pain points; Runway’s bundling directly addresses these | Faster deployment in customer service, gaming, and other scenarios; no new regulatory hurdles seen yet | Early movers benefit, those waiting will only compete on similar features |
| Observers | Industry influencers react tepidly, but API is already live | Expectation is to anchor on real use cases, not hype | Low buzz doesn’t mean no progress; actual API usage is the key |
My view: Multimodal bundling lowers the barrier for non-professional users, giving Runway an advantage amid scattered, competing players.
From an investment perspective, the market has not fully priced in the stickiness premium of “video-first + full-stack bundling.” For enterprises, integrating with fewer vendors inherently saves both cost and effort.
In simple terms: whoever bets early on integrated video Agents will gain a first-mover advantage. Multimodal platforms benefit, while standalone TTS faces pressure. Companies that ignore the bundling trend are likely to be left reacting to it. Once “voice” becomes a default capability, deployment speed depends on API accessibility and full-chain consistency, not just single-point audio quality.
Importance: Moderate
Category: Product Launch | Industry Trend | Developer Tools
Conclusion: Product teams and enterprise buyers are currently in an “early window,” making it worthwhile to validate and enter quickly. Investors and vendors focusing solely on speech are in a “defensive period,” needing to accelerate toward multimodal and integrated capabilities. Resources will flow toward all-in-one platforms and teams capable of rapid productization; pure TTS players will have short-term disadvantages.