google just quietly became the only company that can generate text, images, video, AND music inside one app.


lyria 3 dropped today. here's why that matters more than people think.
the multimodal scoreboard right now:
Google Gemini: text ✅ images ✅ video ✅ music ✅
OpenAI: text ✅ images ✅ video ✅ music ❌ (coming)
Meta: text ✅ images ✅ video ✅ music ❌
Anthropic: text ✅ images ❌ video ❌ music ❌
google just checked every box first.
lyria 3 does text to tracks, image to tracks, and video to tracks. 30 second songs with custom lyrics. you upload a photo of your dog and it writes a song about your dog.
dumb? maybe. but that's how adoption starts.
the dedicated music AI startups should be paying attention.
suno does $200M/yr revenue. raised at $2.45B. 100M users in 2 years. udio settled with universal and warner. elevenlabs launched music gen and hit $200M ARR.
but google just made music gen free inside an app billions of people already use.
this is the bundling play that kills startups.
remember when standalone image gen apps feared dall-e getting baked into chatgpt? same playbook. the feature always beats the product when the distribution is 1000x larger.
today lyria 3 is a 30 second toy. suno gives you stems, inpainting, 15 minute tracks, vocal personas. no comparison on quality right now.
but suno had the same edge over google's first music model. then v2. then v3. the gap closes every version.
openai is building one too. partnered with juilliard students to annotate training data. new audio model reportedly coming by march.
but "coming" and "shipped" are different words. google shipped today.
openai's play is the same as google's. bundle everything into one conversation.
"make me a video about X. now add music. now write the caption."
that's the product. not a music generator. an everything generator.
the modality timeline tells the whole story:
2022: text generation (everyone scrambles)
2023: image generation (midjourney explodes)
2024: video generation (sora, runway, kling)
2025: music generation (suno hits $200M)
2026: all of it. in one app. from one prompt.
the race isn't about who has the best music AI. or the best image AI. or the best video AI.
it's about who puts them all together first in a way that feels effortless.
google just took the lead.
what i'm watching next:
does openai ship music before Q2?
does suno's revenue hold when google bundles music gen for free?
how fast does "30 second toy" become "3 minute production tool"?
the multimodal race just got a new finish line.