Midjourney meets a rival: four Google AI image-generation heavyweights launch a startup, offer Imagen-style technology to try for free, and raise 120 million yuan in seed funding

Original source: Qubit

Image source: Generated by Unbounded AI

Midjourney, long on the throne of AI image generation, finally has a strong challenger.

The newest challenger, Ideogram, appeared seemingly out of nowhere. From the start, it drew a lot of attention by letting anyone register for free.

Its most eye-catching feature is generating text accurately inside the picture. NVIDIA scientist Linxi (Jim) Fan promptly used it to draw an image reading “It’s over, Midjourney”.

The company behind it, Ideogram AI, is a startup founded by four leading Google AI image-generation researchers who left together. It is based in Toronto and arrived with US$16.5 million (about 120 million yuan) in seed funding.

The first four members of the founding team are all authors of Imagen, Google's text-to-image research paper, which makes this a top diffusion-model research team.

The advanced research that Google kept under wraps for so long, and that the public never got to try, has finally been released by them.

Ideogram AI's seed round was led by a16z and Index Ventures.

The individual investors include well-known names as well, such as OpenAI founding member Andrej Karpathy, reinforcement learning authority Pieter Abbeel, Node.js creator Ryan Dahl, GitHub co-founder Tom Preston-Werner, and others.

Even the team's old boss, former Google Brain lead Jeff Dean, took part.

Although the founding team all come from technical backgrounds, Ideogram AI is no slouch at publicity either. It called on everyone to post their derivative creations under a hashtag on 𝕏, which kicked off a wave of viral marketing.

AI learns to draw text accurately

Getting AI to draw text accurately has always been a problem. SDXL and Midjourney's new inpainting feature have improved things, but users report that the success rate is still not very high and that repeated attempts are needed.

As soon as Ideogram solved this pain point, users ran wild with it.

It has no trouble placing text on a sign while matching the ambient light and shadow.

Latte art works too.

For abstract-style posters, it also comes up with fonts that suit the style.

A brand logo takes just a one-sentence prompt, which gives the tool real productivity value.

The prompts shared by users also show that the “spell” for raising the odds of drawing text successfully is very simple, just one word:

typography (printing typesetting)
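As a rough illustration of the tip above, here is a minimal Python sketch that appends the keyword to a prompt. The helper name `with_text_hint` and the `with the text "..."` phrasing are hypothetical and chosen only for this example; the article states only that adding the single word helps, and this is not Ideogram's API or a documented prompt format.

```python
def with_text_hint(prompt: str, text: str, keyword: str = "typography") -> str:
    """Build a prompt that asks for `text` in the image and ends with the keyword.

    The phrasing 'with the text "..."' is an assumption made for illustration.
    """
    parts = [prompt.strip().rstrip(",")]
    parts.append(f'with the text "{text}"')
    # Only append the keyword if the prompt does not already contain it.
    if keyword.lower() not in prompt.lower():
        parts.append(keyword)
    return ", ".join(parts)


print(with_text_hint("a neon sign on a rainy street", "It's over, Midjourney"))
```

The resulting string would then be pasted into the tool's prompt box by hand.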

Unfortunately, it is not yet very good at Chinese text.

Aside from text, Ideogram's image generation capability and output quality are comparable to Midjourney and Stable Diffusion.

If it uses exactly the same technology as Imagen, then using Google's T5 rather than OpenAI's CLIP as the text encoder would mean that Ideogram has a stronger understanding of spatial relationships described in the prompt.

Someone has successfully used it to generate a set of images with a consistent style.

Combined with the video generation tool Pika Labs, it can directly produce short films in the style of movie trailers.

A top diffusion-model research team

The founding team of Ideogram AI consists of 7 people, 4 of whom are co-authors of Google Imagen.

Among them, co-author Mohammad Norouzi is the CEO. He received the Google ML Ph.D. Scholarship while pursuing his Ph.D. in Computer Science at the University of Toronto.

After graduating, he spent 7 years at Google Brain. Besides generative models, he was an original member of the Google Neural Machine Translation team and a co-author of SimCLR, the self-supervised contrastive learning framework from Hinton's team.

Co-author William Chan (Chen Junle) is the CTO of the new company. He studied at the University of Waterloo in Canada and at Carnegie Mellon University.

After joining Google in 2012, he first worked on a machine learning project for advertising, then moved to Google Brain to do NLP research.

The third co-author, Chitwan Saharia, graduated from the Indian Institute of Technology Bombay, joined Google in 2019, and is now a co-founder of Ideogram.

The fourth co-founder, Dr. Jonathan Ho, graduated from UC Berkeley, worked at OpenAI for a year, and then joined Google.

In addition to being a core contributor to the Imagen paper, he is also the first author of “Denoising Diffusion Probabilistic Models”, the foundational work on denoising diffusion models. Pieter Abbeel, one of the co-authors of that paper, is also an investor in Ideogram AI.

Of the other three people on the founding team, Shayaan Abdullah was a machine learning engineer at Twitter; he left in April this year and then joined Ideogram AI.

Jacob Lu is a software engineer who worked at Amazon and other companies before joining Ideogram, and Jenny Lei was a software engineering intern at Google before joining Ideogram AI.

Video generation is still to come

While at Google, the four co-founders of Ideogram AI also completed Imagen Video, the follow-up work on video generation.

A year ago, it could already generate high-definition video clips at 1280×768 resolution and 24 frames per second.

In fact, back in March this year, Qubit learned from the investment community that their angel-round valuation had reached US$100 million and that more VCs wanted to put money in but were too late to invest. Qubit also learned more about the direction of the startup:

**Not only image generation, but also video generation in the future.**

For both Imagen and Imagen Video, Google never released a demo, an API, or open-source code, citing safety and ethical considerations.

Research results that cannot be turned into applications are a common problem cited by many entrepreneurs who have left Google in recent years.

On the large-model side, for example, Cohere founder Aidan Gomez, one of the eight authors of the Transformer paper, once said that he left because “I didn’t see the real power of the big model at Google”.

Ashish Vaswani and Niki Parmar, who left Google to start Adept AI and Essential AI, gave a similar reason: “Google wants to use Transformer to optimize existing products, and we want to create new products”.

Later, what these researchers feared did indeed happen:

By May 2021 (before ChatGPT's training data cutoff), Google had already developed the LaMDA dialogue model and a chatbot, but it had too many concerns about launching a product. In the end, **18 months later, ChatGPT from next door opened directly to the public** and stole the limelight.

……

Having learned these lessons, the newly established Ideogram AI has chosen to be as open as possible and to get users trying the product first.

A testing quota of 1,000 people was initially announced, but it filled up in no time.

More places appear to have been opened today, and Qubit did not encounter a queue when registering this morning.

In short, the number of seats should still be limited, and those who are interested should hurry up.

Trial address:

