Demis Hassabis, CEO of Google DeepMind and Nobel laureate in Chemistry, visited Y Combinator to discuss key progress toward AGI, how entrepreneurs can stay ahead, and where the next major scientific breakthrough might occur.

His most practical point for deep tech entrepreneurs: if you start a ten-year deep tech project today, you must factor the emergence of AGI into your planning. He also revealed that Isomorphic Labs (a biotech AI spin-off from DeepMind) will have major news soon.

Highlights and Quotes

AGI Roadmap and Timeline

·「These existing technological components will almost certainly become part of the final AGI architecture.」

·「Problems like continual learning, long-term reasoning, and aspects of memory are still unsolved; AGI needs to master all of these.」

·「If your AGI timeline is around 2030 like mine, and you start a deep tech project today, you must consider that AGI might appear halfway through.」

Memory and Context Windows

·「The context window roughly equals working memory. Humans have an average of about seven items in working memory, while we have context windows of hundreds of thousands or even millions of tokens. But the problem is we stuff everything in, including unimportant or incorrect information, which is quite crude.」

·「Processing real-time video streams and storing all tokens means a million tokens are only enough for about 20 minutes.」

Limitations of Reasoning

·「I like playing chess with Gemini. Sometimes it realizes a move is bad but can’t find a better one, so it circles back and ends up making that bad move anyway. A precise reasoning system shouldn’t have this problem.」

·「It can solve IMO gold medal problems but makes elementary school math mistakes when the question is phrased differently. Something seems to be missing in its ability to introspect on its own thinking.」

Agents and Creativity

·「To achieve AGI, you need a system that can proactively solve problems for you. Agents are the way forward, and I think we are just getting started.」

·「I haven’t seen anyone use vibe coding to create a chart-topping AAA game. With current tools, it should be possible, but it hasn’t happened yet. This suggests something is missing in tools or processes.」

Distillation and Small Models

·「Our hypothesis is that a cutting-edge Pro model can be compressed, within six months to a year of its release, into a very small model that runs on edge devices. We haven’t yet hit the information-theoretic density limit.」

Scientific Discovery and the “Einstein Test”

·「Sometimes I call it the ‘Einstein Test,’ which is whether you can train a system with knowledge from 1901 and have it independently derive Einstein’s 1905 results, including special relativity. If it can do that, these systems are not far from inventing truly new things.」

·「Solving a Millennium Prize problem is already impressive. But even more difficult is proposing a new set of Millennium Prize problems that top mathematicians consider equally profound and worth a lifetime of research.」

Deep Tech Entrepreneurship Advice

·「Pursuing a hard problem takes about as much effort as pursuing a simple one; they are just difficult in different ways. Life is short; it’s better to focus your energy on things that no one else will do if you don’t.」

Pathways to AGI

Garry Tan: You’ve been thinking about AGI longer than almost anyone. Under the current paradigm, how much of the final AGI architecture do you think we already have? What is fundamentally missing right now?

Demis Hassabis: Large-scale pretraining, RLHF, and chain of thought: I’m quite sure these will be part of the final AGI architecture. These techniques have already proven a lot. I can’t imagine that in two years we’ll find they were dead ends; that doesn’t make sense to me. But on top of what we have, maybe one or two more things are needed. Continual learning, long-term reasoning, and certain aspects of memory are still unsolved.

AGI needs all of these solved. Maybe existing techniques plus some incremental innovations will scale to that level, or maybe one or two key breakthroughs are still needed. I’d put the odds that a critical breakthrough is still missing at about fifty-fifty. So at DeepMind, we’re pushing on both fronts.

Garry Tan: I work with many agent systems, and what strikes me most is that the underlying weights are often the same across different instances. So the concept of continual learning is very interesting, because right now we’re basically holding things together with duct tape, like those “dream cycle” ideas.

Demis Hassabis: Exactly, those dream cycles are pretty cool. We’ve thought about this in the context of integrating episodic memory. My PhD research was on how the hippocampus elegantly incorporates new knowledge into existing schemas. The brain does this very well.

It does this during sleep, especially REM sleep, replaying important experiences to learn from them. Our earliest Atari program, DQN (DeepMind’s 2013 deep Q-network, the first to use deep reinforcement learning to reach human-level performance in Atari games), mastered Atari using experience replay.

This concept, learned from neuroscience, involves repeatedly replaying successful paths. That was 2013, quite ancient in AI terms, but it was crucial then.
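
As a rough illustration of the experience replay idea mentioned here, the sketch below stores past transitions in a fixed-size buffer and samples them at random for training. The class name `ReplayBuffer`, its methods, and the toy transitions are assumptions made for illustration, not DeepMind’s code. It samples uniformly from everything stored, as the published DQN description did, rather than replaying only successful paths.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=None):
        # A bounded deque silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Toy usage: five steps go into a buffer that only keeps the latest three.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

batch = buffer.sample(batch_size=2)
```

A learning agent would call `sample` on every training step, so each stored experience can be learned from many times instead of once.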

I agree: right now we’re basically holding things together with duct tape, stuffing everything into the context window. It doesn’t feel quite right. Even though these are machines, not biological brains, and could in theory have a million or ten million tokens of context and perfect memory, retrieval still has a cost. Finding the truly relevant information at the moment of decision isn’t easy, even if you can store everything. So I believe there’s huge room for innovation in memory systems.

Garry Tan: Honestly, a million-token context window is bigger than I expected, and it enables many things.

Demis Hassabis: For most scenarios, yes, it’s large enough. But think of the context window as roughly equivalent to working memory. Humans hold about seven items in working memory on average, while we have context windows of hundreds of thousands or even millions of tokens. The problem is that we stuff everything in, including unimportant or incorrect information, which is quite crude. And if you process a real-time video stream and keep every token, a million tokens only cover about 20 minutes. If you want the system to understand one or two months of your life, that is nowhere near enough.
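
To put the 20-minute figure in perspective, here is a back-of-the-envelope calculation. It takes the quoted ratio (one million tokens per 20 minutes of video) at face value and assumes, purely for illustration, a constant token rate and round-the-clock recording; the constant and function names are made up for this sketch.

```python
# Quoted in the interview: about 1,000,000 tokens cover about 20 minutes of video.
TOKENS_PER_WINDOW = 1_000_000
MINUTES_PER_WINDOW = 20

MINUTES_PER_DAY = 24 * 60


def tokens_for_days(days):
    """Tokens needed to keep every video token for `days` of continuous footage."""
    return TOKENS_PER_WINDOW * MINUTES_PER_DAY * days // MINUTES_PER_WINDOW


tokens_per_second = TOKENS_PER_WINDOW / (MINUTES_PER_WINDOW * 60)  # roughly 833
one_month = tokens_for_days(30)    # 2,160,000,000 tokens
two_months = tokens_for_days(60)   # 4,320,000,000 tokens
windows_needed = one_month // TOKENS_PER_WINDOW  # 2,160 million-token windows
```

Under these assumptions, a single month of footage would need on the order of two billion tokens, which is why storing everything in context does not scale to that horizon.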

Garry Tan: DeepMind has always invested heavily in reinforcement learning and search. How deeply is this philosophy embedded in your current development of Gemini? Is RL still underestimated?

Demis Hassabis: It probably is underestimated. Attention to RL has fluctuated over the years. From day one at DeepMind, we’ve been working on agent systems. All the Atari and AlphaGo work is essentially reinforcement learning agents: systems capable of autonomously pursuing goals, making decisions, and planning. We chose games because their complexity is controllable, then gradually moved to more complex games, like AlphaStar after AlphaGo, covering most of what’s possible in game environments.

The next question is whether these models can generalize into world models or language models, not just game models. We’ve been working on this for years. Today’s leading models’ reasoning patterns and chain-of-thought are essentially a re-derivation of what AlphaGo pioneered.

I think much of what we did back then is highly relevant today. We’re re-examining those old ideas, scaling them up, and making them more general, including Monte Carlo tree search and various reinforcement learning methods. The ideas behind AlphaGo and AlphaZero are highly relevant to today’s foundation models, and I believe much of the progress in the next few years will come from here.

Distillation and Small Models

Garry Tan: Right now, getting smarter models means building bigger ones, but distillation techniques are also improving, making small models quite fast. Your Flash models are very strong, reaching about 95% of the performance of state-of-the-art models at a tenth of the cost. Is that correct?

Demis Hassabis: I think that’s one of our core advantages. You need to build the largest models first to gain frontier capabilities. One of our biggest strengths is quickly distilling and compressing those capabilities into smaller models. We invented distillation ourselves, and we’re still among the best in the world at it. Plus, we have a strong business motivation to do this. We’re probably the largest AI application platform globally.

With AI Overviews, AI Mode, and Gemini, every Google product (Maps, YouTube, and so on) is integrating Gemini or related technology. These products serve hundreds of millions or billions of users, so the models must be extremely fast, efficient, low-cost, and low-latency. This drives us to make Flash and the smaller Flash-Lite models highly efficient so they can serve a wide range of user needs.
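
For readers unfamiliar with distillation, the sketch below shows the classic soft-target formulation from the published literature: a small "student" is trained to match the temperature-softened output distribution of a large "teacher". The interview does not say how the Flash models are actually distilled, so the function names, the temperature value, and the toy logits are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (scaling factors omitted)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy logits over three classes (illustrative values only).
teacher = [4.0, 1.0, 0.5]
loss_when_matching = distillation_loss(teacher, teacher)
loss_when_different = distillation_loss(teacher, [0.5, 1.0, 4.0])
```

The loss is zero when the student reproduces the teacher’s distribution and grows as the two diverge; training the student means minimizing it over many inputs.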

Garry Tan: I’m curious how smart these small models can get. Is there an upper limit to distillation? Can 50B or 400B models be as smart as today’s largest frontier models?

Demis Hassabis: I don’t think we’ve hit the information-theoretic limit yet; at least, no one knows if we have. Maybe someday we’ll reach a density ceiling, but our current hypothesis is that a frontier Pro model released today can be compressed within six months to a year into a very small model that can nearly run on edge devices.

You can see this in the Gemma models: our Gemma 4 performs very strongly against models of similar size. This relies heavily on distillation and efficiency optimization techniques. So I really see no fundamental theoretical limit yet; we’re still far from it.

Garry Tan: Currently, there’s an astonishing phenomenon where engineers can do 500 to 1,000 times the work they could six months ago. Some people in this room are producing a thousand times what a Google engineer did in the 2000s.

Demis Hassabis: I find that exciting. Small models have many uses. One is lower cost and faster iteration, which benefits coding and other tasks, especially when you are collaborating with the system. A fast system is enough even if it is not top-tier, say 90-95% of the frontier, because the speed advantage outweighs the small performance gap.

Another major direction is running these models on edge devices, not just for efficiency but also for privacy and security. Think of devices handling highly private data or robots. For your home robot, you’d want a local, efficient, powerful model, only delegating specific tasks to cloud-based large models when necessary. Processing audio and video streams locally, keeping data on device, could be the ultimate scenario.

Memory and Reasoning

Garry Tan: Back to context and memory. Currently, models are stateless. If they gain continual learning, what would the developer experience be like? How would you guide such models?

Demis Hassabis: That’s a very interesting question. The lack of continual learning is a key bottleneck preventing agents from completing full tasks. Today’s agents are useful for parts of tasks, and you can combine them to do cool things, but they can’t adapt well to your specific environment. That’s why they can’t yet be truly “fire and forget.” They need to learn your specific context. To reach full general intelligence, this problem must be solved.

Garry Tan: How far along are we in reasoning? The current chain-of-thought models are strong, but they still make mistakes that a smart undergraduate wouldn’t. What needs to change? What progress do you expect in reasoning?

Demis Hassabis: There’s still a lot of room for innovation in thinking paradigms. What we’re doing is still quite rough and brute-force. There are many improvements possible, like monitoring the thinking process and intervening mid-thought. I often feel that both our systems and competitors’ systems tend to overthink and get stuck in loops.

I like watching Gemini play chess. All leading foundation models are quite weak at chess, which is interesting.

Watching their thought trajectories is valuable because chess is a well-understood domain. I can quickly tell if they’re going off track or if their reasoning is effective. Sometimes they consider a move, realize it’s bad, but can’t find a better one, so they circle back and make that bad move. A precise reasoning system shouldn’t do that.

This huge gap still exists, but fixing it might only require one or two adjustments. That’s why you see so-called “jagged intelligence”: these systems can solve IMO gold medal problems but make elementary math errors when the question is phrased differently. Something still seems to be missing in their ability to introspect on their own reasoning.

The True Capabilities of Agents

Garry Tan: Agents are a big topic. Some say it’s hype. I personally think we’re just at the beginning. What’s DeepMind’s real assessment of agent capabilities, and how big is the gap with public perception?

Demis Hassabis: I agree—we’re just starting. To reach AGI, you need a system that can proactively solve problems for you. That’s always been clear to us. Agents are the way forward, and I think we’re just getting started.

Everyone is exploring how to better integrate agents into workflows. We’ve done a lot of personal experiments, and many here probably have too. How to make agents part of the workflow, not just a toy but truly transformative? We’re still in the experimental stage. Only in the last two or three months have we started to find particularly valuable scenarios. The technology is just reaching that point, no longer just a demo but actually adding value to your time and efficiency.

I often see people launching dozens of agents running for dozens of hours, but I’m not sure if the output justifies the effort.

We haven’t yet seen anyone use vibe coding to create a AAA game that tops the app store charts. I’ve done some prototypes myself, and many here have made good demos. I can now make a “Theme Park” prototype in half an hour, whereas it took me six months when I was 17.

I have a feeling that if you spend an entire summer, you could create something truly incredible. But it still requires craftsmanship, human soul, and taste—you must bring these into whatever product you build. In fact, no kid has yet made a blockbuster game selling ten million copies, but with current tools, that should be possible. Something is still missing—maybe process, maybe tools. I expect in the next 6 to 12 months, we’ll see such results.

Garry Tan: To what extent will that be fully automated? I don’t think it will be fully automatic from the start. The more likely path is that people first achieve 1000x efficiency, then some use these tools to make best-selling apps and games, and only later will more steps be automated.

Demis Hassabis: Exactly, that’s what you should expect first.

Garry Tan: Part of it is that some people are already doing this, but they’re reluctant to say how much agents helped.

Demis Hassabis: Maybe. But I want to talk about creativity. I often cite AlphaGo’s famous move 37. I had been waiting for that moment, and once it happened, I started projects like AlphaFold. We began working on AlphaFold the day after returning from Seoul, ten years ago. I went to Korea to celebrate AlphaGo’s tenth anniversary.

But making move 37 alone isn’t enough. It’s cool and useful, but can the system invent the game of Go itself? If you give it a high-level description, like “a game that can be learned in five minutes but not mastered in a lifetime, aesthetically elegant, and playable in an afternoon,” would the system come back with Go?

Garry Tan: Maybe someone here can do that.

Demis Hassabis: If someone does, then the answer isn’t that the system is missing something, but that we’re using it wrong. Maybe that’s the right answer. Perhaps today’s systems already have that capability, just needing a highly talented creator to drive it, infusing the project with soul, and working in close harmony with the tools. If you immerse yourself in these tools and have deep creativity, maybe you can create something beyond imagination.

Open Source and Multimodal Models

Garry Tan: Switching topics to open source. The recent Gemma release lets very powerful models run locally. What’s your view? Will AI become something users control themselves, rather than mainly staying in the cloud? Will this change who can build products with these models?

Demis Hassabis: We are strong supporters of open source and open science. For example, we fully open-sourced AlphaFold. Our scientific work is still published in top journals. With Gemma, we aim to create world-leading models at similar scale. So far, Gemma has been downloaded about 40 million times in just two and a half weeks.

I also think having Western tech stacks in open source is important. Chinese open source models are excellent and currently leading in open source. We believe Gemma is very competitive at similar scale.

For us, resource constraints are a big issue: no one has spare compute to develop two full-scale frontier models. Our current decision is that the edge models for Android, glasses, robots, and so on should ideally be open models, because once deployed on devices they are exposed anyway, so it makes sense to open them fully. We’ve unified our open strategy at the Nano scale, which aligns with our strategic goals.

Garry Tan: Before the presentation, I demoed my AI operating system, where I could interact with Gemini via voice. I was nervous about showing it, but it worked. Gemini has been multimodal from the start. I’ve used many models, but the deep integration of voice interaction, tool invocation, and contextual understanding in Gemini is unmatched.

Demis Hassabis: Exactly. One underappreciated advantage of Gemini is that we built it to be multimodal from the start. This made initial development harder than working with text alone, but we believed it would pay off in the long term, and we’re already seeing results.

For example, in world modeling, we built Genie (DeepMind’s generative interactive environment model) on top of Gemini. In robotics, Gemini Robotics will be based on multimodal foundation models, creating a competitive moat. We’re also increasingly using Gemini in Waymo (Alphabet’s autonomous driving company).

Imagine a digital assistant that follows you into the real world, perhaps on your phone or glasses, understanding your physical environment. Our system is very strong in this area. We will continue investing here, and I believe our lead in these kinds of problems is significant.

Garry Tan: Inference costs are dropping rapidly. When inference becomes nearly free, what becomes possible? Will your team’s focus shift because of this?

Demis Hassabis: I’m not sure inference will truly become free; Jevons’ Paradox (efficiency improvements leading to increased total consumption) still applies. I think everyone will eventually use all the compute they can get.

Imagine millions of agents collaborating, or a small group thinking along multiple paths and integrating the results. We’re experimenting with these directions, and all of them consume inference compute.

In terms of energy, if we solve problems like controlled nuclear fusion, room-temperature superconductors, and optimal batteries, I believe materials science can get us close to zero energy costs. But the physical manufacturing of chips will remain a bottleneck, at least for decades. So inference will still be rationed and will need to be used efficiently.

The Next Scientific Breakthrough

Garry Tan: Fortunately, small models are getting smarter. Many founders in biotech and related fields are here. AlphaFold 3 has gone beyond proteins, extending to a broader range of biomolecules. How far are we from modeling complete cellular systems? Is this a fundamentally different level of difficulty?

Demis Hassabis: Isomorphic Labs is making great progress. AlphaFold is just one step in drug discovery. We’re working on adjacent biochemical research, designing compounds with correct properties, and will soon have major releases.

Our ultimate goal is to create a full virtual cell—a comprehensive, perturbable cell simulator that produces results close to experimental data and has practical applications. It can skip many search steps, generate synthetic data to train other models, and predict real cell behavior.

I estimate about ten years to a complete virtual cell. We’re starting from the nucleus, which is relatively self-contained. The key is to find a complex enough slice that is self-contained and can be reasonably approximated in input and output, then focus on that subsystem. The nucleus is a good candidate.

Another challenge is data scarcity. I’ve spoken with top scientists in electron microscopy and other imaging tech. If we could image live cells without killing them, that would be revolutionary—turning it into a visual problem we know how to solve.

But currently, no technology can image live cells at nanometer resolution without damaging them. Static images at that resolution are very detailed but not enough to turn into a visual reasoning problem.

So there are two paths: hardware and data-driven solutions, or building better learnable simulators to model these dynamic systems.

Garry Tan: You’re not only looking at biology. There is also materials science, drug discovery, climate modeling, and mathematics. If you had to rank them, which scientific field will be most thoroughly transformed in the next five years?

Demis Hassabis: Every field is exciting—that’s why I’ve been passionate about AI for over 30 years. I see AI as the ultimate scientific tool for advancing understanding, discovery, medicine, and our knowledge of the universe.

Our initial mission statement was twofold: first, solve intelligence—build AGI; second, use it to solve everything else. We later refined this because some asked, “Do you really mean to solve all problems?”

We do. That’s what it means. Specifically, I mean solving what I call “root node problems” in science—those breakthroughs that unlock entirely new branches of discovery. AlphaFold is a prototype of what we want to do.

Over three million researchers worldwide, almost every biologist, now use AlphaFold. I hear from pharma executives that nearly every new drug discovery will involve AlphaFold at some stage. We’re proud of this impact, and it’s just the beginning.

I can’t think of any scientific or engineering field where AI can’t help. The fields you mentioned have yet to reach their “AlphaFold moment”: they are promising, but the big challenges remain. In the next two years, we’ll see progress across all these areas, from materials science to mathematics.

Garry Tan: It feels like Prometheus giving humanity a new power.

Demis Hassabis: Exactly. And as the myth warns, we must be cautious about how this power is used, where it’s applied, and the risks of misuse of the same tools.

Success Stories

Garry Tan: Many here are trying to start companies applying AI to science. In your view, what’s the difference between truly cutting-edge startups and those just wrapping an API around a foundation model and calling it “AI for Science”?

Demis Hassabis: I imagine if I were in your shoes, looking at Y Combinator projects, I’d consider how to predict the future of AI. It’s hard. But I believe combining AI’s trajectory with other deep tech fields offers huge opportunities. The intersection—whether in materials, medicine, or other tough sciences—especially at the atomic level, will have no shortcuts in the foreseeable future. These fields won’t be overtaken just by the next model update. If you want a resilient direction, that’s what I’d recommend.

I’ve always favored deep tech. Real, lasting value comes from hard problems. I’ve been attracted to deep tech since early days. When we started in 2010, AI was considered a niche with little hope—investors said “we know it won’t work,” and academia thought it was a failed 90s experiment.

But if you believe in your idea, know why this time is different, and have the right background (ideally you’re an expert in both machine learning and the application domain, or can assemble such a founding team), there’s enormous impact and value to be created.

Garry Tan: That’s very important. Once something works, it seems obvious, but everyone was against it before.

Demis Hassabis: Exactly. You must pursue what you’re truly passionate about. For me, I’ll keep doing AI no matter what. Since I was young, I decided it’s the most impactful thing I can do. It’s proven right, but maybe I was early—by 50 years.

And it’s also the most interesting thing I can think of. Even if today we’re still in a garage and AI isn’t fully realized, I’d keep pushing forward. Maybe I’d return to academia, but I’d find a way to continue.

Garry Tan: AlphaFold is an example of a direction you pursued and bet right on. What makes a scientific field suitable for breakthroughs like AlphaFold? Are there patterns, like certain objective functions?

Demis Hassabis: I should write this down someday. From AlphaGo and AlphaFold and all Alpha projects, I’ve learned that our current tech works best when:

First, the problem has a huge combinatorial search space. Bigger is better: so large that brute force or special-purpose algorithms can’t solve it. The move space in Go and the conformational space of proteins far exceed the number of atoms in the universe. Second, the objective function is clearly defined, like minimizing free energy in proteins or winning in Go, so the system can optimize via gradient ascent. Third, there’s enough data or a simulator that can generate large amounts of synthetic data within the distribution.

If these three conditions are met, current methods can go far enough to find that “needle in the haystack.” Drug discovery follows the same logic: if a compound can treat a disease without side effects, and physics allows it, the only challenge is how to find it efficiently. I believe AlphaFold proved that such systems can find these needles in vast search spaces.

Garry Tan: I want to elevate the discussion. We’ve created AlphaFold with human effort, but there’s also a meta-level: humans using AI to explore the hypothesis space. How far are we from AI systems doing real scientific reasoning (not just pattern matching in data)?

Demis Hassabis: I think it’s very close. We’re building such general systems. We have an AI co-scientist, and algorithms like AlphaEvolve that go beyond basic Gemini. All top labs are exploring this.

But so far, I haven’t seen a truly major scientific discovery made solely by these systems. I believe it’s coming soon. It might come down to creativity, to breaking known boundaries. At that level, it’s no longer just pattern matching, because there are no patterns to match. It’s more like analogical reasoning, and I think these systems currently lack that, or we’re not using them correctly.

A standard I often mention in science is: can it generate a truly interesting hypothesis, not just verify one? Verifying a hypothesis can itself be a big breakthrough, like proving the Riemann Hypothesis or solving another Millennium Prize problem, and we may be only a few years away from that.

Even harder is proposing a new set of Millennium Prize problems that top mathematicians consider equally profound and worth a lifetime of research. I think that’s an order of magnitude more difficult, and we don’t yet know how to do it. But I don’t see it as magic. I believe these systems will eventually do it, maybe missing one or two pieces.

A way to test this is what I call the “Einstein Test”: can you train a system with knowledge from 1901 and have it independently derive Einstein’s 1905 results, including special relativity? I think we should run this test repeatedly, see when it’s achievable. Once it is, these systems are not far from inventing truly new things.

Entrepreneurship Advice

Garry Tan: Final question. Many here have deep technical backgrounds and want to build something at your scale. You lead one of the world’s leading AI research organizations. From your front-line experience with AGI research, what’s one thing you know now that you wish you had known at 25?

Demis Hassabis: We’ve actually touched on part of this. You’ll find that pursuing a hard problem takes about as much effort as pursuing a simple one; they are just difficult in different ways. Life is short; it’s better to focus your energy on things that no one else will do if you don’t.

Also, I think cross-disciplinary combinations will become more common in the coming years. AI will make crossing fields easier.

Finally, it depends on your AGI timeline. Mine is around 2030. If you start a deep tech project today, it’s usually a ten-year journey. You must include the possibility of AGI appearing midway. What does that mean? Not necessarily bad, but you must consider it. Can your project leverage AGI? How will AGI systems interact with your project?

Referring back to AlphaFold and general AI systems, I foresee a scenario where general systems like Gemini or Claude call upon specialized systems like AlphaFold as tools. I don’t think we’ll put everything into a single giant system.
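
The "general system calling specialized tools" pattern described here can be sketched as a simple dispatch table. Everything in this example is hypothetical: the tool names, the stub functions, and the routing rule are placeholders chosen for illustration and do not correspond to any real Gemini, Claude, or AlphaFold interface.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for specialized systems; they return placeholder text only.
def predict_structure(sequence: str) -> str:
    return f"<predicted structure for {len(sequence)}-residue sequence>"


def evaluate_position(position: str) -> str:
    return f"<evaluation of position {position}>"


# Registry mapping a task type to the specialized tool that handles it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "protein_structure": predict_structure,
    "chess_position": evaluate_position,
}


def general_system(task: str, payload: str) -> str:
    """Route a task to a specialized tool when one exists, else handle it directly."""
    tool = TOOLS.get(task)
    if tool is None:
        return f"<general model answers '{task}' itself>"
    return tool(payload)


folded = general_system("protein_structure", "MKV")
fallback = general_system("summarize", "some text")
```

The design point is that the general model only has to recognize which tool fits the task; the specialized capability stays in a separate system rather than being absorbed into one giant model.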
