Vitalik on "AI 2027": Will super AI really destroy humanity?

Ethereum is on the rise, but Vitalik seems more concerned about the threat of super AI.

Written by: Vitalik Buterin

Compiled by: Luffy, Foresight News

In April of this year, Daniel Kokotajlo, Scott Alexander, and others released a report titled "AI 2027," depicting "our best guess of the impact of superhuman AI over the next 5 years." They predict that by 2027, superhuman AI will be born, and the future of all human civilization will depend on the outcomes of AI development: by 2030, we will either usher in a utopia (from the American perspective) or head towards complete destruction (from the perspective of all humanity).

In the months that followed, a wide range of responses debated how likely this scenario is. Most critical responses focused on the problem of "the timeline being too fast": will AI progress really continue, and even accelerate, as quickly as Kokotajlo and his co-authors suggest? This debate has been running in the AI field for several years, and many are deeply skeptical that superhuman AI will arrive so quickly. In recent years, the length of tasks that AI can complete autonomously has roughly doubled every 7 months. If that trend continues, it will take until the mid-2030s for AI to autonomously complete tasks equivalent to an entire human career. That is still rapid progress, but far later than 2027.

Those who expect longer timelines tend to argue that "interpolation / pattern matching" (what today's large language models do) is fundamentally different from "extrapolation / genuine original thought" (which, so far, only humans can do). Automating the latter may require technologies we have not yet mastered, or do not even know how to begin building. Perhaps we are simply repeating the mistake made when calculators were first mass-adopted: wrongly assuming that, because we have quickly automated one important kind of cognition, everything else will soon follow.

This article will not weigh in directly on the timeline dispute, nor on the (very important) debate over whether superintelligent AI is inherently dangerous. That said, I should note that I personally expect the timeline to run longer than 2027, and the longer the timeline, the more persuasive the arguments I make here become. Overall, this article offers a criticism from a different angle:

The scenario in "AI 2027" contains an implicit assumption: the capabilities of the leading AI ("Agent-5" and the subsequent "Consensus-1") rapidly advance to the point of possessing god-like economic and destructive power, while everyone else's (economic and defensive) capabilities essentially stagnate. This contradicts the scenario's own acknowledgment that "even in the pessimistic world, by 2029 we can expect to cure cancer, slow aging, and even achieve mind uploading."

In this article, I will describe countermeasures that readers may find technically feasible but impractical to deploy in the real world on a short timeline. In most cases, I agree. However, the "AI 2027" scenario does not assume today's world: it assumes that within 4 years (or whatever timeline could plausibly lead to destruction), technology will advance to give humanity capabilities far beyond what we have now. So let's explore what happens when, instead of just one side getting AI superpowers, both sides do.

A biological apocalypse is far less of a foregone conclusion than the scenario depicts.

Let's zoom in on the "race" scenario (the one where everyone dies because the United States is too obsessed with defeating China to care about human safety). Here is the part where everyone dies:

"For about three months, Consensus-1 expanded around humanity, transforming grasslands and ice fields into factories and solar panels. Eventually, it deemed the remaining humans too much of a nuisance: by mid-2030, AI released over a dozen quietly spreading biological weapons in major cities, silently infecting almost everyone, and then triggered lethal effects with chemical sprays. Most people died within hours; a few survivors (such as doomsday responders in bunkers and sailors on submarines) were eliminated by drones. Robots scanned the victims' brains, storing copies in memory for future research or revival."

Let's analyze this scenario. Even today, several technologies under development could make this kind of "clean and decisive victory" for the AI far less plausible:

  • Air filtering, ventilation systems, and ultraviolet light can significantly reduce the transmission rate of airborne diseases;
  • Two forms of real-time passive detection: passively detecting infection in people and notifying them within hours, and rapid environmental testing for the sequences of unknown new viruses;
  • Various ways of enhancing and priming the immune system that are more effective, safer, more universal, and easier to produce locally than the COVID-19 vaccines, enabling the body to resist both natural and engineered epidemics. Humans evolved in an environment where the global population was only 8 million and we spent most of our time outdoors, so intuitively there should be easy wins in adapting to today's higher-threat world.

Combined, these methods might reduce the basic reproduction number (R0) of airborne diseases by a factor of 10-20 (for example: better air filtration cuts transmission 4x, immediate isolation of infected individuals cuts it 3x, and modest enhancement of respiratory immunity cuts it 1.5x), or possibly even more. That would be enough to make every airborne disease that exists today (including measles) incapable of spreading, and this figure is far from the theoretical optimum.
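As a sanity check on that arithmetic, here is a minimal sketch; the per-layer factors are the illustrative figures from the paragraph above, and the measles R0 value is a commonly cited estimate rather than a number from the scenario:

```python
# Illustrative arithmetic for the layered biodefense example above.
# The per-layer factors are the example figures from the text; the measles
# R0 of ~15 is a commonly cited estimate, used only as a stand-in for a
# highly transmissible airborne disease.

air_filtration = 4.0    # better air quality and filtration
rapid_isolation = 3.0   # passive detection plus prompt isolation of the infected
immune_priming = 1.5    # modest enhancement of respiratory immunity

combined_reduction = air_filtration * rapid_isolation * immune_priming
print(f"Combined reduction factor: {combined_reduction:.0f}x")  # 18x, within the 10-20x range

measles_r0 = 15.0
effective_r0 = measles_r0 / combined_reduction
print(f"Effective R0 for a measles-like pathogen: {effective_r0:.2f}")  # ~0.83, below 1, so it dies out
```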

If real-time viral sequencing for early detection were widely deployed, the idea that a "quietly spreading biological weapon could infect the global population without triggering any alarm" becomes highly suspect. It is worth noting that this would even detect advanced approaches such as "releasing multiple pathogens plus chemicals that only become dangerous in combination."

Don't forget, we are working within the assumptions of "AI 2027": by 2030, nanobots and Dyson spheres are listed as "emerging technologies." This implies enormous efficiency gains, which makes broad deployment of the countermeasures above much more plausible, despite the fact that today, in 2025, humans are slow and lethargic and a large share of government services still run on paper. If the world's most powerful AI can turn forests and fields into factories and solar farms before 2030, then the world's second most powerful AI can install plenty of sensors, lamps, and filters in our buildings before 2030.

But we can go further with the assumptions of "AI 2027" and enter purely science-fiction territory:

  • Microscopic air filtration inside the body (nose, mouth, lungs);
  • An automated pipeline that goes from discovering a new pathogen to fine-tuning the immune system to defend against it, deployable immediately;
  • If "mind uploading" is feasible, simply replacing your entire body with a Tesla Optimus or Unitree robot;
  • Various new manufacturing technologies (likely to be super-optimized in a robot economy) that can locally produce far more protective equipment than today, without relying on global supply chains.

In a world where cancer and aging are cured by January 2029 and technological progress keeps accelerating, it is genuinely implausible that by mid-2030 we would not have wearable devices capable of bio-printing and injecting substances in real time to protect the body from arbitrary infections (and poisons).

The biodefense arguments above do not cover "mirror life" or "mosquito-sized killer drones" (which the "AI 2027" scenario predicts become available starting in 2029). However, these means cannot deliver the kind of sudden "clean and decisive victory" described in "AI 2027", and intuitively, symmetric defenses against them are much easier.

So biological weapons are in fact unlikely to destroy humanity in the way the "AI 2027" scenario describes. Of course, none of the outcomes I describe come anywhere near a "clean and decisive victory" for humanity either. Whatever we do (except perhaps "uploading minds into robots"), an all-out AI biological war would still be extremely dangerous. However, it is not necessary to meet the bar of a "clean and decisive victory for humanity": a high probability that an attack would partially fail is enough to provide a strong deterrent against an AI that already holds a dominant position in the world, discouraging it from attempting any attack at all. And, of course, the longer the timeline of AI development, the more likely it is that such defenses can be fully in place.

What about combining biological weapons with other means of attack?

For the above measures to be successful, three prerequisites must be met:

  • Physical security worldwide (including bio and anti-drone security) is managed by localized authorities (human or AI) that are not all puppets of Consensus-1 (the name of the AI that ends up controlling the world and destroying humanity in the "AI 2027" scenario);
  • Consensus-1 cannot hack the defense systems of other countries (or cities, or other secure zones) and neutralize them instantly;
  • Consensus-1 has not captured the global information sphere to the point that no one is willing to even attempt self-defense.

Intuitively, premise (1) could go either way. Today, some police forces are highly centralized with strong national command structures, while others are localized. If physical security has to be rapidly rebuilt to meet the demands of the AI era, the landscape will be reset entirely, and the new outcome will depend on choices made over the next few years. Governments could take the lazy route and rely on Palantir, or they could actively choose some solution combining local development with open-source technology. Here, I believe we simply need to make the right choice.

Much of the pessimistic discussion of these topics assumes that (2) and (3) are hopeless. So let's examine those two points in more detail.

A cybersecurity apocalypse is also far from a foregone conclusion.

The public and professionals alike generally believe that true cybersecurity is impossible to achieve; at best, we can quickly patch vulnerabilities as they are discovered and deter attackers by hoarding known vulnerabilities. Perhaps the best we can hope for is a "Battlestar Galactica"-style scenario: almost all human ships are knocked out simultaneously by a Cylon cyberattack, and the remaining ships survive only because they did not use any networked technology at all. I do not share this view. On the contrary, I believe the "endgame" of cybersecurity favors the defender, and under the rapid technological progress assumed in "AI 2027", we can reach that endgame.

One way to see this is to use a technique favored by AI researchers: trend extrapolation. Here is the trend line implied by a GPT Deep Research survey of bug rates per thousand lines of code over time, assuming top-quality security techniques are used.
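To make the extrapolation mechanics concrete, here is a minimal sketch; the starting rate, annual improvement factor, and horizon below are hypothetical placeholders, not the survey's actual figures:

```python
# Minimal sketch of trend extrapolation: assume the bug rate per 1,000 lines
# of code falls by a constant factor each year, then project it forward.
# All numbers here are hypothetical placeholders, not the survey's figures.

def extrapolate_bug_rate(initial_rate: float, annual_factor: float, years: int) -> list[float]:
    """Project bugs per 1,000 lines of code under exponential improvement."""
    return [initial_rate * (annual_factor ** t) for t in range(years + 1)]

rates = extrapolate_bug_rate(initial_rate=1.0, annual_factor=0.5, years=6)
for year, rate in enumerate(rates):
    print(f"year +{year}: {rate:.3f} bugs per 1,000 lines of code")
```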

In addition, we have seen serious progress in the development and consumer adoption of sandboxing and other techniques for isolating and minimizing trusted codebases. In the short term, a superintelligent vulnerability-discovery tool that only the attacker has access to could find a great many vulnerabilities. But if highly intelligent agents for finding vulnerabilities, or for formally verifying code, are openly available, the natural endgame equilibrium is that software developers find all the vulnerabilities as part of the continuous-integration pipeline before the code is ever released.
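A minimal sketch of what that equilibrium might look like in a release pipeline, assuming (hypothetically) that an automated vulnerability-finding agent and a formal verifier are available as ordinary tools; none of the function names below refer to real products:

```python
# Hypothetical continuous-integration gate: code ships only if an automated
# vulnerability-finding agent and a formal verifier both come back clean.
# Both tools are placeholders; nothing here refers to a real product.

def find_vulnerabilities(codebase_path: str) -> list[str]:
    """Placeholder for a highly capable automated bug-finding agent."""
    return []  # trivially clean here; a real pipeline would run an actual agent

def formally_verify(codebase_path: str, spec_path: str) -> bool:
    """Placeholder for automated formal verification against a specification."""
    return True  # trivially passes here; a real pipeline would run a real verifier

def release_gate(codebase_path: str, spec_path: str) -> bool:
    """Block the release if any vulnerability is found or verification fails."""
    findings = find_vulnerabilities(codebase_path)
    if findings:
        for finding in findings:
            print(f"blocking release: {finding}")
        return False
    return formally_verify(codebase_path, spec_path)

print(release_gate("path/to/codebase", "path/to/spec"))  # True only when both checks pass
```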

I can see two compelling reasons that explain why vulnerabilities cannot be completely eradicated even in this world:

  • Flaws that arise from the complexity of human intention itself, where the hard part is building a sufficiently accurate model of that intention rather than of the code;
  • Non-safety-critical components, which may well continue the existing trend in consumer technology: writing much more code to handle many more tasks (or on shrinking development budgets), rather than doing the same number of tasks to ever-higher security standards.

However, neither category applies to questions like "can an attacker gain root access to the systems keeping us alive?", which is precisely what we are discussing here.

I admit that my view is more optimistic than the current mainstream among smart people in the cybersecurity field. But even if you disagree with me about today's world, it is worth remembering that the "AI 2027" scenario assumes the existence of superintelligence. At the very least, if "100 million superintelligent copies thinking at 2,400 times human speed" still cannot give us code free of these kinds of flaws, then we should definitely re-evaluate whether superintelligence will be anywhere near as powerful as the authors imagine.

To some extent, we need not only much higher software security standards but also much higher hardware security standards. IRIS is one current effort to improve hardware verifiability. We can take something like IRIS as a starting point, or create better technologies. Realistically, this will likely involve a "correct-by-construction" approach: the hardware manufacturing pipeline for critical components is deliberately designed with specific verification steps. These are all tasks that AI-enabled automation would make much easier.

A super-persuasion apocalypse is also far from a foregone conclusion.

As mentioned above, the other way in which much stronger defensive capability may end up not mattering is if AI convinces a critical mass of people that defending against the threat of superintelligent AI is unnecessary, and that anyone who tries to seek defenses for themselves or their community is a criminal.

I have always believed that there are two things that can enhance our ability to resist super persuasion:

  • A less monolithic information ecosystem. It can be said that we have gradually entered a post-Twitter era, and the internet is becoming more fragmented. This is a good thing (even if the fragmentation process is chaotic), as we overall need more information multipolarity.
  • Defensive AI. Individuals need to be equipped with locally running AI that is explicitly loyal to them, to help counterbalance the dark patterns and threats they see online. There are already scattered pilots of such ideas (like Taiwan's "Message Checker" app, which scans locally on the user's phone), and there are natural markets for testing them further (such as protecting people from scams), but the area needs much more effort.

(Screenshot: from top to bottom, URL check, cryptocurrency address check, rumor check.) Such applications can become more personalized, user-driven, and powerful.
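To make "defensive AI" slightly more concrete, here is a minimal sketch of the kind of on-device check such an app might run on an incoming message; the patterns and structure are purely illustrative and are not how Message Checker actually works:

```python
# Purely illustrative sketch of a local "defensive AI" message check.
# Everything runs on the user's own device and reports to the user alone;
# the patterns below are illustrative, not Message Checker's actual logic.

import re

SUSPICIOUS_PATTERNS = [
    r"guaranteed\s+returns",          # classic investment-scam phrasing
    r"send\s+(crypto|bitcoin|eth)",   # requests to transfer funds
    r"verify\s+your\s+wallet",        # phishing-style prompts
]

def local_scan(message: str) -> list[str]:
    """Return the red flags found in a message, entirely on-device."""
    found = []
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, message, re.IGNORECASE):
            found.append(pattern)
    # A real defensive AI would layer a locally running language model on top
    # of simple rules like these, with the model weights held by the user.
    return found

print(local_scan("Limited offer: guaranteed returns if you send Bitcoin today!"))
```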

The contest should not be a superintelligent super-persuader against you alone; it should be a superintelligent super-persuader against you plus a slightly weaker, but still superintelligent, analyzer working on your behalf.

That is what should happen. But will it? Within the short timeframe assumed by "AI 2027", universal adoption of information-defense technology is a very hard goal. But arguably much more modest milestones would suffice. If collective decisions are what matter most, and, as in the "AI 2027" scenario, everything important happens within one election cycle, then strictly speaking what matters is that the direct decision-makers (politicians, civil servants, programmers and other actors at certain companies) have access to good information-defense technology. That is relatively achievable in the short term, and in my experience many such people are already comfortable consulting multiple AIs to assist their decision-making.

Implications

In the world of "AI 2027", it is taken for granted that a superintelligent AI will be able to eliminate the remaining humans easily and quickly, so all we can do is try our best to ensure that the leading AI is benevolent. In my view, reality is far messier: whether the leading AI is powerful enough to eliminate the rest of humanity (and other AIs) with ease is still highly contested, and there are actions we can take to influence that outcome.

If these arguments are correct, their implications for today's policy sometimes resemble "mainstream AI safety canon" and sometimes differ from it:

Delaying the development of superintelligent AI is still a good thing. The emergence of superintelligent AI in 10 years is safer than in 3 years, and its emergence in 30 years is even safer. Giving human civilization more preparation time is beneficial.

How to achieve that is the tricky part. I think the rejection of the proposed "10-Year Moratorium on State-Level AI Regulation" in the United States is, on balance, a good thing, but especially after the failure of earlier proposals like SB-1047, it is less clear where to go next. I think the least invasive and most robust way to delay high-risk AI development probably involves some kind of treaty regulating advanced hardware. Many of the hardware cybersecurity technologies required for effective defense would also help verify an international hardware treaty, so there are even synergies here.

That said, it is worth noting that I see the main source of risk as military-adjacent actors, who will push hard for exemptions from such treaties; this must not be allowed, for if they end up exempt, military-driven AI development may well increase risk.

Coordination to make AI more likely to do good and less likely to do harm is still beneficial. The main exception, as it has always been, is when coordination efforts end up sliding into capability enhancement.

Regulating AI labs for transparency is still beneficial. Incentivizing AI labs to behave properly reduces risk, and transparency is a good way to achieve that.

The "open source is harmful" mentality has become riskier. Many oppose open-weight AI on the grounds that defense is hopeless, and that the only bright prospect is for good people with good AI to reach superintelligence before anyone less benevolent gains highly dangerous capabilities. But the argument of this article paints a different picture: defense becomes hopeless precisely in the world where one actor races far ahead and no one else catches up. Diffusing technology to maintain a balance of power becomes important. At the same time, I would never go so far as to say that accelerating the growth of frontier AI capabilities is good simply because it is done open-source.

The mentality of "We must defeat China" in American laboratories has become riskier, for similar reasons. If hegemony is not a safety buffer but a source of risk, this further rebuts the (unfortunately too common) view that "well-meaning people should join leading AI laboratories to help them win faster."

"Public AI" and similar initiatives should receive more support, not only to ensure the widespread distribution of AI capabilities but also to ensure that infrastructure actors genuinely have the tools to quickly apply new AI capabilities in certain ways as described in this article.

Defensive technology should look more like "arming the sheep" than "hunting down all the wolves." Discussions of the vulnerable world hypothesis often assume that the only solution is a hegemon maintaining global surveillance to prevent any potential threat from emerging. But in a non-hegemonic world this is not workable, and top-down defense mechanisms can easily be subverted by a powerful AI and turned into tools of attack. Instead, a greater share of the defensive burden needs to be carried by doing the hard work of making the world less vulnerable.

The arguments above are speculative, and no action should be taken on the assumption that they are near-certainties. But the "AI 2027" story is also speculative, and we should avoid acting on the assumption that its specific details are near-certainties.

I am particularly worried about a common assumption: that building up an AI hegemon, and making sure it is "aligned" and "wins the race," is the only way forward. In my view, this strategy is likely to make us less safe, especially when hegemony is deeply tied to military applications, which would make many alignment strategies far less likely to work. Once a hegemonic AI goes astray, humanity has no counterweight left.

In the "AI 2027" scenario, human success hinges on the United States choosing the path of safety rather than destruction at a critical moment, voluntarily slowing AI progress to ensure that Agent-5's internal thought processes remain interpretable to humans. Even then, success is not guaranteed, and it remains unclear how humanity climbs down from the precarious ledge where its survival depends on a single superintelligent mind. Regardless of how AI develops over the next 5-10 years, acknowledging that "reducing the world's vulnerability is feasible," and putting much more energy into achieving it with humanity's latest technology, is a path worth trying.

Special thanks to Balvi volunteers for their feedback and review.
