AI will make biological extinction risks worse before it makes them better

By MichaelDickens @ 2026-06-29T17:05

An argument goes: If we don't build aligned artificial superintelligence, we risk driving ourselves extinct for some other reason. We should rush to build ASI quickly, in spite of the risks—the longer we wait, the more vulnerable we are to extinction from a different cause.

Other than ASI, the biggest extinction risk is synthetic biology. Some lab could (accidentally or on purpose) develop a highly transmissible, 100% fatal super-plague that wipes out humanity.

An aligned ASI could stop that from happening by shutting down dangerous biological research, or by developing advanced countermeasures that stop the spread of deadly infections. So the argument goes: We need to build ASI to save us from non-AI extinction risks.

However, that argument doesn't work. In the near term, AI will make biological risks worse, not better. AI will accelerate scientific research, which will bring us closer to the level of knowledge necessary to build extinction-level pathogens. And in the long term, the way ASI eliminates biological x-risk is by taking control of the world.

Cross-posted from my website.

In the near term, AI makes biorisk worse

Some people imagine that AI models would accelerate defensive research while refusing to assist with developing bioweapons. This plan has two minor issues and one fatal one.

The first minor issue: Current AI model refusals are not robust, and people who want to extract dangerous information can find workarounds. AI developers have to patch every hole, but jailbreakers only need to find one.

The second minor issue: Even if the leading AI developer makes their model safe and un-jailbreakable, at least one of their competitors will probably fail at that task.

The fatal issue: It's not just about what AI assistants can do for humans. It's that AI accelerates the rate of scientific progress. As the state of knowledge improves for humanity in general, it becomes possible for humanity to develop existentially risky pathogens, even if AI does not assist directly. It seems impossible to advance biological science while surgically preserving ignorance on just those bits of knowledge that are required to engineer pathogens.

AI might refuse to participate in gain-of-function research, and that would be better than not refusing. But suppose I'm an evil scientist and I want to develop a 100% lethal airborne pathogen. Here in the year 2026, I can't do it. Even if I'm on the cutting edge of medicine and biology, I still won't be able to create the "extinction pathogen", because that would require a level of scientific understanding that humanity simply hasn't achieved. If AI advances science in general, it will push me closer to my evil goal of killing everyone with bioweapons.

There is the question of "offense-defense balance": is it easier to develop deadly pathogens, or easier to protect people against pathogens? That question matters in many contexts, but it's not relevant here. At our current level of scientific understanding, we have ~zero ability to develop extinction-level bioweapons. If our understanding becomes sufficiently advanced, then that ability will move from zero to nonzero, regardless of the offense-defense balance.

Leaving AI out of the picture, humanity will probably have the knowledge necessary to make extinction-level pathogens within the next hundred years. If AI causes a hundred years of progress in the next decade, then the evil scientist will be able to engineer their extinction pathogen by 2036, thanks to AI—even if the AI itself doesn't directly participate in the creation of the pathogen.

By 2036, assuming AI hasn't killed us yet, biorisk will be higher than in the alternative 2036 where AI capabilities stopped improving. Would 2036-biorisk-with-AI be higher than 2126-biorisk-without-AI? Maybe not—maybe AI scientists would be safer than human scientists per unit of research effort. But at minimum, AI-accelerated science is more dangerous per unit of time. AI acceleration means the high-risk period starts sooner, and it means we have less time. Less time to identify risks, less time for policy-makers to respond, less time to consider what direction we should go in. Speedrunning through a century of progress in a decade makes it much harder to manage the risks as they come.
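The timing arithmetic in the two paragraphs above can be sketched in a few lines. This is only an illustration using the essay's own hypothetical numbers (a capability that needs about 100 years of research at today's pace, and AI compressing a hundred years of progress into a decade, i.e. a 10x speedup). The function name `arrival_year` and the constant-speedup model are assumptions made for the example, not a forecast.

```python
def arrival_year(start_year: int, research_years_needed: float, speedup: float) -> float:
    """Calendar year when a capability arrives, given a constant research speedup.

    Assumption (illustrative only): progress is linear, so a capability that
    needs `research_years_needed` years at today's pace arrives after
    `research_years_needed / speedup` calendar years.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return start_year + research_years_needed / speedup


# Hypothetical inputs taken from the essay's framing, not from data.
START_YEAR = 2026
RESEARCH_YEARS_NEEDED = 100  # "within the next hundred years"

without_ai = arrival_year(START_YEAR, RESEARCH_YEARS_NEEDED, speedup=1)
with_ai = arrival_year(START_YEAR, RESEARCH_YEARS_NEEDED, speedup=10)

# Calendar time no longer available to notice the risk and respond to it.
preparation_years_lost = without_ai - with_ai

print(without_ai, with_ai, preparation_years_lost)
```

Under these assumptions the same capability arrives in 2036 instead of 2126, leaving 90 fewer calendar years to identify risks and respond. That is the sense in which AI-accelerated science is more dangerous per unit of time, whatever its danger per unit of research.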

AI can't control scientific progress unless it controls everything

The only way to accelerate scientific progress in biology without increasing x-risk is for AI to have complete control over scientific capabilities: basically, it has to be impossible for any humans to use their increasingly advanced knowledge of biology to develop bioweapons. I don't see how to do that unless all science is being done by AI, with humans not participating anymore.

Many people have a vision of the future in which humans will coexist with advanced AI, and we will remain in control of the steering wheel. But if humanity is in control, how can AI prevent us from developing powerful bioweapons? We can't have it both ways.

One might say, "Governments will have to prevent terrorists and mad scientists from developing bioweapons." To which I say, indeed they should do that. But AI makes governments' jobs harder on that front, not easier, unless AI has a totalitarian grip on society, at which point we're back to the scenario where humans lose control over the future.

Another attempt at escaping the dilemma: Let the government control AI, and AI control everyone else. Even in the world where the government is democratically elected, that world is starting to sound like an extreme version of Bad Definitions Of "Democracy" Shade Into Totalitarianism, in which your life is fully controlled by AI, and the only time when you get any say in the matter is at the voting booth.^[1] I can imagine much worse outcomes than that, but it's not what I would describe as a happy ending.

Low biorisk trades off against high AI takeover risk

AI increases biorisk until it's powerful enough to completely shut down any danger. Therefore, the way to minimize AI-driven biological x-risk is to have a very short window of time between "AI is smart enough to accelerate biological research" and "superintelligent AI controls everything". But if that window is short, then we have little time to solve the alignment problem, and little time to steer AI while we are still in control of the future. AI-enhanced biorisk is lowest in the worlds where AI takeover risk is highest.

People with relatively low credence in AI takeover risk tend to expect a slow takeoff. But in a slow takeoff, AI makes biorisk worse well before it's smart enough to robustly prevent extinction-level pandemics.

Accelerating AI development is not a good way to reduce biorisk

We don't currently know how to build bioweapons that kill everyone, but eventually we will.^[2] Similarly, in 1900 there was no risk of nuclear winter, because we didn't yet know how to build nuclear weapons.

Scientific progress brings prosperity, but it can also enable dangerous new technologies. General biology research might even be harmful on balance due to increasing extinction risk—I don't have a well-informed view on whether that's true. What I can say is that the following argument does not hold up:

"We need to accelerate AI progress so that it can save us from biological extinction risks."

Consider the neighboring argument, "we need to accelerate AI progress to create medical advancements." That argument fails at basic cost-benefit analysis (the risk of extinction is not outweighed by short-term improvements in medicine), but at least it's true that AI could, indeed, improve the state of medicine. "We should accelerate AI to reduce biological x-risk" isn't even clearly correct about the upside.^[3]

This is yet another illustration of the fact that we don't know what "aligned AI" means

In the (possibly brief) window where AI is smart enough to do scientific research but doesn't yet control the whole world, AI increases biological x-risk by improving humanity's knowledge of how to develop powerful bioweapons. After that window, what happens? If we're in a world where ASI is powerful enough to reduce extinction risk to zero, what does that world look like, and what should it look like? I find it difficult to imagine what sort of radical transformations to civilization would be necessary to achieve a total elimination of x-risk.

Some people imagine a future where everyone owns their own galaxy. How can we make meaningful claims about x-risk when the future looks that weird? If I can own a galaxy (whatever that means), maybe some other person can deconstruct a handful of planets to build an army of 100% deadly super-nanoviruses and send them throughout the universe at 99.9999% of the speed of light so that they kill everyone before anyone even sees them coming. Or something.

Many people have an intuition that aligned ASI will fix everything and the world will be great. But if we succeed at figuring out how to get ASI to do what we want, how do we then specify its behavior such that we get a good outcome? Some people hand-wave the problem away by saying "the ASI will be smart, it will help us figure out what to tell it to do." Much like alignment bootstrapping, this answer has a chicken-and-egg problem: how can the ASI figure out what you should tell it to do if you haven't yet told it how to determine what you should tell it to do?

(If an "assistant ASI" comes to you with some answer, and it's far smarter than you, how can you judge whether its answer is correct?)

The biorisk case is an example of the general problem that we don't know how to specify how an ASI should behave. Others have discussed this problem in more general terms, including:

A Conflict Between AI Alignment and Philosophical Competence by Wei Dai (2025)
Intent alignment seems incoherent by Joe Rogero (2025)
Many individual CEVs are probably quite bad by Villiam (2025)

The concerns with biological x-risk are a specific illustration of the general problem. How, exactly, do you build an AI that prevents humans from killing each other with bioweapons, but without making things horrible as a side effect?

^[1] To be clear, I do not believe this scenario is at all likely. I'm using it as a hypothetical way of escaping the dilemma, to illustrate that even this "solution" still isn't something we want.
^[2] Unless AI kills us first.
^[3] This brings to mind an important (but off-topic) question: if scientific advancement increases existential risk, but it's also essential to improve standards of living, how should we proceed? We don't have an answer for that question yet, but whatever we come up with, I imagine it would be fair to summarize as: "We proceed carefully." As we learn more about what sorts of advancements are dangerous, we can implement mitigations.

If AI rapidly accelerates progress (even assuming AI itself doesn't kill everyone), then it will be difficult to implement mitigations as we go, because the time gap between "top scientists foresee a dangerous technology on the horizon" and "anyone can develop this technology in their garage" will become much shorter.

(Another possibility is that humanity doesn't solve the problem of how to advance science without introducing new x-risks. Instead, we solve AI alignment, and then AI solves every other problem.)