The crucible — how I think about the situation with AI
By Owen Cotton-Barratt @ 2025-05-05T13:19
This is a linkpost to https://strangecities.substack.com/p/the-crucible
The basic situation
The world is wild and terrible and wonderful and rushing forwards so so fast.
Modern economies are tremendous things, allowing crazy amounts of coordination. People have got really very good at producing stuff. Long-term trends are towards more affluence, and less violence.
The Enlightenment was pretty fantastic, not just for bringing us better tech, but also for bringing more truthseeking, better values, etc.
People, on the whole, are basically good — they want good things for others, and they want to be liked, and they want the truth to come out. This is some mix of innate and socially conditioned. (It isn’t universal.) But they are also often put in a tight spot and end up looking out for themselves or those they love. The hierarchy of needs bites. Effective altruism often grows from a measure of privilege.
The world is shaped by economics and by incentives and by institutions and by narratives and by societal values and by moral leadership and by technology. All of these have a complex interplay.
AI enters the picture
“AI” is a cluster of powerful technologies which are likely to reshape the world. Each of economics and incentives and institutions and narratives and societal values and moral leadership will (I expect) be profoundly impacted by advanced AI. And, of course, AI will be used to shape advanced technologies themselves.
From a zoomed-out perspective, there are three impacts of this AI transition which matter most[1]:
1. Technological progress will accelerate — automation of research means the world will get even faster and crazier
2. New kinds of entities could be making the choices that matter — either purely artificial agents, or new hybrid institutions which incorporate AI deeply in their decision-making
3. It will become far easier to centralize control — a single entity (AI system, or organization, or person) could end up with robust and enduring power over a large domain, or even the entire world
This is … kind of wild. Impacts 2 and 3 could lead to profound changes to the way the world works, and impact 1 means this entire thing might happen very quickly (and that people are therefore more likely to fumble it).
From the perspective of most people in rich countries today, this is all pretty disturbing. We have enjoyed a good measure of affluence and of stability. This reshaping of the world will bring further affluence, but it is vast and unknown and could easily lead to a collapse of stability.
The crucible
How humanity handles advanced AI is likely to determine the future. We are already entering into a period we could call “the crucible” — where AI begins to have a shaping impact on the world. By the time we exit the crucible, some of the broad lines for further unfolding will be sketched out ahead of us…
The crucible might lead to the ruin of civilization-as-we-know-it:
- Catastrophe —
- We might see all-out nuclear war, if the great powers fail to reach agreement as they flex their muscles to exert influence before they become irrelevant
- We might see catastrophic biological attacks, or some other catastrophe caused by future weaponry
- Loss of the future —
- We might see a global dictatorship
- We might see the future expropriated from humanity by artificial agents
(Some of these may be recoverable-from; others not. But all seem undesirable — aside from the straightforward ways in which they’re bad, catastrophes seem more likely to lead to further securitization and consolidation of power, and increase the risk of ultimately ending up with loss of the future, or otherwise poor outcomes from the crucible.)
Part of our role during the crucible will be to steer away from these perils. But this cannot consist just of being cautious and reactive … the forces driving us forwards are powerful and unrelenting. The way out is through — in order to exit the crucible we must build a system strong enough to contain these forces. Such a system must equip us to recognise and steer away from dangers, and to coordinate well enough to prevent any unilateral invitations to catastrophe.
Heating up
A reason that I found myself drawn (in writing this) to the crucible metaphor is a sense of things heating up. Right now, AI is starting to be integrated into things, but it’s still kind of small-scale. Most important things are still being done by humans, and by human institutions.
There will be a period over which things get hotter. AI capabilities will be stronger, and it will take on more important roles. Institutions around AI will become increasingly important. Technological progress will accelerate. Geopolitical tensions are likely to rise.
I’m not sure how fast the crucible will heat up, but I do think it’s unlikely to be instantaneous.
A hotter crucible is by default more dangerous — the powers at our fingertips become more awesome, and the world seems more fragile in comparison. Misalignment risk will increase. The way through involves using those powers to help contain the danger. If we do not tame these powers, we may see humanity gradually pushed out of the driving seat by the interaction of highly optimized forces.
The shape of technology
AI capabilities aren’t a scalar. There are a lot of powerful versions of the technologies that we might develop, and some of these look more concerning or more promising.
Right now we don’t have a huge ability to steer technology — competitive pressures push people towards the path of most power soonest. But we do have some ability, and we may well obtain more before we’re through.
What kind of shaping are we concerned with, here?
One important dimension is avoiding dangerous agents. There are various possibilities here. We might strive to avoid powerful agents altogether (AI will still be transformative if we avoid agents!), or to build agentic systems only out of components that are transparent and reliable, or just to avoid especially dangerous approaches like RL-for-agency which may introduce hidden motivations.
[I think there’s something important to say about knowledge; cf. some of my thoughts, and some of Eric Drexler’s; but at the time of writing I haven’t worked out how to fit it in.]
Other dimensions are concerned with helping us to navigate the challenges and avoid catastrophe, and, above all, with building the kind of positive structures that we will need to safely exit the crucible.
The case for optimism
Right now, people are the ones with their hands on the levers, and most people ultimately don’t want the world to go to shit.
That isn’t enough to stop the world going to shit, of course. But it could be — if only we were a bit better at working out where things were going, and at coming together to figure out how not to go in the bad directions.
I don’t want to sound like too much of a naive optimist here. I think that that level of awareness and coordination would be a big reach by the track record of the world at handling major challenges.
But: what’s holding us back is fundamentally about capabilities. And this is something that AI could be well placed to help with.
With excellent AI tools, we might see:
- Greatly increased material affluence, so that it becomes much easier for the world to move out of a default scarcity-mindset
- The best forecasts of the implications of technology being much better, and seen to be much better, than today, so that decision-makers treat them more like common knowledge about the landscape, rather than vague speculation
- Applications that help a large number of people fluidly navigate complex informational landscapes, and not get caught up in misinformation (or help them to wisely navigate their own emotional issues, and look more objectively at the world)
- Design of new systems (of many types) which are deeply secure and reliable
- Bargaining assistance and new commitment solutions that help actors to navigate high-stakes situations without devolving into conflict
- Mechanisms for democratic deliberation (identifying something like the “collective will” of a group) and oversight (keeping leaders aligned to the populations they represent)
- Coordination mechanisms to allow groups of actors (e.g. AI company researchers) to identify and act in their collective interests
Taken together, these could pave the way to a world with a genuinely healthy political process in any democratic nation which chooses to have one, and one in which international politics becomes more civilized and less fraught, as people make agreements (which everyone wants) that robustly take us into Paretotopia rather than risk burning the commons. Perhaps this would lead in the end to something like political unification — but it could also lead to more of an archipelago of different societies exploring different models, without hurting each other. A slogan might be: “world peace without world government”.
Will these beneficial capabilities arrive soon enough to help us tackle the biggest challenges? That remains to be seen. But we can try to help improve the odds.
What is needed?
We can frame this in terms of the things we’re trying to avoid, and/or the things we’re trying to build. Ultimately, we need both, and they support each other — successfully avoiding perils gives us more time to build the positive things; while in some cases even building early versions of the necessary tech (and getting appropriate adoption) may be a meaningful help in avoiding bad outcomes.
For avoiding bad outcomes, it’s worth noting that it’s often a better strategy to have a robust response that can intervene early, before bad patterns seriously get going, rather than trying to block them only at the point where things ultimately fall apart. And in an ideal world, there would never be any acute high-stakes decisions — because everything would have enough sanity checks on it that errors of judgement would be caught and corrected.
For now, though, we’re not super close to that ideal world. Some decisions may be high stakes (most likely those made by major AI companies and/or national security; and most likely by the leadership of each of these), and they may be made well or poorly — or even without people noticing their importance. One focus, therefore, is to try in relatively direct ways to improve those decisions. Other key focuses are more like “try to build, and drive adoption of, the kind of tech/structures that we need to get through the crucible in good shape”.
So … Key focuses:
- Key activities:
  - Building key technologies
  - Driving adoption of key technologies
  - Shaping the direction of AI technology
    - Towards safe versions and away from dangerous versions
    - Could involve:
      - Basic research
      - Building scientific ~consensus
      - Advocating to get decision-makers to adopt safer forms
  - Helping key decisions to go well
- Key technologies:
  - Epistemic tech to help raise the ceiling — better understanding of the situation for smart, plugged-in people
  - Epistemic tech to help raise the floor — better understanding of the world for everyone, helping them navigate adversarial informational environments, or process emotional blockers
  - Tech for coordination, to help find off-ramps from racing/conflict/disaster
  - Tech for democratic decision-making
  - Tech for democratic oversight, & for avoiding egregiously bad decisions
  - Tech to facilitate better conceptual research — for boosting alignment, and for avoiding philosophical errors
  - Directly defensive tech (e.g. biodefence; cybersecurity; monitoring AI systems for safety)
  - NB: something which I think is important but exclude from this list is “tech to help with general abundance”; while it could help make things easier to navigate in many ways, I think it’s very well incentivized by normal market mechanisms, and is not a strategic priority
- Non-tech ways to make key decisions more likely to go well:
  - Help good people get close to the levers of power
  - Provide research and advice on what good decisions might look like
  - Help to improve decision-making structures
  - Help to shape incentives for key actors
- Meta / indirect strategies for helping with the above:
  - Field-building
  - Building broad understanding of the strategic situation; especially …
    - … of risks, so that people can coordinate to avoid those
    - … of the positive futures that are possible, so that people can build towards these
(As I write this list, I’ve a nagging feeling I’m missing some things.)
Against an overly-narrow focus
These different priorities, to some extent, pull against each other. For example:
- If we are exclusively concerned with loss of control to misaligned AI, the most robust ways to avoid that could involve keeping AI systems tightly contained — this could prevent or slow broad dissemination of capabilities, which might stop us from building the radical new technology and social structures that could help us exit the crucible in good shape
- If we are overly focused on preventing misuse of dangerous AI capabilities, we may lean towards damaging hardcore nonproliferation approaches
- If we are exclusively focused on preventing human coups, this could lead us to wanting to radically decentralize power, which could increase misalignment risk
- If we focus just on building positive technology, this could lead to a general policy of trying to accelerate, ignoring the inherent risks
More broadly, focusing hard on some aspects of the situation is perilous for the usual reasons that maximization is perilous. It makes sense for many individuals or organizations to be focused (because there are efficiency benefits), but as a whole community it’s important to stay in touch with the fact that we almost certainly haven’t traced through all of the important causal pathways involved, and that it’s often a good heuristic to try to make things straightforwardly good in high-leverage domains. (Historically, I think the approach of “think super hard about the strategic picture, and then soften a bit away from maximization” has outperformed “think super hard about the strategic picture, and then do the thing that seems highest EV”.)
Thanks to those who encouraged me to write and/or publish this piece. Thanks to those who commented on a draft. And thanks to many, many people for helping to inform my worldview.
[1] I’m not including the possibility of new moral patients. I think that in the longer term, moral patienthood is extremely important, but for the sake of making the transition go well, moral actors are far more important.