The crucible — how I think about the situation with AI

By Owen Cotton-Barratt @ 2025-05-05T13:19 (+33)

This is a linkpost to https://strangecities.substack.com/p/the-crucible

The basic situation

The world is wild and terrible and wonderful and rushing forwards so so fast.

Modern economies are tremendous things, allowing crazy amounts of coordination. People have got really very good at producing stuff. Long-term trends are towards more affluence, and less violence.

The Enlightenment was pretty fantastic not just for bringing us better tech, but also more truthseeking, better values, etc.

People, on the whole, are basically good — they want good things for others, and they want to be liked, and they want the truth to come out. This is some mix of innate and socially conditioned. (It isn’t universal.) But they are also often put in a tight spot and end up looking out for themselves or those they love. The hierarchy of needs bites. Effective altruism often grows from a measure of privilege.

The world is shaped by economics and by incentives and by institutions and by narratives and by societal values and by moral leadership and by technology. All of these have a complex interplay.

AI enters the picture

“AI” is a cluster of powerful technologies which are likely to reshape the world. Each of economics and incentives and institutions and narratives and societal values and moral leadership will (I expect) be profoundly impacted by advanced AI. And, of course, AI will be used to shape advanced technologies themselves.

From a zoomed-out perspective, there are three impacts of this AI transition which matter most[1]:

  1. Technological progress will accelerate — automation of research means the world will get even faster and crazier
  2. New kinds of entities could be making the choices that matter — either purely artificial agents, or new hybrid institutions which incorporate AI deeply in their decision-making
  3. It will become far easier to centralize control — a single entity (AI system, or organization, or person) could end up with robust and enduring power over a large domain, or even the entire world

This is … kind of wild. (2) and (3) could lead to profound changes to the way the world works, and (1) means this entire thing might happen very quickly (and that people are therefore more likely to fumble it).

From the perspective of most people in rich countries today, this is all pretty disturbing. We have enjoyed a good measure of affluence and of stability. This reshaping of the world will bring further affluence, but it is vast and unknown and could easily lead to a collapse of stability.

The crucible

How humanity handles advanced AI is likely to determine the future. We are already entering a period we could call “the crucible” — where AI begins to have a shaping impact on the world. By the time we exit the crucible, some of the broad lines for further unfolding will be sketched out ahead of us.

The crucible might lead to the ruin of civilization-as-we-know-it:

(Some of these may be recoverable-from; others not. But all seem undesirable — aside from the straightforward ways in which they’re bad, catastrophes seem more likely to lead to further securitization and consolidation of power, and increase the risk of ultimately ending up with loss of the future, or otherwise poor outcomes from the crucible.)

Part of our role during the crucible will be to steer away from these perils. But this cannot just consist of being cautious and reactive … the forces that are driving us forwards are powerful and unrelenting. The way out is through — in order to exit the crucible we must build a system strong enough to contain these forces. Such a system must equip us to recognise and steer away from dangers; and to coordinate our actions well enough to prevent any unilateral invitations to catastrophe.

Heating up

A reason that I found myself drawn (in writing this) to the crucible metaphor is a sense of things heating up. Right now, AI is starting to be integrated into things, but it’s still kind of small-scale. Most important things are still being done by humans, and by human institutions.

There will be a period over which things get hotter. AI capabilities will be stronger, and it will take on more important roles. Institutions around AI will become increasingly important. Technological progress will accelerate. Geopolitical tensions are likely to rise.

I’m not sure how fast the crucible will heat up, but I do think it’s unlikely to be instantaneous.

A hotter crucible is by default more dangerous — the powers at our fingertips become more awesome, and the world seems more fragile in comparison. Misalignment risk will increase. The way through involves using those powers to help contain the danger. If we do not tame these powers, we may see humanity gradually pushed out of the driving seat by the interaction of highly optimized forces.

The shape of technology

AI capabilities aren’t a scalar. There are a lot of powerful versions of the technologies that we might develop, and some of these look more concerning or more promising.

Right now we don’t have much ability to steer technology — competitive pressures push people towards the path of most power soonest. But we do have some ability, and we may well obtain more before we’re through.

What kind of shaping are we concerned with, here?

One important dimension is avoiding dangerous agents. There are various possibilities here. We might strive to avoid powerful agents altogether (AI will still be transformative if we avoid agents!), or to build agentic systems only out of components that are transparent and reliable, or just to avoid especially dangerous approaches like RL-for-agency which may introduce hidden motivations.

[I think there’s something important to say about knowledge; cf. some of my thoughts, and some of Eric Drexler’s; but in the moment of writing I haven’t worked out how to fit it in.]

Other dimensions are concerned with helping us to navigate the challenges and avoid catastrophe. And above all, to build the kind of positive structures that we will need to safely exit the crucible.

The case for optimism

Right now, people are the ones with their hands on the levers, and most people ultimately don’t want the world to go to shit.

That isn’t enough to stop the world going to shit, of course. But it could be — if only we were a bit better at working out where things are going, and at coming together to avoid heading in the bad directions.

I don’t want to sound like too much of a naive optimist here. I think that level of awareness and coordination would be a big reach, judging by the world’s track record of handling major challenges.

But: what’s holding us back is fundamentally a matter of capabilities. And this is something that AI could be well placed to help with.

With excellent AI tools, we might see:

Taken together, these could pave the way to a world with a genuinely healthy political process in any democratic nation which chooses to have one, and in which international politics becomes more civilized and less fraught, as people make agreements (which everyone wants) that robustly take us into Paretotopia rather than risk burning the commons. Perhaps this would lead in the end to something like political unification — but it could also lead to more of an archipelago of different societies exploring different models, without hurting each other. A slogan might be: “world peace without world government”.

Will these beneficial capabilities arrive soon enough to help us tackle the biggest challenges? That remains to be seen. But we can try to help improve the odds.

What is needed?

We can frame this in terms of the things we’re trying to avoid, and/or the things we’re trying to build. Ultimately, we need both, and they support each other — successfully avoiding perils gives us more time to build the positive things; while in some cases even building early versions of the necessary tech (and getting appropriate adoption) may be a meaningful help in avoiding bad outcomes.

For avoiding bad outcomes, it’s worth noting that it’s often a better strategy to have a robust response that can intervene early, before bad patterns can seriously get going, rather than trying to block them just at the point where things ultimately fall apart. And in an ideal world, there would never be any acute high-stakes decisions — because everything would have enough sanity checks on it that errors of judgement would be caught and corrected.

For now, though, we’re not super close to that ideal world. Some decisions may be high stakes (most likely those made by major AI companies and/or national security; and most likely by the leadership of each of these), and they may be made well or poorly — or even without people noticing their importance. One focus, therefore, is to try in relatively direct ways to improve those decisions. Other key focuses are more like “try to build, and drive adoption of, the kind of tech/structures that we need to get through the crucible in good shape”.

So … Key focuses:

(As I write this list, I’ve a nagging feeling I’m missing some things.)

Against an overly narrow focus

These different priorities, to some extent, pull against each other. For example:

More broadly, focusing hard on some aspects of the situation is perilous for the usual reasons that maximization is perilous. It makes sense for many individuals or organizations to be focused (because there are efficiency benefits), but as a whole community it’s important to stay in touch with the fact that we almost certainly haven’t traced through all of the important causal pathways involved, and that it’s often a good heuristic to try to make things straightforwardly good in high-leverage domains. (Historically, I think the approach of “think super hard about the strategic picture, and then soften a bit away from maximization” has outperformed “think super hard about the strategic picture, and then do the thing that seems highest EV”.)

Thanks to those who encouraged me to write and/or publish this piece. Thanks to those who commented on a draft. And thanks to many, many people for helping to inform my worldview.

  1. ^

    I’m not including the possibility of new moral patients. I think that in the longer term, moral patienthood is extremely important, but for the sake of making the transition go well, moral actors are far more important.