AI and the feeling of living in two worlds

By michel @ 2024-10-10T17:51 (+38)

This is a cross-post from my Substack, where I don't assume much background knowledge on AI safety and trends.  

 

I feel like I live in two different worlds sometimes.

In one world, the next decade will be... normal. Sure, I’ll see the world change, but it will change in familiar ways. I’ll see wars in places other than the US and western Europe, turbulent US politics, and a steady stream of tech and societal progress.

In the other world, the next decade will be... turbo-charged by transformative AI progress.

In this possible future, the abstract predictions of some AI experts and insiders today stop being abstract. I'll actually see AI systems that can do most of the work of today's software-based knowledge workers, and eventually of small organizations; I'll see people close to me form deep relationships with AI companions; I'll see fierce international competition to control AI systems and data centers that are useful for the economy and military; I'll see AI systems that rapidly speed up science and technology innovations[1]; I'll see massive wealth and influence accrue to those who control this technology. And unless we solve some unsolved technical problems, I could also see AI systems that misunderstand their developers' instructions and implicit guardrails in catastrophic ways. 

I live with a foot in each of these worlds. In the normal world, most people around me predict continuity. I sometimes join them in that prediction—or maybe it’s just the absence of any predictions at all. Of course the future will change, but it’s very hard to say how and it will probably be fine. 

But I also have a foot in the world of AI and forecasting. I live and work in an environment filled with people who develop, invest in, and try to govern frontier AI systems.[2] Predictions of a decade turbo-charged by transformative AI progress are strikingly common. When I hear people in San Francisco talk about the pace of progress of the most advanced AI systems and what they may be capable of in the next few years, they don’t paint a picture of a normal decade.

A deep mismatch

The mismatch between these two worlds feels strange. It feels strange to read papers on the lack of immediate bottlenecks to continuing AI's exponential progress and the possibility of explosive technological growth... and then hear politicians argue about 2050 climate targets. It feels strange to hear AI labs describe their ambitions to accelerate AI progress by automating AI research, and AI insiders predict just how soon we'll have thousands of copies of systems that can do nearly every task humans can... and then invest in my retirement fund.

In those moments, I feel like I’m sitting in the same theater as friends from my everyday life yet watching a very different movie.

To be clear, I’m obviously uncertain about how the next decade will unfold. In weeks when everyone around me is minding their own non-AI business, I wonder if I'm the one who has fallen for an elaborate illusion. But given both my deference to some experts and my own thinking on the matter, strikingly fast AI progress seems plausible. And plausible futures are ones to take seriously. 

Orienting to the possibility of rapid, transformative AI progress

So I think it's possible that tech companies will develop extremely capable AI systems in the next decade, and that this would transform the economy, pose risks from misalignment, and drive international rivalry—but I'm not sure.

So what? What is the appropriate reaction to taking the possibility of rapid AI progress seriously? I’ve been asking myself that.

My first answer, as I've alluded to, is don't ignore it. Plausible futures are ones to take seriously if it's possible to influence their trajectory for better or worse. And I think that's possible; we have agency over our collective future.

So, reluctantly, I've kind of become an AI bro. AI has got my attention (although I'll try not to be insufferable about it).

Now, there's still a big spectrum of how people like me pay attention. Some people who see themselves at the dawn of transformative AI progress advocate for accelerating progress, while others advocate for stopping AI development entirely. It's not just outside observers who disagree here; the experts are split too. Some scientists argue that advanced AI poses an existential threat to humanity and that we should invest heavily in AI safety, while others disagree.

I see the debate between ‘accelerationists’ and people advocating for AI safety as a disagreement about two fundamental questions:

  1. What types of AI systems might we develop?
  2. Does the default (or ‘business-as-usual’) development of these systems have good or bad outcomes?

Timidly, I’ve formed views on these questions. And I’ve come to an overarching stance: If we truly get extremely capable, fast, and widely-copyable AI systems this decade that can do nearly all the tasks a smart human can with a computer, I worry we—humanity—are underprepared.

I’m mostly interested in the most capable AI systems 

I think a lot of the time when people disagree about the promises and perils of AI they’re really just disagreeing about what systems will be built.

If you think the future of AI just looks like incremental improvements to chatbots that respond to one prompt at a time, a lot of what I’ve written about the effect of AI development may not make sense. What’s this guy talking about when he says chatbots will drive fierce international competition? How could chatbots and video generators reshape the economy, let alone cause a catastrophe??

I want to be clear about what I am picturing so that my later discussion of the effects and challenges of these systems makes sense. I'm picturing advanced AI systems that are superhuman at coding and hacking; that can work and 'think' much faster than humans; that can be copied arbitrarily and run in parallel; and that can plan and execute on long sequences of tasks. (See "What will GPT-2030 look like?" for the expert prediction I'm working with.)

I can imagine this sounds intuitively far-fetched to some. But remember, five years ago AI systems could barely write a coherent sentence. Now systems can pass the bar exam and perform at a silver-medal level in the International Mathematical Olympiad. We're still riding the exponential of capability improvements, and the key inputs for AI progress (computational power, researchers, data, energy) probably still have some room to keep growing at current rates.

My focus on the most capable yet feasible AI systems is deliberate. In the same way that it makes sense to consider the most extreme possible earthquakes when designing a building, I think it makes sense to prepare for the most transformative AI systems society might face.

If I didn't think such transformative AI systems were at all possible, I expect my views would look much more like those of people who want the biggest, newest models now. I'm generally pro technological progress; I generally like GPT-4; I'll probably like GPT-5; I'm pro 'horizontally' developing and implementing AI systems. But truly advanced AI systems don't seem like just another typical technology.

Developing extremely capable AI systems soon poses challenges we don’t seem prepared for

If we actually develop the type of advanced AI systems some experts and insiders expect we will in the next decade, I worry that we’re underprepared.

There are many predictable challenges on the business-as-usual trajectory.[3] A few illustrative clusters:

  1. Technical alignment: making sure extremely capable AI systems actually do what their developers intend, without catastrophically misunderstanding instructions or implicit guardrails.[4]
  2. Concentration of power: massive wealth and influence accruing to whoever controls the most capable systems, leaving them far more powerful[5] than everyone else.
  3. Societal disruption: labor markets, relationships, and information ecosystems changing faster than people and institutions can adapt.
  4. Geopolitical tension: fierce international competition to control the AI systems and data centers that matter for economies and militaries.

I don't hear of many people working on these challenges in a way that takes the possibility of very capable AI systems seriously. If such systems arrive soon, I worry many institutions and regulatory bodies that are already pushed to the limit by conventional challenges will struggle to adapt. And there are almost certainly challenges we'll have to face that we can't foresee yet.

What also worries me are the financial incentives of the companies developing the most advanced AI systems. They're incentivized to beat their competitors, not to price in the risks their systems could pose to society at large.

. . .

I'm uncertain, as usual. I know I'm being a downer by focusing on the risks without talking about the upsides of advanced AI systems, or how society could become resilient to these challenges. Don't get me wrong, I think there could be great benefits from these systems, like energy abundance for disadvantaged parts of the world, cures for diseases, and creative empowerment. I also think (if we solve the alignment challenges) we might just stumble through the transition to a world with advanced AI systems, finding solutions through a combination of democratic processes, patchwork regulation, help from AI systems, and at least somewhat responsible action from frontier AI companies.

But right now, I’m looking at this non-exhaustive list of predictable challenges and the financial incentives to keep building more and more capable models, and I feel worried. If we’re in the turbo-charged AI world, today’s institutions and incentives seem ill-equipped to handle this magnitude of change. 

If-Then updates, at a personal and policy level

What is the appropriate reaction to predicting worrying futures but remaining uncertain about their likelihood?

I want to find an orientation that takes the possibility of being in a risky world seriously, but maintains the humility that I and others may be wrong.

A tool for this basic problem that I've been thinking about more recently is If-Then pairings. The basic idea is to proactively identify signals that would provide strong evidence that we're in a low-risk or a high-risk world, along with appropriate responses to each.
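
To make this concrete, here is a minimal, purely illustrative sketch (in Python) of what writing If-Then pairings down explicitly might look like. The signals, conditions, and responses below are hypothetical placeholders of my own, not anything proposed by the people developing this framework.

```python
# A minimal, hypothetical sketch of "If-Then" pairings: each pairing names a
# signal to watch for, the condition under which it counts as triggered, and
# the response committed to in advance. All entries are illustrative.

from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class IfThenPairing:
    signal: str                         # what we watch for
    condition: Callable[[Dict], bool]   # when the signal counts as "triggered"
    response: str                       # the pre-committed reaction


# Hypothetical observations collected over time (placeholder values).
observations = {
    "frontier_model_beats_expert_forecasts": True,
    "labs_use_models_to_speed_up_own_research": False,
    "major_alignment_milestones_reached": False,
}

pairings = [
    IfThenPairing(
        signal="Capabilities outpace expert forecasts",
        condition=lambda obs: obs["frontier_model_beats_expert_forecasts"],
        response="Shift more effort toward safety work and policy advocacy",
    ),
    IfThenPairing(
        signal="AI research is being meaningfully automated",
        condition=lambda obs: obs["labs_use_models_to_speed_up_own_research"],
        response="Treat short timelines as the default in personal planning",
    ),
    IfThenPairing(
        signal="Alignment progress keeps pace with capabilities",
        condition=lambda obs: obs["major_alignment_milestones_reached"],
        response="Update toward the (somewhat) normal world",
    ),
]

# Check which pre-committed responses are triggered by current observations.
for p in pairings:
    status = "TRIGGERED" if p.condition(observations) else "not yet"
    print(f"{p.signal}: {status} -> {p.response}")
```

The value is in the pre-commitment: the responses are chosen before the evidence arrives, which makes it harder to explain away concerning signals later.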

I’m excited about the If-Then framework at both a policy level and personal level.

At a policy level, 'If-Then' AI policy recommendations (also called an 'AI policy blueprint') could allow for better expert collaboration. Experts could agree in advance on what to do if certain conditions arise, despite disagreements about when (or even whether) those conditions will arise.[6] Ideally, technical experts would also design capability tests that are explicitly made to trigger these kinds of updates.

At a personal level, I also want to stay tuned for evidence that could reduce my uncertainty about both the speed and implications of AI progress. For example, how good is the GPT-5 generation of models relative to expert predictions? Are we seeing positive feedback loops where frontier AI labs can use their AI systems to rapidly improve future AI systems? Do we make progress on technical alignment and sensible AI regulation in the US?

I want to earnestly think more about these signals and appropriate responses. What would really convince me that we're in a low-risk or a high-risk world?[7] By doing this thinking proactively, I hope to avoid reactions like 'ah, this is normal' if I see seriously concerning evidence. And vice versa: if the capabilities and danger signals I worry about don't materialize any time soon, it could help me recognize that maybe I live in the (at least somewhat) normal world.


For now though, I’ll keep living with a foot in two worlds. Maybe that AI movie that I was watching out of the corner of my eye has a happy ending. I hope so. As I watch and cautiously participate in its unfolding, I think I’ll have to keep living with the nagging feeling that I'm either worrying too much or not nearly enough.

  1. ^

     Or at least the parts of science that don’t require extensive experiments in the physical world.

  2. ^

     When I say 'frontier AI systems,' I'm referring to the cutting-edge, general-purpose systems, typically trained with the most computational power. I also hear these referred to as 'foundation models.'

  3. ^

     In this business-as-usual trajectory, I'm predicting that the world looks similar to today and that dominant aspects of the current AI development landscape (e.g., the compute-intensive deep learning paradigm) hold.

  4. ^

     These problems are exacerbated by unsolved challenges with respect to predicting when a model will develop certain dangerous capabilities, reliably eliciting a model's capabilities, and interpreting the giant mess of a neural network in terms of concepts familiar to humans.

  5. ^

     I.e., richer and more capable of influencing others.

  6. ^

     To be clear though, I think the conditions for some AI policy and regulation efforts have already arisen. For example, I don’t need to wait for more signals to advocate for more investment in AI evaluation science and whistleblower protections for employees of top AI companies.

  7. ^

     I think earnestly engaging with this question also involves taking a step back and asking questions like ‘What risks would I expect a strong signal on, and for what risks may we see very few signals about before a catastrophe?’ This post has some thought-provoking discussion on this, as well as this one; I may be underestimating how hard this is.


Chris Leong @ 2024-10-11T09:19 (+5)

Interesting article; however, I would class some of the things you've suggested might happen in the crazy growth world as better fitting the modest improvement in AI abilities world.

Ben Stevenson @ 2024-10-11T06:22 (+3)

Hey Michel! I really liked this post and I relate to "sitting in the same theater as friends from my everyday life yet watching a very different movie". "No-one in my org puts money in their pension" by @tobyj captured the same feelings well.

SummaryBot @ 2024-10-10T19:55 (+1)

Executive summary: The author feels torn between two possible futures - one of normal progress and one rapidly transformed by AI - and argues we may be underprepared for the challenges posed by advanced AI systems that could emerge in the coming decade.

Key points:

  1. The author perceives a mismatch between "normal" predictions and the rapid AI progress predicted by some experts and insiders.
  2. Focus is on potentially transformative AI systems that could emerge soon, not just incremental improvements.
  3. Major challenges of advanced AI include technical alignment, power concentration, societal disruption, and geopolitical tensions.
  4. Current institutions and incentives seem ill-equipped to handle rapid, transformative AI progress.
  5. The author advocates for an "If-Then" approach to policy and personal planning to navigate uncertainty about AI trajectories.
  6. While acknowledging potential benefits, the author worries we may be either over- or under-reacting to AI risks.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.