Views on when AGI comes and on strategy to reduce existential risk

By TsviBT @ 2023-07-08T09:00 (+31)

Summary: AGI isn't super likely to come super soon. People should be working on stuff that saves humanity in worlds where AGI comes in 20 or 50 years, in addition to stuff that saves humanity in worlds where AGI comes in the next 10 years.

Thanks to Alexander Gietelink Oldenziel, Abram Demski, Daniel Kokotajlo, Cleo Nardo, Alex Zhu, and Sam Eisenstat for related conversations.

My views on when AGI comes

AGI

By "AGI" I mean the thing that has very large effects on the world (e.g., it kills everyone) via the same sort of route that humanity has large effects on the world. The route is where you figure out how to figure stuff out, and you figure a lot of stuff out using your figure-outers, and then the stuff you figured out says how to make powerful artifacts that move many atoms into very specific arrangements.

This isn't the only thing to worry about. There could be transformative AI that isn't AGI in this sense. E.g. a fairly-narrow AI that just searches configurations of atoms and finds ways to do atomically precise manufacturing would also be an existential threat and a possibility for an existential win.

Conceptual capabilities progress

The "conceptual AGI" view:

The first way humanity makes AGI is by combining some set of significant ideas about intelligence. Significant ideas are things like (the ideas of) gradient descent, recombination, probability distributions, universal computation, search, world-optimization. Significant ideas are to a significant extent bottlenecked on great natural philosophers doing great natural philosophy about intelligence, with sequential bottlenecks between many insights.

The conceptual AGI view doesn't itself claim that humanity lacks the ideas needed to make AGI. I do claim that——though not super strongly.

Timelines

Giving probabilities here doesn't feel great. For one thing, it seems to contribute to information cascades and to shallow coalition-forming. For another, it hides the useful models. For yet another thing: A probability bundles together a bunch of stuff I have models about, with a bunch of stuff I don't have models about. For example, how many people will be doing original AGI-relevant research in 15 years? I have no idea, and it seems like largely a social question. The answer to that question does affect when AGI comes, though, so a probability about when AGI comes would have to depend on that answer.

But ok. Here are some butt-numbers:

If I were trying to make a model with parts, I might try starting with a mixture of Erlang distributions of different shapes, and then stretching that according to some distribution about the number of people doing original AI research over time.
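As an illustration only, here is a minimal sketch of the kind of model described above, in Python. Every number in it (the Erlang shapes, the mixture weights, the mean years per insight, the spread of the research-effort factor) is an invented placeholder for the sake of the sketch, not an estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters: an Erlang with shape k models AGI arriving
# after k sequential insights, each taking ~`scale` years in expectation
# at the current rate of original research.
shapes = [2, 4, 8]          # possible numbers of remaining sequential insights
weights = [0.3, 0.4, 0.3]   # mixture weights over those shapes
scale = 10.0                # mean years per insight

n = 100_000
# Sample a shape for each draw, then an Erlang (gamma with integer shape).
ks = rng.choice(shapes, size=n, p=weights)
years = rng.gamma(shape=ks, scale=scale)

# "Stretch" by uncertainty over how much original research gets done:
# a speedup factor > 1 means more researchers, so insights come faster.
speedup = rng.lognormal(mean=0.0, sigma=0.5, size=n)
years = years / speedup

for horizon in (10, 20, 50):
    print(f"P(AGI within {horizon} years) ~ {(years <= horizon).mean():.2f}")
```

The Erlang shape encodes the "sequential bottlenecks between many insights" from the conceptual-AGI view; the lognormal stretch encodes the social question of how many people will be doing original research, which the model has no real handle on.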

Again, this is all butt-numbers. I have almost no idea about how much more understanding is needed to make AGI, except that it doesn't seem like we're there yet.

Responses to some arguments for AGI soon

The "inputs" argument

At about 1:15 in this interview, Carl Shulman argues (quoting from the transcript):

We've been scaling [compute expended on ML] up four times as fast as was the case for most of the history of AI. We're running through the orders of magnitude of possible resource inputs you could need for AI much much more quickly than we were for most of the history of AI. That's why this is a period with a very elevated chance of AI per year because we're moving through so much of the space of inputs per year [...].

This isn't the complete argument Shulman gives, but on its own it's interesting. It's valid, but only if we're actually scaling up all the needed inputs.

On the conceptual AGI view, this isn't the case, because we aren't greatly increasing the number of great natural philosophers doing great natural philosophy about intelligence. That's a necessary input, and it's only being somewhat scaled up. For one thing, many new AI researchers are correlated with each other, and many are focused on scaling up, applying, and varying existing ideas. For another thing, sequential progress can barely be sped up with more bodies.

The "big evolution" argument

Carl goes on to argue that eventually, when we have enough compute, we'll be able to run a really big evolutionary process that finds AGIs (if we haven't already made AGI). This idea also appears in Ajeya Cotra's report on the compute needed to create AGI.

I broadly agree with this. But I have two reasons that this argument doesn't make AGI seem very likely very soon.

The first reason is that running a big evolution actually seems kind of hard; it seems to take significant conceptual progress and massive engineering effort to make the big evolution work. What I'd expect to see when this is tried is basically nothing: life doesn't get started, nothing interesting happens, the entities don't get far (beyond whatever primitives were built in). You can get around this by invoking more compute, e.g. by simulating physics more accurately at a more detailed level, or by doing hyperparameter search to find worlds that lead to cool stuff. But then you're invoking more compute. (I'd also expect a lot of the hacks that supposedly make our version of evolution much more efficient than real evolution to actually result in our version being circumscribed, i.e. the search peters out because the shortcut that saved compute also cut off some important dimensions of search.)

The second reason is that evolution seems to take a lot of serial time. There's probably lots of clever things one can do to shortcut this, but these would be significant conceptual progress.

"I see how to do it"

My (limited / filtered) experience with these ideas leads me to think that [ideas knowably sufficient to make an AGI in practice] aren't widespread or obvious. (Obviously it is somehow feasible to make an AGI, because evolution did it.)

The "no blockers" intuition

An intuition that I often encounter is something like this:

Previously, there were blockers to current systems being developed into AGI. But now those blockers have been solved, so AGI could happen any time now.

This sounds to my ears like: "I saw how to make AGI, but my design required X. Then someone made X, so now I have a design for an AGI that will work." But I don't think that's their actual position. I think they don't think they need a design for an AGI in order to make an AGI.

I kind of agree with some version of this——there's a lot of stuff you don't have to understand, in order to make something that can do some task. We observe this in modern ML. But current systems, though they impressively saturate some lower-dimensional submanifold of capability-space, don't permeate a full-dimensional submanifold. Intelligence is a positive thing. Most computer code doesn't put itself on an unbounded trajectory of gaining capabilities. To make it work you have to do engineering and science, at some level. Bridges don't hold weight just because there's nothing blocking them from holding weight.

Daniel Kokotajlo points out that for things that grow, it's kind of true that they'll succeed as long as there aren't blockers——and for example animal husbandry kind of just works, without the breeders understanding much of anything about the internals of why their selection pressures are met with adequate options to select. This is true, but it doesn't seem very relevant to AGI because we're not selecting from an existing pool of highly optimized "genomic" (that is, mental) content. If instead of tinkering with de novo gradient-searched circuits, we were tinkering with remixing and mutating whole-brain emulations, then I would think AGI comes substantially sooner.

Another regime where "things just work" is many mental contexts where a task is familiar enough in some way that you can expect to succeed at the task by default. For example, if you're designing a wadget, and you've previously designed similar wadgets to similar specifications, then it makes sense to treat a design idea as though it's going to work out——as though it can be fully fleshed out into a satisfactory, functioning design——unless you see something clearly wrong with it, a clear blocker like a demand for a metal with unphysical properties. Again, like the case of animal husbandry, the "things just work" comes from the (perhaps out of sight) preexisting store of optimized content that's competent to succeed at the task given a bit of selection and arrangement. In the case of AGI, no one's ever built anything like that, so the store of knowledge that would automatically flesh out blockerless AGI ideas is just not there.

Yet another such regime is markets, where the crowd of many agents can be expected to figure out how to do something as long as it's feasible. So, a version of this intuition goes:

There are a lot of people trying to make AGI. So either there's some strong blocker that makes it so that no one can make AGI, or else someone will make AGI.

This is kind of true, but it just goes back to the question of how much conceptual progress people will make towards AGI. It's not an argument that we already have the understanding needed to make AGI. If it's used as an argument that we already have the understanding, then it's an accounting mistake: it says "We already have the understanding. The reason we don't need more understanding is that if more understanding were needed, someone else would figure it out, and then we'd have it. Therefore no one needs to figure anything else out."

Finally: I also see a fair number of specific "blockers", as well as some indications that existing things don't have properties that would scare me.

"We just need X" intuitions

Another intuition that I often encounter is something like this:

We just need X to get AGI. Once we have X, in combination with Y it will go all the way.

Some examples of Xs: memory, self-play, continual learning, curricula, AIs doing AI research, learning to learn, neural nets modifying their own weights, sparsity, learning with long time horizons.

For example: "Today's algorithms can learn anything given enough data. So far, data is limited, and we're using up what's available. But self-play generates infinite data, so our systems will be able to learn unboundedly. So we'll get AGI soon."

This intuition is similar to the "no blockers" intuition, and my main response is the same: the reason bridges stand isn't that you don't see a blocker to them standing. See above.

A "we just need X" intuition can become a "no blockers" intuition if someone puts out an AI research paper that works out some version of X. That leads to another response: just because an idea is, at a high level, some kind of X, doesn't mean the idea is anything like the fully-fledged, generally applicable version of X that one imagines when describing X.

For example, suppose that X is "self-play". One important thing about self-play is that it's an infinite source of data, provided in a sort of curriculum of increasing difficulty and complexity. Since we have the idea of self-play, and we have some examples of self-play that are successful (e.g. AlphaZero), aren't we most of the way to having the full power of self-play? And isn't the full power of self-play quite powerful, since it's how evolution made AGI? I would say "doubtful". The self-play that evolution uses (and the self-play that human children use) is much richer, containing more structural ideas, than the idea of having an agent play a game against a copy of itself.

Most instances of a category are not the most powerful, most general instances of that category. So just because we have, or will soon have, some useful instances of a category, doesn't strongly imply that we can or will soon be able to harness most of the power of stuff in that category. I'm reminded of the politician's syllogism: "We must do something. This is something. Therefore, we must do this."

The bitter lesson and the success of scaling

Sutton's bitter lesson, paraphrased:

AI researchers used to focus on coming up with complicated ideas for AI algorithms. They weren't very successful. Then we learned that what's successful is to leverage computation via general methods, as in deep learning and massive tree search.

Some add on:

And therefore what matters in AI is computing power, not clever algorithms.

This conclusion doesn't follow. Sutton's bitter lesson is that figuring out how to leverage computation using general methods that scale with more computation beats trying to perform a task by encoding human-learned specific knowledge about the task domain. You still have to come up with the general methods. It's a different sort of problem——trying to aim computing power at a task, rather than trying to work with limited computing power or trying to "do the task yourself"——but it's still a problem. To modify a famous quote: "In some ways we feel we are as bottlenecked on algorithmic ideas as ever, but we believe we are bottlenecked on a higher level and about more important things."

Large language models

Some say:

LLMs are already near-human and in many ways super-human general intelligences. There's very little left that they can't do, and they'll keep getting better. So AGI is near.

This is a hairy topic, and my conversations about it have often seemed not very productive. I'll just try to sketch my view:

Other comments on AGI soon

My views on strategy

Things that might actually work

Besides the standard stuff (AGI alignment research, moratoria on capabilities research, explaining why AGI is an existential risk), here are two key interventions:


Chris Leong @ 2023-07-10T00:22 (+4)

Confrontation-worthy empathy feels like it might have been worthwhile splitting out into its own post as it is very different from the timelines discussion.

JP Addison @ 2023-07-08T15:07 (+3)

I found this helpful and clearly written. Thanks for cross-posting.

Roman Leventov @ 2023-07-10T08:52 (+2)

There are many more interventions that might work on decades-long timelines that you didn't mention: