Getting Washington and Silicon Valley to tame AI (Mustafa Suleyman on the 80,000 Hours Podcast)
By 80000_Hours @ 2023-09-04T16:25 (+5)
We just published an interview: Mustafa Suleyman on getting Washington and Silicon Valley to tame AI. You can click through for the audio, a full transcript, and related links. Below are the episode summary and some key excerpts.
Episode summary
So people have this fear, particularly in the US, of pessimistic outlooks. I mean, the number of times people come to me like, “You seem to be quite pessimistic.” No, I just don’t think about things in this simplistic “Are you an optimist or are you a pessimist?” terrible framing. It’s BS. I’m neither.
I’m just observing the facts as I see them, and I’m doing my best to share for critical public scrutiny what I see. If I’m wrong, rip it apart and let’s debate it — but let’s not lean into these biases either way.
- Mustafa Suleyman
Mustafa Suleyman was part of the trio that founded DeepMind, and his new AI project is building one of the world’s largest supercomputers to train a large language model on 10–100x the compute used to train ChatGPT.
But far from the stereotype of the incorrigibly optimistic tech founder, Mustafa is deeply worried about the future, for reasons he lays out in his new book The Coming Wave: Technology, Power, and the 21st Century’s Greatest Dilemma (coauthored with Michael Bhaskar). The future could be really good, but only if we grab the bull by the horns and solve the new problems technology is throwing at us.
On Mustafa’s telling, AI and biotechnology will soon be a huge aid to criminals and terrorists, empowering small groups to cause harm on previously unimaginable scales. Democratic countries have learned to walk a ‘narrow path’ between chaos on the one hand and authoritarianism on the other, avoiding the downsides that come from both extreme openness and extreme closure. AI could easily destabilise that present equilibrium, throwing us off dangerously in either direction. And ultimately, within our lifetimes humans may not need to work to live any more — or indeed, even have the option to do so.
And those are just three of the challenges confronting us. In Mustafa’s view, ‘misaligned’ AI that goes rogue and pursues its own agenda won’t be an issue for the next few years, and it isn’t a problem for the current style of large language models. But he thinks that at some point — in eight, ten, or twelve years — it will become an entirely legitimate concern, and says that we need to be planning ahead.
In The Coming Wave, Mustafa lays out a 10-part agenda for ‘containment’ — that is to say, for limiting the negative and unforeseen consequences of emerging technologies:
- Developing an Apollo programme for technical AI safety
- Instituting capability audits for AI models
- Buying time by exploiting hardware choke points
- Getting critics involved in directly engineering AI models
- Getting AI labs to be guided by motives other than profit
- Radically increasing governments’ understanding of AI and their capabilities to sensibly regulate it
- Creating international treaties to prevent proliferation of the most dangerous AI capabilities
- Building a self-critical culture in AI labs of openly accepting when the status quo isn’t working
- Creating a mass public movement that understands AI and can demand the necessary controls
- Not relying too much on delay, but instead seeking to move into a new somewhat-stable equilibrium
As Mustafa put it, "AI is a technology with almost every use case imaginable," and in time that will demand we rethink everything.
Rob and Mustafa discuss the above, as well as:
- Whether we should be open sourcing AI models
- Whether Mustafa’s policy views are consistent with his timelines for transformative AI
- How people with very different views on these issues get along at AI labs
- The failed efforts (so far) to get a wider range of people involved in these decisions
- Whether it’s dangerous for Mustafa’s new company to be training far larger models than GPT-4
- Whether we’ll be blown away by AI progress over the next year
- What mandatory regulations government should be imposing on AI labs right now
- Appropriate priorities for the UK’s upcoming AI safety summit
Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.
Producer and editor: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Milo McGuire
Transcriptions: Katy Moore
Highlights
How to get sceptics to take safety seriously
Mustafa Suleyman: The first part of the book mentions this idea of “pessimism aversion,” which is something that I’ve experienced my whole career; I’ve always felt like the weirdo in the corner who’s raising the alarm and saying, “Hold on a second, we have to be cautious.” Obviously lots of people listening to this podcast will probably be familiar with that, because we’re all a little bit more fringe. But certainly in Silicon Valley, that kind of thing… I get called a “decel” sometimes, which I actually had to look up. I guess it’s a play on me being an incel, which obviously I’m not, and some kind of decelerationist or Luddite or something — which is obviously also bananas, given what I’m actually doing with my company.
Rob Wiblin: It’s an extraordinary accusation.
Mustafa Suleyman: It’s funny, isn’t it? So people have this fear, particularly in the US, of pessimistic outlooks. I mean, the number of times people come to me like, “You seem to be quite pessimistic.” No, I just don’t think about things in this simplistic “Are you an optimist or are you a pessimist?” terrible framing. It’s BS. I’m neither. I’m just observing the facts as I see them, and I’m doing my best to share for critical public scrutiny what I see. If I’m wrong, rip it apart and let’s debate it — but let’s not lean into these biases either way.
So in terms of things that I found productive in these conversations: frankly, the national security people are much more sober, and the way to get their head around things is to talk about misuse. They see things in terms of bad actors, non-state actors, threats to the nation-state. In the book, I’ve really tried to frame this as implications for the nation-state and stability — because at one level, whether you’re progressive or otherwise, we care about the ongoing stability of our current order. We really don’t want to live in this Mad Maxian, hyper-libertarian, chaos post-nation-state world.
The nation-state, I think we can all agree that a shackled Leviathan does a good job of putting constraints on the chaotic emergence of bad power, and uses that to do redistribution in a way that keeps peace and prosperity going. So I think that there’s general alignment around that. And if you make clear that this has the potential to be misused, I think that’s effective.
What wasn’t effective, I can tell you, was the obsession with superintelligence. I honestly think that was a seismic distraction — if not a disservice — to the actual debate, when there were many more practical things to focus on. A lot of people who heard that in policy circles just thought, “Well, this is not for me. This is completely speculative. What do you mean, ‘recursive self-improvement’? What do you mean, ‘AGI superintelligence taking over’?” The number of people who have barely heard the phrase “AGI” but know about paperclips is just unbelievable. Completely nontechnical people would be like, “Yeah, I’ve heard about the paperclip thing. What, you think that’s likely?” Like, “Oh, geez, that is… Stop talking about paperclips!” So I think avoid that side of things: focus on misuse.
Is there a risk that Mustafa's company could speed up the race towards dangerous capabilities?
Rob Wiblin: On that general theme, a recurring question submitted by listeners was along these lines, basically: that you’re clearly alarmed about advances in AI capabilities in the book, and you’re worried that policy is lagging behind. And in the book you propose all kinds of different policies for containment, like auditing and using choke points to slow things down. And you say, and this is a literal quote, that we need to find “ways of buying time, slowing down, giving space for more work on the answers.”
But at the same time, your company is building one of the largest supercomputers in the world, and you think over the next 18 months you might do a language model training run that’s 10x or 100x larger than the one that produced GPT-4. Isn’t it possible that your own actions are helping to speed up the race towards dangerous capabilities that you wish were not going on?
Mustafa Suleyman: I don’t think that’s correct for a number of reasons. First, I think the primary threat to the stability of the nation-state is not the existence of these models themselves, or indeed the existence of these models with the capabilities that I mentioned. The primary threat to the nation-state is the proliferation of power. It’s the proliferation of power which is likely to cause catastrophe and chaos. Centralised power has a different threat — which is also equally bad and needs to be taken care of — which is authoritarianism and the misuse of that centralised power, which I care very deeply about. So that’s for sure.
But as we said earlier, I’m not in the AGI intelligence explosion camp that thinks that just by developing models with these capabilities, suddenly it gets out of the box, deceives us, persuades us to go and get access to more resources, gets to inadvertently update its own goals. I think this kind of anthropomorphism is the wrong metaphor. I think it is a distraction. So the training run in itself, I don’t think is dangerous at that scale. I really don’t.
And the second thing to think about is there are these overwhelming incentives which drive the creation of these models: these huge geopolitical incentives, the huge desire to research these things in open source, as we’ve just discussed. So the entire ecosystem of creation defaults to production. Me not participating certainly doesn’t reduce the likelihood that these models get developed. So I think the best thing that we can do is try to develop them and do so safely. And at the moment when we do need to step back from specific capabilities like the ones I mentioned — recursive self-improvement and autonomy — then I will. And we should.
And the fact that we’re at the table — for example, at the White House recently, signing up to the voluntary commitments, one of seven companies in the US signing up to those commitments — means that we’re able to shape the distribution of outcomes, to put the question of ethics and safety at the forefront in those kinds of discussions. So I think you get to shape the Overton window when it’s available to you, because you’re a participant and a player. And I think that’s true for everybody. I think everybody who is thinking about AI safety and is motivated by these concerns should be trying to operationalise their alignment intentions, their alignment goals. You have to actually make it in practice to prove that it’s possible, I think.
Open sourcing frontier ML models
Mustafa Suleyman: I think I’ve come out quite clearly pointing out the risks of large-scale access. I think I called it “naive open source in 20 years’ time.” So what that means is if we just continue to open source absolutely everything for every new generation of frontier models, then it’s quite likely that we’re going to see a rapid proliferation of power. These are state-like powers which enable small groups of actors, or maybe even individuals, to have an unprecedented one-to-many impact in the world.
Just as the last wave of social media enabled anybody to have broadcast powers, anybody to essentially function as an entire newspaper from the ’90s: by the 2000s, you could have millions of followers on Twitter or Instagram or whatever, and you’re really influencing the world — in a way that was previously the preserve of a publisher, that in most cases was licensed and regulated, that was an authority that could be held accountable if it really did something egregious. And all of that has now kind of fallen away — for good reasons, by the way, and in some cases with bad consequences.
We’re going to see the same trajectory with respect to access to the ability to influence the world. You can think of it as related to my Modern Turing Test that I proposed around artificial capable AI: like machines that go from being evaluated on the basis of what they say — you know, the imitation test of the original Turing test — to evaluating machines on the basis of what they can do. Can they use APIs? How persuasive are they of other humans? Can they interact with other AIs to get them to do things?
So if everybody gets that power, that starts to look like individuals having the power of organisations or even states. I’m talking about models that are two or three or maybe four orders of magnitude on from where we are. And we’re not far away from that. We’re going to be training models that are 1,000x larger than they currently are in the next three years. Even at Inflection, with the compute that we have, we will be training models that are 100x larger than the current frontier models in the next 18 months.
Although I took a lot of heat on the open source thing, I clearly wasn’t talking about today’s models: I was talking about future generations. And I still think it’s right, and I stand by that — because I think that if we don’t have that conversation, then we end up basically putting massively chaotic destabilising tools in the hands of absolutely everybody. As for how you do that in practice, somebody referred to it as like trying to catch rainwater, or trying to stop rain by catching it in your hands. Which I think is a very good rebuttal; it’s absolutely spot on: of course this is insanely hard. I’m not saying that it’s not difficult. I’m saying that it’s the conversation that we have to be having.
Voluntary vs mandatory commitments for AI labs
Rob Wiblin: In July, Inflection signed on to [eight voluntary commitments with the White House](https://www.whitehouse.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/), including things like committing to internal and external security testing, investing in cybersecurity and insider threat safeguards, and facilitating third-party discovery and reporting of vulnerabilities. Those are all voluntary, though. What commitments would you like to become legally mandatory for all major AI labs in the US and UK?
Mustafa Suleyman: That is a good question. I think some of those voluntary commitments should become legally mandated.
Number one would be scale audits: What size is your latest model?
Number two: There needs to be a framework for harmful model capabilities, like bioweapons coaching, nuclear weapons, chemical weapons, general bomb-making capabilities. Those things are pretty easy to document, and it just should not be possible to reduce the barriers to entry for people who don’t have specialist knowledge to go off and manufacture those things more easily.
The third one — that I have said publicly and that I care a lot about — is that we should just declare that these models shouldn’t be used for electioneering. They just shouldn’t be part of the political process. You shouldn’t be able to ask Pi who Pi would vote for, or what the difference is between these two candidates. Now, the counterargument is that many people will say that this might be able to provide useful and accurate and valuable information to educate people about elections, et cetera. Look, there is never going to be a perfect solution here: you have to take benefits away in order to avoid harms, and that’s always a tradeoff. You can’t have perfect benefits without any harms. That’s just a tradeoff. I would rather just take it all off the table and say that we —
Rob Wiblin: We can put some of it back later on, once we understand how to do it safely.
Mustafa Suleyman: That’s the best way. That is totally the best way. Now, obviously, a lot of people say that I’m super naive in claiming that this is possible because models like Stable Diffusion and Llama 2 are already out in open source, and people will certainly use that for electioneering. Again, this isn’t trying to resolve every single threat vector to our democracy, it’s just trying to say, at least the large-scale hyperscaler model providers — like Amazon, Microsoft, Google, and others — should just say, “This is against our terms of service.” So you’re just making it a little bit more difficult, and maybe even a little bit more taboo, if you don’t declare that your election materials are human-generated only.
No drama @ 2023-09-16T14:20 (+14)
Is there a risk that Mustafa's company could speed up the race towards dangerous capabilities?
Disheartening to hear a pretty weak answer to this critical question. Analysis of his answer:
First, I think the primary threat to the stability of the nation-state is not the existence of these models themselves, or indeed the existence of these models with the capabilities that I mentioned. The primary threat to the nation-state is the proliferation of power.
I'm really not sure what this means and surprised Rob didn't follow up on this. I think he must mean that they won't be open sourcing the weights, which is certainly good. However, it's unclear how much this matters if the model is available to call from an API. The argument may be that other actors can't fine-tune the model to remove guardrails, which they have put in place to make the model completely safe. I was impressed to hear his claim about jailbreaks later on:
It isn’t susceptible to any of the jailbreaks or prompt hacks, any of them. If anybody gets one, send it to me on Twitter.
Although strangely he also said:
it doesn’t generate code;
Which is trivial to disprove, so I'm not sure what he meant by that. Regardless, I think that providing API access to a model distributes a lot of the "power" of the model to everyone in the world.
I’m not in the AGI intelligence explosion camp that thinks that just by developing models with these capabilities, suddenly it gets out of the box, deceives us, persuades us to go and get access to more resources, gets to inadvertently update its own goals.
There hasn't ever been any very solid rebuttal of the intelligence explosion argument. It mostly gets dismissed on the basis of sounding like sci-fi. You can make a good argument that dangerous capabilities will emerge before we reach this point, and we may have a "slow take-off" in that sense. However, it seems to me that we should expect recursive self-improvement to happen eventually, because there is no fundamental reason why it isn't possible and it would clearly be useful for achieving any task. So the question is whether it will start before or after TAI. It's pretty clear that no one knows the answer to this question, so it's absurd to be gambling the future of humanity on this point.
Me not participating certainly doesn’t reduce the likelihood that these models get developed.
The AI race currently consists of a small handful of companies. A CEO who was actually trying to minimize the risk of extinction would at least attempt to coordinate a deceleration between these 4 or 5 actors before dismissing this as a hopeless tragedy of the commons.
Robert_Wiblin @ 2023-09-29T12:21 (+1)
"I'm really not sure what this means and surprised Rob didn't follow up on this."
Just the short time constraint. Sometimes I have to trust the audience to assess for themselves whether or not they find an answer convincing.