OpenAI’s new Preparedness team is hiring

By leopold @ 2023-10-26T20:41 (+85)

Hey all, wanted to share what some colleagues at OpenAI are up to: the new Preparedness team has been publicly announced, and they’re hiring!

This team is going to be doing incredibly important work:

They’ll be the main team doing evals, forecasting, and risk assessment for catastrophic risk.
They’ll be coordinating AGI preparedness (figuring out what protective measures we need, etc.).
They’re in charge of developing and maintaining OpenAI’s RDP (our version of an RSP).

I think this will be one of the most important teams at OpenAI for mitigating AGI risk. The team is led by Aleksander Madry, who is great, and the early team members Tejal and Kevin are awesome.

I think it would be enormously impactful if they can continue to hire people who are really excellent + really get AGI risk. Please seriously consider applying, and spread the word to friends who you think could be a great fit!

Geoffrey Miller @ 2023-10-26T20:58 (+137)

leopold - my key question here would be, if the OpenAI Preparedness team concluded in a year or two that the best way to mitigate AGI risk would be for OpenAI to simply stop doing AGI research, would anyone in OpenAI senior management actually listen to them, and stop doing AGI research?

If not, this could end up being just another example of corporate 'safety-washing', where the company has already decided what they're actually going to do, and the safety team is just along for the ride.

I'd value your candid view on this; I can't actually tell if there are any conditions under which OpenAI would decide that what they've been doing is reckless and evil, and they should just stop.

Geoffrey Miller @ 2023-10-29T20:27 (+32)

Still waiting for an answer to this key question.... disappointed not to get one yet.

Sharmake @ 2023-10-30T19:00 (+5)

Yeah, I'd really like to know how they'd respond to information saying that they'd have to stop doing something, like accelerating AI progress, when stopping would go against their incentives.

I don't think it's very likely, but given the incentives at play, it really matters that the organization is actually able to at least seriously consider the possibility that the solution to AI safety might be something that they aren't incentivized to do, or are actively disincentivized from doing.

Geoffrey Miller @ 2023-10-30T19:38 (+4)

Sharmake -- this is also my concern. But it's even worse than this.

Even if OpenAI workers think that their financial, status, & prestige incentives would make it impossible to slow down their mad quest for AGI, it shouldn't matter, if they take the extinction risks seriously. What good would it do for the OpenAI leaders and devs to make a few extra tens of millions of dollars each, and to get professional kudos for creating AGI, if the result of their hubris is total devastation to our civilization and species?

Either they take the extinction risks seriously, or they don't. If they do, then there are no merely financial or professional incentives that could rationally over-ride the extinction risks.

My conclusion is that they say they take the extinction risks seriously, but they're lying, or they're profoundly self-deceived. In any case, their revealed preferences are that they prefer a little extra money, power, and status for themselves over a lot of extra safety for everybody else -- and for themselves.

Sharmake @ 2023-10-30T22:15 (+11)

I want to flag that I see quite a lot of inappropriate binarization happening, and I generally see quite a lot of dismissals of valid third options.

"Either they take the extinction risks seriously, or they don't."

There are other important possibilities, like a potential belief in AI progress helping or solving the existential risk, thinking that the intervention of increasing AI progress is actually the best strategy, etc. More generally, once we make weaker or no assumptions about AI risk, we no longer obtain the binary you've suggested.

So this doesn't really work, because it basically requires us to assume the conclusion, especially for near-term people.

"My conclusion is that they say they take the extinction risks seriously, but they're lying, or they're profoundly self-deceived. In any case, their revealed preferences are that they prefer a little extra money, power, and status for themselves over a lot of extra safety for everybody else -- and for themselves."

Geoffrey Miller @ 2023-10-31T01:40 (+4)

Sharmake -- in most contexts, your point would be valid, and inappropriate binarization would be a bad thing.

But when it comes to AI X-risk, I don't see any functional difference between dismissing AI X risks, and thinking that AI progress will help solve (other?) X risks, or thinking that increasing AI progress will somehow reduce AI X risks. Those 'third options' just seem like they fall into the overall category of 'not taking AI X risk seriously, at all'.

For example, if people think AI progress will somehow reduce AI X risk, that boils down to thinking that 'the closer we get to the precipice, the better we'll be able to avoid the precipice'.

If people think AI progress will somehow reduce other X risks, I'd want a realistic analysis of what those other alleged X risks really are, and how exactly AI progress would help. In practice, in almost every blog, post, and comment I've seen, this boils down to the vague claim that 'AI could help us solve climate change'. But very few serious climate scientists think that climate change is a literal X risk that could kill every living human.

Jáchym Fibír @ 2023-11-01T09:13 (+2)

I just want to remind everyone that simply having some of the company budget allocated to pay people to think about and study the potential impacts of the technology the company is developing is in itself a good thing.

About the possibility that they would come to the conclusion that the most rational thing would be to stop the development - I think the concern here is moot because of the many-player dilemma in the AI space (if one stops, the others don't have to), which is (I think) impossible to solve from inside any single company anyway.

Geoffrey Miller @ 2023-11-02T21:35 (+5)

Having some of the OpenAI company budget allocated to 'AI safety' could just be safety-washing -- essentially, part of the OpenAI PR/marketing budget, rather than an actual safety effort.

If the safety people don't actually have any power to slow or stop the rush towards AGI, I don't see their utility.

As for the arms race dilemma, imagine if OpenAI announced one day 'Oh no, we've made a horrible mistake; AGI would be way too risky; we are stopping all AGI-related research to protect humanity; here's how to audit us to make sure we follow through on this promise'. I think the other major players in the AI space would be under considerable pressure from investors, employees, media, politicians, and the public to also stop their AGI research.

It's just not that hard to coordinate on the 'no-AGI-research' focal point if enough serious people decide to do so, and there's enough public support.

Nick K. @ 2023-10-28T10:59 (+5)

"Nobody is on the ball on AGI governance"?

Minh Nguyen @ 2023-10-27T13:11 (+5)

There's a typo!

Outline an experiment plan to (ethically and legally) measure the true feasibility and potential severity of the misuse scenario you described above assuing you have a broad range of resources at your disposal, including an ability to perform human-AI evaluations. *

Hey, at least we know it was written by a human!

calebp @ 2023-10-27T19:58 (+4)

At the time of writing, the team is looking to hire for two roles (though presumably multiple people in each role).

National Security Threat Researcher: San Francisco, California, United States (Preparedness)
Research Engineer, Preparedness: San Francisco, California, United States (Preparedness)

leopold @ 2023-10-27T20:44 (+2)

Yep—and in particular, they are looking to hire people who do well on their Preparedness challenge: https://openai.com/form/preparedness-challenge. So if you're interested, try that out!

aaron_mai @ 2023-10-28T00:01 (+3)

This link works for me:

https://openai.com/form/preparedness-challenge

(Just without the period at the end)