How to disclose a new x-risk?

By harsimony @ 2022-08-24T01:35 (+20)

Note: I'm not claiming that I know of a new x-risk; I just want to know the right policy in this situation.

If someone identifies a new existential or catastrophic risk, it seems prudent to avoid publishing it widely as this may constitute an infohazard.

However, it probably doesn't make sense to keep this information to oneself since other people can begin to work on research and mitigation if they are aware of the risk.

Is there a group of people to disclose new x-risks to that can make relevant experts aware of the risk? In general, how and where should someone disclose a new x-risk?

Joseph Bloom @ 2022-08-24T04:52 (+14)

Not a comprehensive answer but a few ideas. I don't know of any existing documentation or organisation about how to do this.

I think talking to people currently heavily involved in funding x-risk mitigation efforts is a good start. People with a proven track record of taking x-risks seriously are more likely to adequately consider the relevant concerns and assist by progressing the discussion and coming up with meaningful mitigation strategies. For example, you could email Nick Bostrom or someone at Open Philanthropy. I've heard Kevin Esvelt is someone with a track record of taking info-hazards seriously too.
Maybe don't go directly to super critical people in existing efforts. It's possible that you should qualify your ideas first by talking to other experts (who you trust) in whichever domain is likely to know about those risks (although of course you'd want to avoid losing control of the narrative, such as by someone you tell overzealously raising alarm and damaging your credibility).

There's probably lots of specific reasoning that might be necessary based on the relevant risk (for example if it's tied up with specific economic activity the way AI capabilities development is).

Linch @ 2022-08-24T06:14 (+9)

I endorse the suggestion to talk to someone senior at Open Phil. EA doesn't have a centralized decisionmaker, but Open Phil might be closest as a generally trusted group which is used to handling these issues.

harsimony @ 2022-08-24T17:10 (+1)

Ok, and any advice for reaching out to trusted-but-less-prestigious experts? It seems unlikely that reaching out to e.g. Kevin Esvelt will generate a response!

Linch @ 2022-08-24T23:59 (+5)

I think someone like Esvelt (and also Greg, who personally answered in the affirmative) will probably respond. Even if they are too busy to do a call, they'll know the appropriate junior-level people to triage things to.

cwbakerlee @ 2023-01-16T19:13 (+10)

To build on Linch's response here:
I work on the biosecurity & pandemic preparedness team at Open Philanthropy. Info hazard disclosure questions are often gnarly. I'm very happy to help troubleshoot these sorts of issues, including both general questions and more specific concerns. The best way to contact me, anonymously or non-anonymously, is through this short form. (Alternatively, you could reach my colleague Andrew Snyder-Beattie here.) Importantly, if you're reaching out, please do not include potentially sensitive details of info hazards in form submissions – if necessary, we can arrange more secure means of follow-up communication, anonymous or otherwise (e.g., a phone call).

Gregory_Lewis @ 2022-08-24T23:31 (+5)

The guiding principle I recommend is 'disclose in the manner which maximally advantages good actors over bad actors'. As you note, this usually will mean something between 'public broadcast' and 'keep it to yourself', and perhaps something in and around responsible disclosure in software engineering: try to get the message to those who can help mitigate the vulnerability without it leaking to those who might exploit it.

On how to actually do it, I mostly agree with Bloom's answer. One thing to add: although I can't speak for OP staff, Esvelt, etc., I'd expect that they, like me, would far rather have someone 'pester' them with a mistaken worry than see a significant concern get widely disseminated because someone was too nervous to reach out to them directly.

Speaking for myself: If something comes up where you think I would be worth talking to, please do get in touch so we can arrange a further conversation. I don't need to know (and I would recommend against including) particular details in the first instance.

(As perhaps goes without saying, at least for bio - and perhaps elsewhere - I strongly recommend against people trying to generate hazards, 'red teaming', etc.)

ofer @ 2022-08-24T15:00 (+3)

It's a very important question.

"However, it probably doesn't make sense to keep this information to oneself since other people can begin to work on research and mitigation if they are aware of the risk."

I don't think this is always the case. In anthropogenic x-risk domains, it can be very hard to decrease the chance of an existential catastrophe from a certain technology, and very easy to inadvertently increase it (by drawing attention to an info hazard). Even if the researchers (within EA) are very successful, their work can easily be ignored by the relevant actors in the name of competitiveness ("our for-profit public-benefit company takes the risk much more seriously than the competitors, so it's better if we race full speed ahead", "regulating companies in this field would make China get that technology first", etc.).

Phil Tanny @ 2022-08-25T08:50 (+1)

Generally speaking, I would suggest a shift of focus away from particular risks which arise from emerging technologies, and towards the machinery which is generating all such risks, an ever accelerating knowledge explosion.

It's natural to see a particular risk and wish to do something about it. But such a limited focus is not fully rational once we realize that it doesn't really matter if we remove one particular existential risk unless we can remove them all. As an example, if I knew how to make genetic engineering fully safe, why would that matter if we then go on to have a nuclear war?

It's a logic failure to assume, as seemingly almost all "experts" do, that we can continue to enthusiastically fuel an ever accelerating knowledge explosion and then somehow successfully manage every existential risk which emerges from that process, every day forever.

We're failing to grasp what the concept of acceleration actually means. It means that if the knowledge explosion is going at, say, 50mph today, tomorrow it will be 75mph, and then 150mph, and then 300mph etc. Sooner or later this accelerating process of power accumulation will exceed the human ability to manage. No one can predict exactly when or how we'll crash the system, but simple common sense logic demonstrates it will happen eventually on our current course.

The "experts" would have us focus on the details of particular emerging technological threats. The experts are wrong. What we need to be focused on instead is the knowledge explosion assembly line which is generating all the threats.

Emrik @ 2022-08-24T08:44 (+1)

The way I deal with info-hazards in general is that I balance the risks and gains of talking about it with specific people. I haven't wanted to talk to "EA seniors" unless I know them well enough to trust them. But I do talk to people, because it helps me grow my own understanding, and that might help me or them do something about it.

I don't think you know me well enough to trust me, but I'd be happy to hear about it and give feedback on the reasoning.