How to find *reliable* ways to improve the future?
By Sjlver @ 2022-08-18T12:47 (+53)
I hear two conflicting voices in my head, and in EA:
- Voice: it's highly uncertain whether deworming is effective, even after 20 years of research, randomized controlled trials, and lots of feedback. In fact, many development interventions have a small or negative impact.
- Same voice: we are confident that work for improving the far future is effective, based on <insert argument involving the number of stars in the universe>.
I believe that I could become convinced to work on artificial intelligence or extinction risk reduction. My main crux is that these problems seem intractable. I am worried that my work would have a negligible or a negative impact.
These questions are not sufficiently addressed yet, in my opinion. So far, I've seen mainly vague recommendations (e.g., "community building work does not increase risks" or "look at the success of nuclear disarmament"). Examples of existing work for improving the far future often feel very indirect (e.g., "build a tool to better estimate probabilities ⇒ make better decisions ⇒ facilitate better coordination ⇒ reduce the likelihood of conflict ⇒ prevent a global war ⇒ avoid extinction") and thus disconnected from actual benefits for humanity.
One could argue that uncertainty is not a problem, that it is negligible when considering the huge potential benefit of work for the far future. Moreover, impact is fat-tailed, so the expected value is dominated by a few really impactful projects, and it's worth trying projects even if they have a low probability of success.[1] This makes sense, but only if we can protect against large negative impacts. I doubt we really can. For example, a case can be made that even safety-focused AI researchers accelerate AI and thus increase its risks.[2]
One could argue that community building or writing "What We Owe the Future" are concrete ways to do good for the future. Yet this seems to shift the problem rather than solve it. Consider a community builder who convinces 100 people to work on improving the far future. There are now 100 people doing work with uncertain, possibly-negative impact. The community builder's impact is some function of those 100 people's impacts, and it is similarly uncertain and possibly negative. This is especially true if the individual impacts are fat-tailed, as the total will be dominated by the most successful (or most destructive) people.
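As a rough illustration of this fat-tail point, here is a toy Monte Carlo sketch in Python (my own illustration; the lognormal impact distribution, the 100-person group size, and the 5% backfire probability are made-up assumptions, not estimates of any real intervention):

```python
# Toy simulation of the fat-tailed-impact argument above.
# All parameters (lognormal impacts, 5% backfire chance, sigma=3) are
# illustrative assumptions, not estimates of any real intervention.
import numpy as np

rng = np.random.default_rng(0)
n_worlds, n_people = 10_000, 100  # e.g. 100 people recruited by a community builder
p_backfire = 0.05                 # assumed chance that a person's work is net harmful
sigma = 3.0                       # larger sigma = heavier tails

# Fat-tailed impact magnitudes; a few draws dominate each world's total.
magnitudes = rng.lognormal(mean=0.0, sigma=sigma, size=(n_worlds, n_people))
signs = np.where(rng.random((n_worlds, n_people)) < p_backfire, -1.0, 1.0)
totals = (signs * magnitudes).sum(axis=1)

# How concentrated is the impact, and how often is the portfolio net negative?
top5_share = np.sort(magnitudes, axis=1)[:, -5:].sum(axis=1) / magnitudes.sum(axis=1)
print(f"Median share of total impact from the top 5 of {n_people} people: "
      f"{np.median(top5_share):.0%}")
print(f"Simulated worlds with net-negative total impact: {(totals < 0).mean():.1%}")
```

The specific numbers don't matter; the point is that with fat tails, both the upside and the downside of a portfolio are driven by a handful of extreme outcomes, which is exactly where the sign of the impact is hardest to pin down.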
To summarize: How can we reliably improve the far future, given that even near-termist work like deworming, with plenty of data, research, rapid feedback loops, and simple theories, so often fails? As someone who is eager to spend my work time well, who thinks that our moral circle should include the future, but who does not know ways to reliably improve it... what should I do?
Will MacAskill on fat-tailed impact distribution: https://youtu.be/olX_5WSnBwk?t=695 ↩︎
For examples on this forum, see When is AI safety research harmful? or What harm could AI safety do? ↩︎
Konstantin Pilz @ 2022-08-18T15:49 (+13)
Some ideas for career paths that I think have a very low chance of terrible outcomes and a reasonable chance to do a ton of good for the long-term future (I'm not claiming that they will definitely be net positive; I'm claiming they are more than 10x more likely to be net positive than net negative):
- Developing early warning systems for future pandemics (and related work) (technical bio work)
- Strengthening the bioweapons convention and building better enforcement mechanisms (bio policy)
- Predicting how fast powerful AI is going to be developed to get strategic clarity (AI strategy)
- Developing theories of how to align AI and reasoning about how they could fail (AI alignment research)
- Building institutions that are ready to govern AI effectively once it starts being transformative (AI governance)
Besides these, I think that almost all the work longtermists do today has positive expected value, even if it has large downsides. Your comparison to deworming isn't perfect: failed deworming does not cause direct harm, and it is still better to give money to ineffective deworming than to do nothing.
Sjlver @ 2022-08-18T18:07 (+5)
This is valuable, thank you. I really like the point on early warning systems for pandemics.
Regarding the bioweapons convention, I intuitively agree. I do have some concerns about how it could tip power balances (akin to how abortion bans tend to increase illegal abortions and put women at risk, but that's a weak analogy). There is also a historical example of how the Geneva Disarmament Conference inspired Japan's bioweapons program.
Predicting how fast powerful AI is going to be developed: That one seems value-neutral to me. It could help regular AI as much as AI safety. Why do you think it's 10x more likely to be beneficial?
AI alignment research and AI governance: I would like to agree with you, and part of me does... I've outlined my hesitations in the comment below.
Konstantin Pilz @ 2022-08-18T20:50 (+2)
Re: bioweapons convention: Good point, so maybe not as straightforward as I described.
Re: predicting AI: You can always not publish the research you are doing, or only inform safety-focused institutions about it. I agree that there are some possible downsides to knowing more precisely when AI will be developed, but there seem to be much worse downsides to not knowing when AI will be developed (mainly that nobody is preparing for it policy- and coordination-wise).
I think the biggest risk is getting governments too excited about AI. So I'm actually not super confident that any work on this is 10x more likely to be positive.
Re: policy & alignment: I'm very confident, that there is some form of alignment work that is not speeding up capabilities, especially the more abstract one. Though I agree on interpretability. On policy, I would also be surprised if every avenue of governance was as risky as you describe. Especially laying out big picture strategies and monitoring AI development seem pretty low-risk.
Overall, I think you have done a good job scrutinizing my claims, and I'm much less confident now. Still, I'd be really surprised if every type of longtermist work were as risky as your examples, especially for someone as safety-conscious as you are. (Actually, one very positive thing might be criticizing different approaches and showing their downsides.)
Sjlver @ 2022-08-19T15:34 (+1)
Thanks a lot for your responses!
I share your sentiment: there must be some form of alignment work that is not speeding up capabilities, some form of longtermist work that isn't risky... right?
Why are the examples so elusive? I think this is the core of the present forum post.
Fifteen years ago, when GiveWell started, the search for good interventions was difficult. It took a lot of research, trials, and reasoning to arrive at the current recommendations. We are at a similar point for work targeting the far future... except that we can't do experiments, don't have feedback, don't have historical examples[1], etc. This makes the question a much harder one. It also means that "do research on good interventions" isn't a good answer either, since this research is itself so intractable.
Ian Morris discusses in this podcast episode to what degree history is contingent, i.e., to what degree past events shape the future for a long time. ↩︎
Sjlver @ 2022-08-18T18:11 (+3)
Failed deworming does not cause direct harm, and it is still better to give money to ineffective deworming than to do nothing.
Apologies in advance for being nitpicky. But you could consider the counterfactual where the money would instead go to another effective charity. A similar point holds for AI safety outreach: it may cause people to switch careers and move away from other promising areas, or cause people to stop earning to give.
Linch @ 2022-08-19T21:36 (+4)
Apologies in advance for being nitpicky. But you could consider the counterfactual where the money would instead go to another effective charity. A similar point holds for AI safety outreach: it may cause people to switch careers and move away from other promising areas, or cause people to stop earning to give.
Sorry, but if your bar for "reliable good" entails being clearly better than counterfactuals with high confidence, then afaict literally nothing in EA clears that bar. Certainly none of the other GiveWell charities clear this bar.
Sjlver @ 2022-08-20T09:47 (+4)
I don't mean to set an unreasonably high bar. Sorry if my comment came across that way.
It's important to use the right counterfactual because work for the long-term future competes with GiveWell-style charities. This is clearly the message of 80000hours.org, for example. After all, we want to do the most good we can, and it's not enough to do better than zero.
Linch @ 2022-08-21T02:00 (+2)
It's important to use the right counterfactual because work for the long-term future competes with GiveWell-style charities
I'm probably confused about what you're saying, but how is this different from saying that work on GiveWell-style charities competes with the long-term future, and also that donations to GiveWell-style charities compete with each other?
Denkenberger @ 2022-08-28T22:27 (+9)
I think resilience to global catastrophes is often a reliable way of improving the long-term future. This is touched on in the paper Defence in Depth. Pandemic resilience could include preparation to scale up vaccines and PPE quickly. And I think resilience to climate tail risks and nuclear war makes sense as well.
Linch @ 2022-08-19T21:37 (+4)
I think there aren't reliable things that a) are robustly good for the long-term future under a wide set of plausible assumptions, b) are highly legibly so, c) are easy to talk about in public, d) are within 2 OOMs of the cost-effectiveness of the best interventions by our current best guesses, and e) aren't already being done.
I think your question implies that a) is the crux, and I do have a lot of sympathy for that view. But the difficulty of generating answers to your question is at least partially due to the expectations of b)-e) being baked in as well.
Sjlver @ 2022-08-20T10:03 (+1)
Thank you. This is valuable to hear.
Maybe my post simplified things too much, but I'm actually quite open to learning about possibilities for improving the long-term future, even those that are hard to understand or difficult to talk about. I sympathize with longtermism, but I can't shake off the feeling that epistemic uncertainty is an underrated objection.
When it comes to your linked question about how near-termist interventions affect the far future, I sympathize with Arepo's answer: I think the effect of many such actions decays towards zero fairly quickly. This is potentially different for actions that explicitly try to affect the long term, such as many types of AI work. That's why I would like high confidence in the sign of such an action's impact. Is that too strong a demand?
Phil Tanny @ 2022-08-27T12:32 (+1)
- Strengthening the bioweapons convention and building better enforcement mechanisms (bio policy)
In the event of a war where bioweapons are involved, it will be a knife-fight-in-an-alley situation, and all inconvenient conventions, treaties, policies, U.N. proclamations, etc. will be ignored. Such devices are MAYBE useful in those situations where the major powers have leverage over the small powers.
The world has been largely united in resisting the development of nuclear weapons in North Korea. The North Koreans don't care.
Phil Tanny @ 2022-08-27T12:27 (+1)
As someone who is eager to spend my work time well, who thinks that our moral circle should include the future, but who does not know ways to reliably improve it... what should I do?
Focus on what you can do to help now, while you consider this further in the background? If all humans, present and future, are equal, then present humans are as good a target as future humans, and much, much more accessible.
Maybe try to de-abstract helping, and make it more tangible and real in your personal experience? Maybe the old lady across the street needs help bringing in her groceries. So you start there, and follow the breadcrumbs wherever they lead.
Something simple like this can be a good experiment. Whether you find that you don't really want to help the old lady who is right in front of you, or that you do, the experience might help you develop additional clarity regarding your relationship with future humans.
Sjlver @ 2022-08-29T12:29 (+6)
Thanks!
It's clear to me that I want to help people. I think my problem isn't that help is abstract. My current work is in global health, and it's a great joy to be able to observe the positive effects of that work.
My question is about what would be the best use of my time and work. I consider the possibility that this work should target improving the far future, but that kind of work seems intractable, indirect, conditional on many assumptions, etc. I'd appreciate good pointers to concrete avenues for improving the future that don't suffer from these problems. Helping old ladies and introspection probably won't help me with that.
Konstantin Pilz @ 2022-08-18T15:56 (+1)
Note that even if alignment research sometimes speeds up AI development, most AI safety work still makes alignment more likely overall. So I agree that there are downsides here, but it seems really wild to think that it would be better not to do any alignment research at all.
Sjlver @ 2022-08-18T17:55 (+4)
Several people whom I respect hold the view that AI safety work might be dangerous. For example, here's Alexander Berger tweeting about it.
A brief list of potential risks:
- Conflicts of interest: Much AI safety work is done by companies that develop AI. Max Tegmark makes this analogy: What would we think if a large part of climate change research were done by oil companies, or a large part of lung cancer research by tobacco companies? This situation probably makes AI safety research weaker. There is also the risk that it improves the reputation of AI companies, so that their non-safety work can advance faster and more boldly. And it means safety is delegated to a subteam rather than being everyone's responsibility (different from, say, information security).
- Speeding up AI: Even well-meaning safety work likely speeds up the overall development of AI. For example, interpretability seems really promising for safety, but at the same time it is a quasi-necessary condition to deploy a powerful AI system. If you look at (for example) the recent papers from anthropic.com, you will find many techniques that are generally useful to build AIs.
- Information hazard: I admire work like the Slaughterbots video from the Future of Life Institute. Yet it has clear infohazard potential. Similarly, Michael Nielsen writes "Afaict talking a lot about AI risk has clearly increased it quite a bit (many of the most talented people I know working on actual AI were influenced to by Bostrom.)"
- Other failure modes mentioned by MichaelStJules:
  - creating a false sense of security,
  - publishing the results of the GPT models, demonstrating AI capabilities and showing the world how much further we can already push it, and therefore accelerating AI development, or
  - slowing AI development more in countries that care more about safety than those that don't care much, risking a much worse AGI takeover if it matters who builds it first.