EA and AI Safety Schism: AGI, the last tech humans will (soon*) build
By Phib @ 2023-05-15T02:05 (+6)
Epistemic status
I probably do not know what I am talking about. This post is just trying to deal with some cognitive dissonance I am experiencing between my model of EA and my model of AI x-risk and safety, and perhaps the best way to resolve this, I figure, is to appeal to Cunningham’s Law by writing it up (and doing some writing).
I’m also rather strapped for time and effort these days, so sorry if this is awful, and sorry that I don’t really substantiate the arguments within; I’m still fairly confident in them, but I could imagine someone pointing out where I’m being dumb. I also really don’t do the hard work here of defining terms like “transformative AI” and just sort of leave a lot of things up in the air. Oh well, more for Cunningham.
Anyway, here are a couple other posts that seem relevant…
- My Most Likely Reason to Die Young is AI X-Risk
- Tom Davidson on how quickly AI could transform the world
Main argument
It seems to me that there exists a schism between those pursuing AI Safety and those pursuing other impact-motivated work in effective altruism: namely, over whether transformative AI timelines are rather short (~<20 years), and whether this is not just a question of whether or not we die, but also of whether or not everything changes (transforms). And this ‘everything changes’ includes the real potential for many of the problems we are focused on now to be either solved or transformed dramatically by having something superhuman take a look at them and the larger context. In other words, perhaps the best (most effective) way to solve the problem is to get something smarter than us to look at it, and to put our effort into aligning (and producing) such a system.
A good part of the EA community (those not involved with AI Safety, that is) seems to either disagree with or disregard the shortness of AI timelines (which have only become shorter) and the implications of this technology developing drastically in the near future.
What the AI safety community seems to be suggesting, when not concerned with alignment, is that something massively transformative will occur in the next 10 to 20 years. Meanwhile, I think most people outside of the AI safety community are working under the impression that their lives 20 years from now might look rather similar to their lives today. This seems very unlikely (and I want to note that 20 years is on the longer end of my probabilities). I also want to note that I’m not sure people and some orgs are making informed decisions, because they don’t seem to be truly weighing this probability; perhaps this is because of accelerating timelines or the weirdness of the thing, but it doesn’t seem like everyone is updating their beliefs in line with AI development.
If someone provides me with a model of the future which does not significantly take AGI into account, then I assume it’s because they’re neglecting the technology, its potential, and how soon it could come about. I can’t help but think their model is flawed. Who knows though, maybe I am just buying into the hype and have every reason to believe it, and maybe the technology won’t be transformative at all. Hell, consider me crazy, but as someone in their 20s I’m reconsidering the need to even save for retirement.
I do want to reiterate that it does seem very valuable to be pursuing the most moral life that we can right now, and there seem to be other moral models that would prioritize other things, like not doing harm to others ourselves, hedging our bets, optics, etc. But I’m also under the impression that within the next couple of decades, we either solve alignment and have a grand time, or we get screwed, and the possibility of neither occurring seems small (though that’s what we’re otherwise hedging on).
My background [is nontechnical and I didn’t care for AI Safety at first…]
I started off my EA "life" in college (~2019?), when I was introduced to it by some friends who then ran an intro fellowship. They mentioned AI Safety, but I wrote it off (as a personal path to being most impactful) for whatever reason, maybe because they were technical and I'm nontechnical. I became a fellowship facilitator and helped out with the uni group, graduated, contracted for some EA orgs, and have found a full-time role doing some AI Safety ops work. This job and the recent acceleration of AI have made me go basically 'all in' on AI Safety being the most impactful thing to work on. I’m less sure of other x-risks, but this seems to be the field which is accelerating and changing the most by far, and even acting as a (the most?) significant risk factor for other x-risks.
I think there are other considerations to be made here and I really don’t mean to be unreasonable, but I’m sensing a schism growing between anything AI and anything else. Also, I don’t speak as a representative of the AI Safety community, I speak as someone who used to neglect it.
Cognitive Dissonance
This schism also presents itself to me as a personal cognitive dissonance between AI Safety premises and other impact-motivated work. Once you accept that this technology is incredibly transformative, and coming soon, and must be differentially influenced now, you also accept that the world will be very different soon, in ways that may end up precluding a lot of other efforts.
Two examples of cog diss:
- Again, I consider any model of the future that does not even mention AGI to be flawed.
- A personal example: a coworker mentioned to me how weird it’ll be for people just born now to have lived their lives entirely online, like wow, how embarrassing to have their entire lives online. I think back, damn, that could be the least of it, if we’re alive!
The last tech we'll build
This kind of transformative AI (roughly: better than humans at everything, including AI research and improving itself) either kills us all or solves all problems.
From a neartermist perspective, it seems nothing matters except AI.
AI isn’t the only thing that matters, but it does seem to be by far the most influential thing for everyone in my lifetime. Damn, that precipice is sneaking up. I used to have some conception of a ~1-in-6 chance of existential catastrophe across this century; I’m still uncertain about the quantity of risk (enough to really demand attention anyway), but the timing of it seems rather soon (before 2050). And then I imagine that to deal with x-risks in the future, with human capabilities increasing across the board (more people with more access to infohazards, to AI capabilities and enhancements), it seems like you really do need some sort of powerful AI moderator of harmful actions.
(soon*) and ~timelines~
Just noting that predicting the future is hard, though most seem to point to something like AGI occurring within 20 years (i.e. I’d be surprised if it did not).
I can imagine a counterargument that convinces me in an individual case: someone is just focused on this year, on what they can actually feasibly make some impact on, and they think these things are way overhyped… but it doesn’t seem to me that this should apply to a lot of people…
I guess one takeaway here is again that risks from AI, much like one of the posts referenced above argues, should not even be considered under the category of ‘longtermism’. Idk if they ever should have been.
Call to Action: [will the real cause agnostics please stand up?]
I guess the CTA or even the theory of impact of this post here is something like this:
- Have more really intelligent, caring, aligned, and impactful people care about this (that’s you, you brilliant individual)
- (Potentially have someone figure out better how others can more easily and effectively contribute… and scale efforts)
- (also potentially have someone absolutely roast me in the comments)
- Sway some decision-makers and thought leaders (and people with money) to think more about this sort of stuff.
I guess I am wary of adding more here… it’s unclear how people can additionally contribute, and making a call to action without even providing something to do feels off. There are also strategies that suggest holding off on spending a bunch of money until a crunch period a couple of years before AGI exists. So maybe this sort of push is too early, and we should be strategically not worrying people until there is a bunch of low-hanging fruit for everyone to work on? Idk, feeling loosey-goosey here.
Counterarguments
- Premises are wrong, e.g. AI will not develop quickly enough to be transformative within 20 years
  - In the end, yes, it is very uncertain what the future holds, and yes, a bunch of AI Safety people thinking that AI is risky is also somewhat… selective
  - Appealing to history (thinking of AI winters), we end up with a quite different future, but superintelligence/singularity/The Long Reflection™ futures do not occur within the next hundred years
- Premises are wrong, AI will not be dangerous
  - Then commit fully to building AGI! It’d be an absolutely fantastic (superhuman) problem-solver.
  - Or just continue to vibe out, if we are suddenly contributing negatively to AI safety
    - This might actually be the danger of a post like this: maybe it just further contributes to hype and maybe even draws more actors into the space. Sigh, coordination seems hard.
- If safe AGI development demands that we stop developing AI, then it may make sense to continue the status quo, including for EA work (rather than the meta-work I seem to be encouraging, of building something which can solve problems for you).
Milan Weibel @ 2023-05-15T03:09 (+9)
Side note: calling a world modelling disagreement implied by differences in cause prioritisation a "schism" is, in my opinion, unwarranted, and risks (with low probability, but very negative value) becoming a self-fulfilling prophecy.
Milan Weibel @ 2023-05-15T15:13 (+1)
Just for the sake of clarity: I think the word "schism" is inaccurate here because it carries false connotations of conflict.
Phib @ 2023-05-15T03:13 (+1)
I think this is a fair point, thanks for making it. I certainly overgeneralize at times here: I believe I’ve experienced moments that indicate such a schism, but not enough to just label it as such in a public post. Idk!
Phib @ 2023-06-19T04:00 (+3)
FWIW I think this post: https://forum.effectivealtruism.org/posts/J4cLuxvAwnKNQxwxj/how-does-ai-progress-affect-other-ea-cause-areas is a way better version of what I was trying to get at here, and MacAskill's answer is pretty good.
Milan Weibel @ 2023-05-15T03:06 (+2)
A more pessimistic counterargument: Safely developing AGI is so hard as to be practically impossible. I do not believe this one, but some pessimistic sectors within AIS do. It combines well with the last counterargument you list (that the timelines where things turn out OK are all ones where we stop / radically slow down the development of AI capabilities). If you are confident that aligning AGI is for all practical purposes impossible, then you focus on preventing the creation of AGI and on improving the future of the timelines where AGI has been successfully avoided.
Phib @ 2023-05-15T03:10 (+2)
Agreed! I think Geoffrey Miller makes this point rather excellently here: