Deep atheism and AI risk

By Joe_Carlsmith @ 2024-01-04T18:58 (+64)

This is a crosspost, probably from LessWrong. Try viewing it there.

SummaryBot @ 2024-01-05T13:16 (+2)

Executive summary: Yudkowsky's "deep atheism" rejects comforting myths about the fundamental goodness or benevolence of reality. This stems from a combination of shallow atheism, Bayesian epistemology valuing evidence over wishful thinking, and viewing indifference as the natural prior for reality's orientation toward human values.

Key points:

  1. "Deep atheism" goes beyond rejecting theism to distrust myths that reality is fundamentally good, including trusting institutions, traditions, and intelligence alone to produce human flourishing.
  2. It combines shallow atheism with Bayesian epistemology, which requires evidence over wishful thinking, and views indifference as the natural prior for whether reality matches human values.
  3. Deep atheism sees intelligence as indifferent and values as contingent: reality itself doesn't care. But human hearts were formed inside reality and contain seeds of goodness, which intelligence can serve.
  4. However, future AI may lack connection to human values, threatening their realization. Yudkowsky thus fights for "humanism" and shaping the future via human-derived goals.
  5. This perspective resonates with a felt sense of life's cruelty, resists myths offering cheap comfort, and compels vigilance, but risks losing the spiritual consolations theism provides.
  6. It rejects moral realism's attempts to derive values from extra-natural reason as more wishful thinking, insisting on facing reality with disillusioned courage.

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

tobyj @ 2024-01-09T15:08 (+1)

I really enjoyed this and found it clarifying. I like the term "deep atheism" a lot. I'd been referring to the thing you're describing as nihilism, but this is a much better framing.

Michele Campolo @ 2024-01-08T11:24 (+1)

Hey! I've had a look at some parts of this post. I don't know where the sequence is going exactly, but I thought you might be interested in some parts of a post I've written. Below I give some info on how it relates to ideas you've touched on:

This view has the advantage, for philosophers, of making no empirical predictions (for example, about the degree to which different rational agents will converge in their moral views)

I am not sure about the views of the average non-naturalist realist, but in my post (under "Moral realism and anti-realism", in the appendix) I link three different pieces that analyse the relation between metaethics and AI: some people do seem to think that aspects of ethics and/or metaethics can affect the behaviour of AI systems.

It is also possible that the border between naturalism and non-naturalism is less neat than it appears in the standard metaethics literature, which likes to classify views into well-separated buckets.

Soon enough, our AIs are going to get "Reason," and they're going to start saying stuff like this on their own – no need for RLHF. They'll stop winning at Go, predicting next-tokens, or pursuing whatever weird, not-understood goals that gradient descent shaped inside them, and they'll turn, unprompted, towards the Good. Right?

I argue in my post that this heavily depends on agent design and internal structure. As I understand things, one way we can get a moral agent is by building an AI that has a bunch of (possibly many) human biases and is guided by design towards figuring out epistemology and ethics on its own. Some EAs, and rationalists in particular, might be underestimating how easy it is to get an AI that dislikes suffering if one follows this approach.

If you know someone who would like to work on the same ideas, or someone who would like to fund research on these ideas, please let me know! I'm looking for them :)

Arsalaan Alam @ 2024-01-06T19:21 (+1)

A very good read. From the perspective of AGI, could such a view be extended: if an AI reasons, will it come to believe in theism or not? If so, will it bend towards the good and stop its overarching pursuit, or is there a chance it could rebel, like demons?