“Intro to brain-like-AGI safety” series—just finished!

By Steven Byrnes @ 2022-05-17T15:35 (+15)

(Previously on EAF: “Intro to brain-like-AGI safety” series—halfway point!)

For those who aren't regular readers of the Alignment Forum or LessWrong, I’ve been writing a 15-part post series, “Intro to Brain-Like-AGI Safety”. And the final post is now up! 🥳🎉🎊

Some key claims that I argue for in the series

(copied from the final post)

  1. We know enough neuroscience to say concrete things about what “brain-like AGI” would look like (Posts #1–#9);
  2. In particular, while “brain-like AGI” would be different from any known algorithm, its safety-relevant aspects would have much in common with actor-critic model-based reinforcement learning with a multi-dimensional value function (Posts #6, #8, #9; see the toy sketch after this list);
  3. “Understanding the brain well enough to make brain-like AGI” is a dramatically easier task than “understanding the brain” full stop—if the former is loosely analogous to knowing how to train a ConvNet, then the latter would be loosely analogous to knowing how to train a ConvNet, and achieving full mechanistic interpretability of the resulting trained model, and understanding every aspect of integrated circuit physics and engineering, etc. Indeed, making brain-like AGI should not be thought of as a far-off sci-fi hypothetical, but rather as an ongoing project which may well reach completion within the next decade or two (Posts #2–#3);
  4. In the absence of a good technical plan for avoiding accidents, researchers experimenting with brain-like AGI algorithms will probably accidentally create out-of-control AGIs, with catastrophic consequences up to and including human extinction (Posts #1, #3, #10, #11);
  5. Right now, we don’t have any good technical plan for avoiding out-of-control AGI accidents (Posts #10–#14);
  6. Creating such a plan seems neither to be straightforward, nor to be a necessary step on the path to creating powerful brain-like AGIs—and therefore we shouldn’t assume that such a plan will be created in the future “by default” (Post #3);
  7. There’s a lot of work that we can do right now to help make progress towards such a plan (Posts #12–#15).
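
To make claim 2 a bit more concrete, here is a toy, purely illustrative sketch of a tabular actor-critic learner whose critic outputs a *vector* of values (one component per reward dimension) rather than a single scalar. This is not code from the series: it leaves out the model-based/planning side entirely, and the little chain environment, the reward dimensions ("food", "safety"), and the fixed mixing weights are all invented for illustration.

```python
import numpy as np

# Toy tabular actor-critic with a *vector-valued* critic.
# Everything here (the 5-state chain, the two reward dimensions,
# the fixed mixing weights) is made up for illustration only.

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, N_DIMS = 5, 2, 2     # 2 reward dimensions, e.g. "food" & "safety"
GAMMA, ALPHA_V, ALPHA_PI = 0.9, 0.1, 0.05
MIX = np.array([1.0, 0.5])                # how the dimensions combine into one scalar signal

V = np.zeros((N_STATES, N_DIMS))          # multi-dimensional value function (critic)
logits = np.zeros((N_STATES, N_ACTIONS))  # policy parameters (actor)

def step(state, action):
    """Move left/right on a chain; return the next state and a reward *vector*."""
    nxt = np.clip(state + (1 if action == 1 else -1), 0, N_STATES - 1)
    reward = np.array([1.0 if nxt == N_STATES - 1 else 0.0,   # "food" at the right end
                       1.0 if nxt == 0 else 0.0])             # "safety" at the left end
    return nxt, reward

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for episode in range(500):
    s = 2                                 # start in the middle of the chain
    for t in range(20):
        probs = softmax(logits[s])
        a = rng.choice(N_ACTIONS, p=probs)
        s_next, r_vec = step(s, a)

        # The TD error is a vector: one component per reward dimension.
        td_vec = r_vec + GAMMA * V[s_next] - V[s]
        V[s] += ALPHA_V * td_vec

        # The actor is updated from a single scalar signal: a weighted
        # combination of the per-dimension TD errors.
        td_scalar = MIX @ td_vec
        grad = -probs
        grad[a] += 1.0                    # gradient of log pi(a|s) w.r.t. the logits
        logits[s] += ALPHA_PI * td_scalar * grad

        s = s_next

print("Vector-valued critic V(s):")
print(np.round(V, 2))
```

The only point of the sketch is the shape of the data: the critic tracks several value dimensions separately, while the actor is ultimately steered by one scalar combination of them.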
