Announcing “Key Phenomena in AI Risk” (facilitated reading group)

By nora, particlemania @ 2023-05-09T16:52 (+28)

Cross-posted from Less Wrong and the Alignment Forum.

TLDR: “Key Phenomena in AI Risk” is a 7-week, facilitated reading group. It is aimed at people interested in conceptual AI alignment research, in particular those from fields such as philosophy, systems research, biology, and the cognitive and social sciences.

The program will run between July and August 2023. Sign up here by May 28th.

What?

The “Key Phenomena in AI Risk” reading curriculum provides an extended introduction to some key ideas in AI risk, in particular risks from misdirected optimization or 'consequentialist cognition'. As such, it aims to remain largely agnostic of solution paradigms.

See here for a short overview of the curriculum; here for a more extensive summary; and here for the full curriculum.

This is a 7-week program. Each week consists of a 90-minute facilitated call to discuss that week’s key phenomena and readings, plus individual reading time (at least 2 hours, more if you would like to explore the optional readings).

The program is virtual and free of charge.

For Whom?

The curriculum is primarily aimed at people interested in conceptual research in AI risk and alignment.

It is designed to be accessible to audiences in, among other fields, philosophy (of agency, knowledge, power, etc.) and systems research (e.g. biological, cognitive, information-theoretic, or social systems).

When?

The reading groups will take place in July and August 2023.

We expect to run 2-6 groups of 4-8 participants each (including 1 facilitator). Each group will be led by a facilitator with substantive knowledge of AI risk.

Overview of the curriculum

Week 0 is dedicated to getting to know each other and clarifying how the program will work.
Week 1 focuses on why important features of generally intelligent 'consequentialist cognition' might be algorithmically realizable, and its potential implications.
Week 2 focuses on why it can be hard to direct such intelligence in a safe and beneficial direction.
Week 3 discusses instrumental convergence in goal-oriented systems.
Week 4 discusses risks from systems that seek predictive omniscience.
Week 5 discusses some reasons why surveillance (or oversight) of these artificial systems may give a differential advantage to deceptive tendencies.
Week 6 discusses why even an incoherent aggregation of optimizing systems could still impose a (misaligned) optimizing pressure on the world.

Here is a longer summary. Here is the full curriculum.

The curriculum has been developed by TJ (Research Scholar at FHI) with input from Nora Ammann, Sahil Kulshrestha, and Tsvi Benson-Tilsen. The program is operationally supported by PIBBSS.

The curriculum was initially developed as part of the PIBBSS summer research fellowship, but we realized that it might also be of interest and use to people outside the fellowship program.

We are treating this round of reading groups as a way to test whether it’s worth continuing to run them on a more regular basis, as well as to help us improve the program.

Sign up

About the application

The application consists of one stage, in which we ask you to fill in a form with:

Your CV
Your motivation for participating in the program
Your prior exposure to AI risk/alignment

We select people based on our best understanding of their motivation to contribute to AI alignment and how much they would counterfactually benefit from participating in the program.

If you have any questions, feel free to leave a comment below or contact us at contact@pibbss.ai

If you are keen to facilitate one of the reading groups, please reach out as well.