MATS AI Safety Strategy Curriculum v2

By DanielFilan @ 2024-10-07T23:01 (+29)

As part of our Summer 2024 Program, MATS ran a series of discussion groups focused on questions and topics we believe are relevant to prioritizing AI safety research. Each weekly session centered on one overarching question and was accompanied by readings and suggested discussion questions. The purpose of these discussions was to increase scholars' knowledge of the AI safety ecosystem and of models of how AI could cause a catastrophe, and to hone their ability to think critically about threat models, ultimately in service of helping scholars become excellent researchers.

The readings and questions were largely based on the curriculum from the Winter 2023-24 Program, with two changes:

In addition, the curriculum was supplemented in two ways:

As we noted in the post about the previous cohort's curriculum, we think there is likely significant room to improve this curriculum, and we welcome feedback in the comments.

Week 1: How powerful is intelligence?

Core readings

Other readings

Discussion questions

Week 2: How and when will transformative AI be made?

Core readings

Other readings

Discussion questions

Week 3: How could we train AIs whose outputs we can’t evaluate?

Core readings

Other readings

Discussion questions

Week 4: Will AIs fake alignment?

Core readings

Scheming AIs: Will AIs fake alignment during training in order to get power?, abstract and introduction (Carlsmith - 45 min)
[In retrospect, this probably took longer than 45 minutes for most people to read]

Other readings

On inner and outer alignment

On reasons to think deceptive alignment is likely

Discussion questions

Week 5: How should AI be governed?

Core readings

Other readings

Discussion questions

Readings that did not fit into any specific week

Acknowledgements

Daniel Filan was the primary author of the curriculum (to the extent that it differed from the Winter 2023-24 curriculum) and coordinated the discussion groups. Ryan Kidd scoped, managed, and edited the project. Many thanks to the MATS alumni and other community members who helped as facilitators and to the scholars who showed up and had great discussions!


SummaryBot @ 2024-10-08T15:05 (+1)

Executive summary: MATS ran a series of AI safety discussion groups for its Summer 2024 Program, covering key topics like AI capabilities, timelines, training challenges, deception risks, and governance approaches to help scholars develop critical thinking skills about AI safety.

Key points:

  1. Curriculum covered 5 weekly topics: AI intelligence/power, transformative AI timelines, training challenges, alignment deception risks, and AI governance approaches.
  2. Core and supplemental readings were provided for each topic, along with discussion questions to facilitate critical analysis.
  3. Curriculum aimed to increase scholars' knowledge of the AI safety ecosystem and potential catastrophe scenarios.
  4. The readings and questions were largely based on the Winter 2023-24 curriculum, with some changes and supplementary additions; this post was published after the discussion series concluded.


This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.