Retrospective on the Summer 2021 AGI Safety Fundamentals

By Dewi @ 2021-12-06T20:10 (+67)

Thanks to Jamie Bernardi, Caleb Parikh, Neel Nanda, Isaac Dunn, Michael Chen and JJ Balisanyuka-Smith for feedback on drafts of this document. All mistakes are my own. 

How to read this document 

Summary 

Overview

Further engagement

Why are we writing this post? 

Background 

Reading groups - status quo

This programme came about through a conversation with Richard Ngo back in November 2020. We were discussing the local AIS landscape in Cambridge, UK, with a focus on the local AIS discussion group - the only regular AIS community activity that existed there at that time. Attendees would read a paper or blog post and come together to discuss it.

While discussion groups are great for some people, they are not well-optimised for encouraging more people to pursue careers in AIS. In our case, it was additionally unclear whether the reading group required technical ML knowledge, and it was the only landing spot in Cambridge for people interested in AIS.

As a result, regular participants had a wide range of prior understanding of ML and AIS, with large (and somewhat random) gaps in their knowledge of AIS. Keen new members would bounce away due to the lack of any prior context or onboarding mechanism for the group, presumably feeling alienated or out of their depth. At the same time, the discussions did little to further the understanding or motivation of highly knowledgeable community members, whose experience varied depending on who attended each week.

Response to the status quo of reading groups

In response to this situation, Richard offered to curate a curriculum that introduced people in Cambridge and beyond to AIS in a structured way. Our first step was to reach out to AI safety group leaders from around the world to test out the first curriculum and programme, and to train them so that they could facilitate future rounds. 10 AIS group leaders signed up, and Evan Hubinger and Richard each facilitated a cohort of 5 for 7 weeks from Jan-Mar 2021.

We learnt a lot from the first round of the programme, and Richard made significant revisions to the curriculum, resulting in the second version. We opened applications for the second round in late May 2021 and received significantly more applications than expected (I personally expected ~50; we received 268). This may be partially explained by the fact that many people around the world were under lockdowns, and hence particularly interested in virtual programmes. There was also some demand for an AI governance track, so we put together this short curriculum to replace the final three weeks for the governance cohorts. More details on the Summer 2021 round of the programme can be found throughout the rest of this post.

How does this fit into the talent pipeline? 

There are myriad ways people come across AI safety arguments and ideas: reading Superintelligence or other popular AIS books, learning about EA and being introduced to x-risks, LessWrong or other rationality content, hearing about it from peers and friends, Rob Miles’ YouTube videos, random blogs, etc. However, there hasn’t been a systematic and easy way to go from a “public-ish introduction” to understanding the current frontiers. This is where we hope this programme can help.

A curated curriculum and facilitated discussions should help alleviate the issues with the previous status quo, where interested individuals would spend months or years reading random blog posts by themselves, trying to figure out who was working on what, and why it was relevant to reducing existential risk from advanced AI. An analogy I like to use is that the status quo is something like a physicist going around telling people “quantum physics is so cool, you should learn about it and contribute to the field!”, but without any undergraduate or postgraduate courses existing to actually train people in what quantum physics is, or accountability mechanisms to help budding physicists engage with the content in a structured and sequential manner.

Cohort-based learning is also frequently regarded as superior to self-paced learning: it produces higher levels of engagement, creates accountability to read specific content and reflect on it, provides a facilitator with more context on the topic who can guide the discussions, and builds a community and friendships within each cohort. This is especially important for the neglected problems the EA community regularly focuses on, given the lack of peers to socialise or work with on these problems in most regions of the world.

However, I believe there is still much work to be done on AI safety field-building, including:

...and likely much more. Some of the above is already happening to various extents, but it seems very plausible to me that we could be doing a lot more of this type of field-building work, without draining a significant amount of researcher mentorship capacity.

What’s next for this programme? 

We’re now working on the third round of this programme, which will be taking place from Jan-Mar 2022. We hope to provide infrastructure for local discussions where possible (as opposed to being entirely virtual), to continue to improve the standard curriculum, to finish the development of the new governance curriculum by working with AI governance researchers and SERI (shout out to Mauricio for working so hard on this), and to make other major improvements (detailed below). 

I’d like to say a huge thank you to everyone who helped make the Summer 2021 programme happen, including all the facilitators, speakers and participants. I’d also like to say a special thanks to Richard Ngo, Thomas Woodside and Neel Nanda for all their additional work and support.

Impact assessment 

Career 

 

Value to participants 

General sense of impact 

Programme overview and logistics 

This section briefly highlights the steps we took to make the programme happen this summer, and what the programme looked like in more detail from a logistics perspective. Skip ahead if you don’t care to read about logistical details.

Participant feedback

We had 258 applicants, of whom 168 were accepted onto the programme; around 120 participants (71%) attended at least 75% of the discussions, and 83 participants filled in the post-programme feedback form (as of mid-November).

Below is a breakdown of the feedback we received from participants, with some accompanying graphs. Tables with the statistics from the feedback forms are provided in the appendix.

Feedback 

Demographics / other 

The proportions below are from the participant post-programme survey, and responses were not mandatory, so they are potentially not representative of all participants/applicants. We’ve added (optional) demographics questions to the application form for the next round, to better capture this data.

Selected quotes from participants 

Below is a list of hand-picked (anonymised) quotes from the participant feedback forms. They were selected to represent a range of views the participants expressed in the feedback forms, and to provide more qualitative information to readers on how the programme went from the participants’ perspective. 

Facilitator feedback 

There were 28 facilitators, 5 of whom facilitated 2 cohorts. We had 3 policy/governance-focused facilitators, while the rest focused on technical AI safety research. 15 facilitators filled in the post-programme feedback form.

Mistakes we made 

This is a long list of mistakes we (or specifically I, Dewi) made while organising this programme. I expect it to be useful only to individuals or organisations who are considering running these types of programmes.

Other Improvements 

Some of these are concrete changes we’ll be making to improve the programme, while others are vaguer ideas. As before, feedback on everything is very welcome.

Uncertainties 

Below is a non-exhaustive list of uncertainties we have regarding the future of this programme. 

Appendix 

Participant feedback table

| Ratings from participants | Rating (out of 10) | Standard deviation |
|---|---|---|
| Promoter score* | 8.51 | 1.30 |
| Facilitator | 8.41 | 1.59 |
| Discussions | 7.80 | 2.06 |
| Readings | 7.48 | 1.71 |
| Exercises | 5.74 | 1.98 |
| Capstone project | 6.55 | 2.52 |
| Speaker events | 7.76 | 1.32 |

*Promoter score question: How likely would you be to promote this programme to a friend or colleague who’s interested in learning more about AI safety? 1 = Not at all likely, 10 = Extremely likely.

| Other stats | Average | Standard deviation |
|---|---|---|
| Reading novelty (out of 10) | 6.89 | 2.04 |
| Prior likelihood of pursuing an AIS career | 57.5% | 25.7% |
| Current likelihood of pursuing an AIS career | 68.0% | 22.3% |
| Hours spent per week (total) | 4.8 | 3.1 |

Facilitator feedback table

| Facilitator feedback | Mean | Standard deviation |
|---|---|---|
| Promoter score* | 8.6 | 1.02 |
| Facilitator hours spent per week outside sessions | 2.6 | 1.7 |
| Estimate of # of participants 80%+ likely to pursue an AIS career | 1.5 | 1.0 |
| Percentage of core readings they’d done before the programme | 68% | 17% |
| Percentage of further readings they’d done before the programme | 38% | 18% |

Cohort Scheduling 


casebash @ 2021-12-07T06:24 (+7)

One of the key benefits I see from this program is establishing a baseline of knowledge for EAs who are interested in AI safety, which other programs can build upon.

Jamie_Harris @ 2021-12-15T22:08 (+3)

I'm very grateful you wrote up this detailed report, complete with various numbers and with lists of uncertainties.

As you know, I've been thinking through some partly overlapping uncertainties for some programmes I'm involved in, so feel free to reach out if you want to talk anything over.

gergogaspar @ 2023-01-09T10:03 (+2)

I found reading this incredibly helpful, thank you for writing it up!  
Also, I just wanted to flag that given that some time has passed since the post's publication, I found two hyperlinks that no longer work:
"5 speaker events" and "4 weeks doing a self-directed capstone project"

Thank you again!