Takeaways from a survey on AI alignment resources

By DanielFilan @ 2022-11-05T23:45 (+18)

This is a linkpost to https://www.lesswrong.com/posts/rXSBvSKvKdaNkhLeJ/takeaways-from-a-survey-on-ai-alignment-resources

Cross-posted from LW; I'll probably mostly be checking comments there.

What am I talking about?

In June and July of this year, I ran a survey asking a lot of people how useful they found a variety of resources on AI alignment. I was particularly interested in "secondary resources": that is, not primary research outputs, but resources that summarize, discuss, analyze, or propose concrete research efforts. I had a number of people promote the survey so that it wouldn't be obvious that I was running it (and so wouldn't affect what people said about AXRP, the podcast that I run). CEA helped a great deal with shaping and promoting the survey.

The survey's initial goal was to figure out how useful AXRP was, but I decided it would also be valuable to get a broader look at the space of secondary resources. My hope is that the results give people a better sense of which secondary resources might be worth checking out, as well as of gaps that could be filled.

Participants were shown a list of resources, selected those they'd engaged with for >30 minutes, and, for each resource they selected, rated on a scale from 0 to 4 how useful they'd found it, how likely they'd be to recommend it to a friend getting into the field who hadn't read widely, and how likely they'd be to recommend it to someone paid to do AI alignment research. You can do a test run of the survey at this link.
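For a concrete picture, here's a hypothetical sketch of what one response's data might look like; the field names, resource choices, and numbers are my own illustration, not the survey's actual format:

```python
# Hypothetical shape of a single response, assuming the flow described above;
# all three ratings are shown here on the survey's 0-4 scales.
response = {
    "engaged_over_30_min": ["80k podcast", "Rob Miles videos"],
    "ratings": {
        "80k podcast": {
            "usefulness": 3,            # how useful the respondent found it
            "recommend_newcomer": 3,    # recommend to a friend getting into the field
            "recommend_researcher": 2,  # recommend to a paid alignment researcher
        },
        "Rob Miles videos": {
            "usefulness": 4,
            "recommend_newcomer": 4,
            "recommend_researcher": 2,
        },
    },
}
```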

My summary of the results

Basic stats

Context for questions

When sorting resources by a given rating, I've listed the top 5, plus anything just below the top 5 if there were only a few such entries. I've also included ratings for AXRP, the podcast I make. Ratings are paired with the standard error of the mean (for total ratings, this standard error is multiplied by the number of people in the sample). Only resources that at least 2 people engaged with were included.

Ratings were generally rounded to two significant figures, and standard errors were reported to the same precision.
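To make the arithmetic concrete, here is a minimal sketch in Python of how these numbers could be computed. It is not the actual analysis script (that's in the GitHub repository linked at the end of the post), and the resource names and ratings below are invented:

```python
import math

# Assumes each resource has a list of 0-4 usefulness ratings from the
# respondents who engaged with it for >30 minutes.
ratings_by_resource = {
    "Resource A": [3, 4, 2, 3, 3],
    "Resource B": [2, 1, 3],
    "Resource C": [4],  # excluded below: fewer than 2 ratings
}

def summarize(ratings):
    n = len(ratings)
    mean = sum(ratings) / n
    # Standard error of the mean; the "total" rating multiplies both the
    # mean and its standard error by the reach (number of raters).
    variance = sum((r - mean) ** 2 for r in ratings) / (n - 1)
    sem = math.sqrt(variance / n)
    return {"average": mean, "average_se": sem, "total": mean * n, "total_se": sem * n}

for name, ratings in ratings_by_resource.items():
    if len(ratings) < 2:
        continue  # only resources rated by at least 2 people are reported
    print(name, summarize(ratings))
```

So, for instance, a total usefulness figure like the 80k podcast's 167 below is just its average rating (2.6) multiplied by the number of respondents who reported engaging with it.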

Usefulness ratings

Among all respondents:

Total usefulness (multiplying average rating by reach):

  1. 80k podcast: 167 +/- 8
  2. Superintelligence: 166 +/- 8
  3. Talks by AI alignment researchers: 134 +/- 6
  4. Rob Miles videos: 131 +/- 7
  5. AI alignment newsletter: 117 +/- 7
  6. Conversations with AI alignment researchers at conferences: 107 +/- 5

Everything else is at 85 or below; AXRP is at 59 +/- 4.

Average usefulness ratings:

  1. AI Safety Camp: 3.4 +/- 0.2
  2. Conversations: 3.1 +/- 0.2
  3. AGI Safety Fundamentals Course (AGISF): 3.0 +/- 0.2
  4. MLAB: 2.8 +/- 0.8
  5. Rob Miles videos: 2.7 +/- 0.1
  6. 80k podcast: 2.6 +/- 0.1
  7. Superintelligence: 2.6 +/- 0.1
  8. AXRP: 2.6 +/- 0.2

Everything else 2.5 or below.

Among people trying to get into alignment:

Total usefulness:

  1. 80k podcast: 95 +/- 6
  2. AI Alignment Newsletter: 76 +/- 6
  3. Talks by AI alignment researchers: 72 +/- 4
  4. AGISF: 68 +/- 3
  5. Rob Miles videos: 67 +/- 5
  6. Superintelligence: 64 +/- 5

Everything else is at 50 or below; AXRP is at 37 +/- 3.

Average usefulness:

  1. Tie between AI Safety Camp at 3.5 +/- 0.3 and MLAB at 3.5 +/- 0.4
  2. AGISF: 3.2 +/- 0.2
  3. Convos: 3.1 +/- 0.2
  4. ARCHES agenda: 3.0 +/- 0.7
  5. 80k podcast: 2.7 +/- 0.2

Then there's a tail just under that; AXRP is at 2.6 +/- 0.2.

Among people who spend time solving alignment problems:

Total usefulness:

  1. Superintelligence: 48 +/- 5
  2. Talks: 47 +/- 4
  3. Convos: 45 +/- 4
  4. AI Alignment Newsletter: 42 +/- 5
  5. 80k podcast: 37 +/- 4
  6. Embedded Agency sequence: 36 +/- 5

Everything else is at 29 or below; AXRP is at 20 +/- 2.

Average usefulness:

  1. Convos: 3.2 +/- 0.3
  2. AI Safety Camp: 3.2 +/- 0.3
  3. Tie between AGISF at 2.7 +/- 0.4 and ML Safety Newsletter at 2.7 +/- 0.3
  4. AI Alignment Newsletter: 2.6 +/- 0.3
  5. Embedded Agency sequence: 2.6 +/- 0.3

Then there's a smooth drop in average usefulness; AXRP is at 2.2 +/- 0.3.

Among people paid to work on technical AI alignment research:

Total usefulness:

  1. Convos: 28 +/- 3
  2. Talks: 26 +/- 2
  3. Superintelligence: 23 +/- 4
  4. AXRP: 22 +/- 3
  5. Embedded Agency sequence: 20 +/- 3

Everything else 19 or below.

Average usefulness:

  1. AI Safety Camp: 3.7 +/- 0.3
  2. AI Alignment Newsletter: 3.2 +/- 0.4
  3. Convos: 3.1 +/- 0.3
  4. Rob Miles videos: 2.8 +/- 0.5 (honourable mention to AIRCS workshops, which had one rating and scored 3 for usefulness)
  5. AXRP: 2.8 +/- 0.3

Everything else 2.5 or below.

Recommendation ratings

Alignment professionals recommend to peers:

  1. Convos with researchers: 3.7 +/- 0.2
  2. AXRP: 3.3 +/- 0.2
  3. Tie between ML safety newsletter at 3.0 +/- 0.4 and AI alignment newsletter at 3.0 +/- 0.5
  4. Rob Miles videos: 2.6 +/- 0.5
  5. Embedded Agency sequence: 2.5 +/- 0.5

Everything else is 2.4 or lower.

Alignment professionals recommend to newcomers (= people trying to move into an AI alignment career):

  1. AGISF: 3.7 +/- 0.2
  2. Rob Miles: 3.4 +/- 0.3
  3. The Alignment Problem: 3.2 +/- 0.3
  4. 80k podcast: 3.1 +/- 0.3
  5. AI safety camp: 3.0 +/- 0.5

Everything else is 2.8 or lower (AXRP is at 1.9 +/- 0.4).

Newcomers recommend to newcomers:

  1. MLAB: 4.0 +/- 0.0 (2 ratings)
  2. AGISF: 3.7 +/- 0.1
  3. Rob Miles: 3.4 +/- 0.2
  4. AI safety camp: 3.0 +/- 0.9
  5. Human Compatible (the book): 2.8 +/- 0.3 (honourable mention to AIRCS workshops which had one rating, and scored 3)
  6. The Alignment Problem: 2.8 +/- 0.3

Everything else is 2.6 or lower (AXRP is at 2.4 +/- 0.3).

One tidbit: judging by the ratings, newcomers seem to agree with the professionals about what newcomers should engage with.

Details of the survey

The survey was run on GuidedTrack. Due to an error on my part, if anybody pressed the 'back' button and changed a rating, their results were unrecoverably corrupted (hence the gap between the total number of entries and the number with data I could use).

The full list of resources, the usefulness rating scale, and the probability rating scale are shown in the original post; they are also specified in the GuidedTrack code in the GitHub repository linked below.

In addition to the details published here, I collected how many years people had been interested in AI alignment and/or been paid to work on technical AI alignment research, as applicable. People could also write in comments about specific resources and about the survey as a whole, and could write in where they heard about the survey.

For more details, you can see my GitHub repository for this survey. It contains the GuidedTrack code to specify the survey, the results, and a script to analyze the results. Note that I redacted some details of some comments to remove detail that might identify a respondent.


Ben_West @ 2022-11-06T06:12 (+4)

Thanks for doing this! I'm surprised conversations rated so low. I feel confused about whether this means we should put more effort into connecting people because conversations aren't happening enough by default, or less effort because the conversations aren't valuable.

DanielFilan @ 2022-11-06T07:32 (+2)

My take is they rated pretty high? You consistently see them in the top tier of average usefulness ratings. My guess is that any deficits are due to (a) low reach/scalability depressing total usefulness (but among all respondents they ranked #6 on that front!) and (b) people being less likely to recommend them because conferences are rare and it's hard to meet people.

Ben_West @ 2022-11-06T15:03 (+2)

Yeah fair, thinking through this more I changed my mind and it does seem somewhat high. I think I wasn't taking into account enough that conferences have limited reach.

Abby Hoskin @ 2022-11-06T02:33 (+4)

Cool survey! Multiplying average rating by reach is an interesting technique, but I wasn't reading carefully and was super confused by the results for a minute. 

It was smart to break down the recommendations by who was doing the recommending and to whom. Do you have any thoughts on how Rob Miles manages to create useful content for both experts and newcomers? I would have thought a few of the other resources on your list were trying to achieve this as well, but it seems like he's doing the best job.

DanielFilan @ 2022-11-06T19:12 (+1)

To be honest, I haven't watched many of his videos, so don't have much to say on this front.

PeterSlattery @ 2022-11-06T21:54 (+2)

Thanks for this! I would like to see significantly more effort put into assessing the reach and performance of different resources, and into updating the relevant movement-building strategies accordingly.

DanielFilan @ 2022-11-11T19:45 (+1)

BTW: if anyone is interested in running this survey in future years, drop me a line.

Niel_Bowerman @ 2022-11-07T14:32 (+1)

Thanks for doing this - I found it helpful!  

Am I correct in thinking that under 'Among all respondents' under 'Average usefulness ratings:' the category
> 80k: 2.6 +/- 0.1

is just the 80k podcast and not all of 80k?  If so one could change it to:
> 80k podcast: 2.6 +/- 0.1

DanielFilan @ 2022-11-07T17:49 (+1)

Yep, just the 80k podcast. Good point, will expand that abbreviation.