AI Welfare Debate Week retrospective

By Toby Tremlett🔹 @ 2024-09-16T09:18 (+39)

I wrote this retrospective to be shared internally at CEA, but in the spirit of more open communication, I'm sharing it here as well. Note that this is a review of the event considered as a product, not a summary or review of the posts from the week.

If you have any questions, or any additional feedback, that'd be appreciated! I'll be running another debate week soon, and feedback has already been very helpful in preparing for it.

Also, feedback on the retro itself is appreciated. Ideally, I'd like to pre-register my retros so that I only have to fill in the graphs and conclusions once the event actually happens, so suggestions for data we should measure or questions I should be asking would be very helpful for making better retro templates.

How successful was the event?

In my OKRs (Objectives and Key Results, i.e. my goals for the event), I wanted this event to:

Have 50 participants, with “participant” being anyone taking an event-related action such as voting, commenting, or posting.
- We did an order of magnitude better than 50. Over 558 people voted during the week, and 27 authors wrote or co-wrote at least one post.
Change people’s minds. I wanted the equivalent of 25 people changing their minds by 25% of the debate slider.
- We did twice as well as I hoped here: 53 unique users made at least one mind change of 0.25 delta (representing 25% of the slider) or more.
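
As a rough illustration of how the mind-change metric above could be computed, here is a minimal sketch. The vote log, the user IDs and the function name are all hypothetical, and it assumes slider positions are normalised to 0.0-1.0 and that delta is measured against a user's first vote (measuring against their previous vote instead would be a small change).

```python
# Hypothetical vote log: (user_id, slider_position), in chronological order.
# Positions are assumed to be normalised to the range 0.0-1.0.
votes = [
    ("u1", 0.50), ("u2", 0.20), ("u1", 0.80),
    ("u3", 0.90), ("u2", 0.30), ("u3", 0.60),
]

def count_mind_changers(votes, threshold=0.25):
    """Count unique users with at least one vote `threshold` or more away from their first vote."""
    first_vote = {}
    changers = set()
    for user, position in votes:
        if user not in first_vote:
            first_vote[user] = position  # a first vote has no delta
            continue
        # Small tolerance so that floating-point rounding does not exclude
        # a change of exactly `threshold`.
        if abs(position - first_vote[user]) >= threshold - 1e-9:
            changers.add(user)
    return len(changers)

print(count_mind_changers(votes))
```

Counting unique users rather than individual vote changes means someone who moves back and forth several times is only counted once.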

Therefore, on our explicit goals, this event was successful 🎊. But how successful was it based on our other, non-KR goals and hopes?

Some other goals that we had for the event, either in the ideation phase or while it was ongoing, were:

Create more good content on a particularly important issue to EAs.
- Successful.
Increase engagement.
- Seems unsuccessful.
Bring in some new users.
- Not noticeably successful.
Increase messaging.
- Not noticeably successful.

In the next four sections, I examine each of these goals in turn.

Engagement

How much engagement did the event get?

In total, debate week posts got 127 hours of engagement during the debate week (11.6% of total engagement that week), and 181 hours from July 1-14 (debate week and the week after), or 7.5% of that fortnight’s engagement hours.
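
The shares above also imply rough totals for overall Forum engagement. The sketch below backs them out from the quoted figures. The derived totals are estimates rather than reported numbers, since the percentages are rounded, and it assumes the 11.6% figure refers to the debate week's own total.

```python
# Figures quoted in the post.
debate_week_hours = 127    # engagement on debate week posts, July 1-7
debate_week_share = 0.116  # share of total engagement during debate week
fortnight_hours = 181      # engagement on debate week posts, July 1-14
fortnight_share = 0.075    # share of total engagement over the fortnight

# Derived estimates (not reported in the post; approximate because the
# shares are rounded to one decimal place).
week_total = debate_week_hours / debate_week_share
fortnight_total = fortnight_hours / fortnight_share
following_week_total = fortnight_total - week_total

print(f"Implied total, debate week:    ~{week_total:.0f} hours")
print(f"Implied total, fortnight:      ~{fortnight_total:.0f} hours")
print(f"Implied total, following week: ~{following_week_total:.0f} hours")
```

On these numbers, total engagement comes out at roughly 1,100 hours for debate week and roughly 1,300 hours for the week after, which would explain why the debate week posts' share drops over the full fortnight.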

Did it increase total daily hours of engagement?

Note: Discussion of Manifest controversies happened in June, and led to higher engagement hours per day in the build-up to the event. Important dates: June 17: 244 comments, June 18: 349 comments, June 20: 33 comments, June 25: 38 comments.

It doesn’t look as if the debate week meaningfully increased daily engagement. The average daily engagement for the week after the event is actually higher, although the 3rd day of the event (July 3rd, the day I mentioned in the EA Digest that the event was ongoing) remains the day with the most hours of engagement between July 1st and the date I'm writing this, August 21st.

Did it get us new users?

Not very noticeably, but slightly. We got a peak of new users on the third day of debate week:

And the average daily new users during debate week was (very marginally) higher than that of any other week between June 24 and today (August 22):

Did it increase messaging?

I don’t think so.

Debate week had a lower average of new convos per day than the next two weeks (and some of them would have been me discussing posts with authors), and a slightly higher average of messages per user. It’d be cool if we could bump this metric up next time.

Below is all relevant messaging data, with debate week marked in green. Debate week doesn’t stand out.

How successful were our new features?

For this event we debuted several new features: The debate week banner, the “most influential posts” score, and the in-post debate slider.

The frontpage banner

Overall, I think this was very successful. It got a lot of great feedback during the event, as well as many more votes than I expected.

Distribution of votes

We got votes throughout the week, but more towards the start of the week. We also had many more first votes than vote changes.

Below you can see the graph, with yellow cells representing the largest delta score (change on the debate slider from initial vote) and purple the smallest (generally representing a first vote). I’ve marked the peaks that came organically and from the Digest.

Graph showing votes per hour, and colour-coding to show the delta-score of each vote

Votes on the debate were quite nicely distributed (although see the next section on feedback for concerns on visibility). I’ve put together a hacky chart below to show the distribution:

Histogram lined up with the debate slider to show the final vote distribution.

Feedback on the debate week banner

On the Forum:

Lots of discussion about visualising the votes here. Basic takeaway is that with so many votes, it became difficult to see a) the distribution b) all the individual voters.

From CEA slack:

A^[1] enjoyed the liveliness of debate week, and mentioned that voting is a great way for users who don't post or comment to interact publicly.
A and B would like more ways to interact with and visualise data on the banner. Specifically, A likes the idea of a lower-bar way to engage (a tweet-length explanation of your view, for instance) and B wants to be able to click on someone’s vote to see what they have written or DM them.
- I like the idea of adding a little speech bubble which appears on hover, so that people can make a statement along with their vote.

Most influential posts

This didn’t work out as planned. I hoped it would sort out the posts that most changed users’ minds, allowing us to find the most informative posts from the week. However, only 30 users cited posts when they changed their mind, and the second-highest mind-changing post is my announcement post (which shouldn’t have ranked at all).

For the next debate week, I’d vote for cutting this feature, or just changing it to count each mind-change as the same number of points, no matter how large the mind-change was (in this case, that’d lead to a more rational leaderboard).
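
To make the difference between the two scoring rules concrete, here is a small sketch with made-up data. The post names, deltas and function names are hypothetical, and it assumes the existing score sums the size of each cited mind-change, which may not match the real implementation.

```python
from collections import Counter

# Hypothetical mind-change records: (cited_post, delta), where delta is the
# size of the vote change as a fraction of the slider.
mind_changes = [
    ("post_a", 0.10), ("post_a", 0.15), ("post_a", 0.10),
    ("post_b", 0.60),
    ("post_c", 0.30), ("post_c", 0.20),
]

def rank_delta_weighted(changes):
    """Rank posts by the summed size of the mind-changes that cite them (assumed current rule)."""
    scores = Counter()
    for post, delta in changes:
        scores[post] += delta
    return [post for post, _ in scores.most_common()]

def rank_flat(changes):
    """Rank posts by the number of mind-changes that cite them, ignoring size."""
    scores = Counter(post for post, _ in changes)
    return [post for post, _ in scores.most_common()]

print(rank_delta_weighted(mind_changes))
print(rank_flat(mind_changes))
```

With these toy numbers, a single large swing puts post_b at the top of the delta-weighted ranking, while the flat count favours post_a, which was cited by more mind-changes. With only 30 citing users, either ranking would be sensitive to a handful of votes.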

The in-post debate slider

Thanks to a suggestion from @EdoArad we had a debate slider in posts, which would give us a delta score which automatically cited the post you were reading, if you changed your vote while on a post. We unfortunately don't have stats which show how often this was used as opposed to the frontpage slider.

Other feedback

There was a lot of feedback clustered around the framing of the debate question, which I responded to in this quicktake. General takeaway: make sure the next debate question is very specific, and nudge people to give feedback on it before it is locked in.

People were generally excited about another debate week. Here is Nathan Young's question post asking for people’s ideas. Users also discussed how frequent debate weeks should be (this and other comments pointed to every 6-8 weeks being a good start).

Feature suggestions:

Being able to cite comments as changing your mind.
From Nathan's DMs:
- being able to insert your own debate sliders in posts.
- letting people vote on the next debate week question.

Takeaways for next time:

We should make some changes to our debate-week specific features:
- The next debate question has to be either unambiguous and empirical, or clear and values-based. With EAs, we should probably go for the former (the latter would likely become empirical anyway).
- We should go back to the drawing board with the debate slider, to make sure that it can visualise the distribution of hundreds of votes at once.
- We should either remove, or reform the “most influential posts” ranking to avoid weird rankings and encourage people to use it.
We can try to increase engagement:
- We could take A and B’s suggestions on board:
  - To increase the number of people contributing to the debate: A quicker and easier way to engage than writing a comment or post (such as a tweet-length summary attached to your avatar as a hover-over speech bubble)
  - To seed more messages: An option to message users that disagree with you.
- Perhaps we would easily see more engagement if we discussed a more popular topic, or one that more people have takes on. Next time I’ll go a bit less niche.
- People wanted to tweet about the debate week. Can we make something shareable? For example, a graphic which tells you stats like how popular your opinion was, who agreed with you the most, and how many people changed their mind because of your post. If people wanted to share something like this, it could be a neat way to get more people who rarely check the Forum to come and check out the event.

Impacts and mentions beyond the Forum

An EA-aligned org asked us for the vote info [we provided them with only what was already public, just in an easier to read format], to inform their fundraising efforts, and perhaps to find collaborators.
A tangential rationality reading group discussed posts from the event.
Debate week influenced Metaculus to run their own “focus week”.
Bonus: The event prompted some funny anti-EA snark here.

Thanks for reading! If you have more feedback about this event, positive or negative, please comment below or DM me.

^[1] Anonymised just to speed up posting.

jenn @ 2024-09-17T19:55 (+3)

oh hey i ran that rationality meetup on radical empathy and AI welfare. i think it went pretty well and it was directly prompted by AI welfare debate week happening on the forums, so thanks for organizing!

i can talk a little more about the takeaways from that meetup specifically, which had around a half dozen attendees:

it was really interesting to try to model how to even plausibly give moral weight to entities that were so bizarrely different from biotic life forms (e.g. can be shut down and rebooted/reverted, can change their own reward functions, can spin up a million copies of itself.) we kept running into assumptions around ideas like consciousness and pain that just sort of fell apart upon any sort of examination
i tried to construct a scenario/case study with an ai entity that was possibly developing sentience, and the response from basically everyone was "wow these behaviours are sus and we have to shoot the mainframe with a gun immediately". this was kind of genuinely illuminating to me about the difficulties of trying to grant ~rights/freedoms to something more powerful than yourself and discussing the specifics of the case study turned the sense of danger from something abstract to something that felt real. we tried to come up with some possible ways for an AI entity to signal ~deservingness of moral weight without signalling dangerous capabilities and kind of came up blank, but this might say more about the collective intelligence of the meetup attendees than it does anything else haha.

like, i don't think these are amazing take-aways, in that higher quality versions of these conclusions have surely been written up in the forums long before debate week. but i think it's helpful to get them in the water a bit more, and i came out of it with a greater appreciation for the complexity of this question (and also like, more deeply grokking the difficulties of alignment research and just how different ai entities can be from humans).

out of curiosity, do you remember how you came across the meetup posting?

Toby Tremlett🔹 @ 2024-09-18T07:49 (+2)

Thanks for fleshing that out jenn! (and for running the meeting). I'll feed this back to my team.
I found the meetup either via a google search for "AI Welfare Debate Week" or a backlink checker.

Toby Tremlett🔹 @ 2024-09-18T14:10 (+3)

This is the backlink checker. It's pretty helpful for seeing whether your posts (or in my case, events) have been mentioned anywhere outside of the Forum.

SummaryBot @ 2024-09-16T16:56 (+1)

Executive summary: The EA Forum's AI Welfare Debate Week was successful in meeting its key goals of participation and changing minds, but had mixed results on secondary objectives like increasing overall engagement and attracting new users.

Key points:

The event exceeded targets for participation (558 voters vs 50 goal) and mind-changing (53 users vs 25 goal).
28 posts were created, with 7 high-quality ones (50+ karma), though none had very high karma.
The event did not noticeably increase overall Forum engagement, new user signups, or messaging.
New features like the debate banner and slider were successful, but the "most influential posts" feature needs improvement.
Feedback suggested making future debate questions more specific and empirical.
Recommendations for future events include improving data visualization, adding easier engagement options, and choosing less niche topics.

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.