Idea: Red-teaming fellowships

By MaxRa, JasperGo, slg, Yannick_Muehlhaeuser @ 2022-02-02T22:13 (+102)

Summary

A red team is an independent group that challenges an organization or movement in order to improve it. Red teaming is the practice of using red teams. 

Edit: HT to Linch, who came up with the same idea in a shortform half a year ago and didn't follow Aaron's advice of turning it into a top-level post!

Example of a concrete red-teaming fellowship, focusing on LAWS (lethal autonomous weapons systems) advocacy

  1. Introducing fellows to red-teaming.
  2. 4 weeks of background reading and discussions on AI governance, lethal autonomous weapons, and related policy considerations.
  3. 2 full-day red-teaming sprints scrutinizing the hypothetical report “Why EAs should work on bans of autonomous lethal weapons”.
  4. Writing up the results of the sprints, receiving and incorporating feedback from organizers and volunteer EA researchers, and finally sharing the write-up with potentially interested parties such as FLI, possibly also posting it on the EA forum.

Why?

Who might organize this?

More concrete thoughts on implementation

Participants

Structure

The first part of the red-teaming fellowship would be structured like a normal EA introductory fellowship, only focused on specific background reading material for the red-teaming exercise.

Desiderata for the red-teaming challenge

Some off-the-cuff examples for topics

Some reservations

We have some concerns about potential downside risks of this idea, though they can probably be averted with input from experienced community members.

Other thoughts


AppliedDivinityStudies @ 2022-02-03T10:07 (+28)

One really useful way to execute this would be to bring in more outside, non-EA experts in relevant disciplines. For example, have people in development econ evaluate GiveWell (great example of this here), engage people like Glen Weyl to see how EA could better incorporate market-based thinking and mechanism design, engage hardcore anti-natalist philosophers (if you can find a credible one), engage anti-capitalist theorists skeptical of welfare and billionaire philanthropy, etc.

One specific pet project I'd love to see funded is more EA history. There are plenty of good, legitimate expert historians, and we should be commissioning them to write, for example, on the history of philanthropy (Open Phil did a bit here), on the causes of past civilizations' ruin, on the intellectual history of morality and how ideas have progressed over time, and so on. I think there's a ton to dig into here, and that history is generally underestimated as a perspective (you can't just read a couple of secondary sources and call it a day).

cryptograthor @ 2022-04-06T23:44 (+4)

Hi there. This thread was advertised to me by a friend of mine, and I thought I would leave a comment. In the spirit of red teaming: I'm a cryptography engineer with work in multi-party computation, consensus algorithms, and, apologetically, NFTs. I've done previous work with the Near grants board evaluating technical projects, and I co-maintain the https://zkmesh.substack.com/ monthly newsletter on cryptography. I could have time to contribute to a red-teaming effort, and to idea shaping and evaluation for projects along these lines, if the project comes into existence.

MaxRa @ 2022-04-07T00:39 (+2)

Hi! That’s really cool! I haven’t yet seen anyone with actual OG red-teaming experience involved in the discussion here!

The team at Training for Good are currently running with the idea and I imagine would be happy about hearing from you: https://forum.effectivealtruism.org/posts/DqBEwHqCdzMDeSBct/apply-for-red-team-challenge-may-7-june-4

AppliedDivinityStudies @ 2022-02-03T09:58 (+16)

This is a good idea, but I think you might find that there's surprisingly little EA consensus. What's the likelihood that this is the most important century? Should we be funding near-term health treatments for the global poor, or does nothing really matter aside from AI Safety? Is the right ethics utilitarian? Person-affecting? Should you even be a moral realist?

As far as I can tell, EAs (meaning both the general population of uni club attendees and EA Forum readers, alongside the "EA elite" who hold positions of influence at top EA orgs) disagree substantially amongst themselves on all of these really fundamental and critical issues.

What EAs really seem to have in common is an interest in doing the most good, thinking seriously and critically about what that entails, and then actually taking those ideas seriously and executing on them. As Helen once put it, Effective Altruism is a question, not an ideology.

So I think this could be valuable in theory, but I don't think your off-the-cuff examples do a good job of illustrating the potential here. For pretty much everything you list, I'm pretty confident that many EAs already disagree, and that these are not actually matters of group-think or even local consensus.

Finally, I think there are questions which are tricky to red-team because of how much conversation around them is private, undocumented, or otherwise obscured. So if you were conducting this exercise, I don't think it would make sense as an entry-level thing, I think you would have to find people who are already fairly knowledgeable. 

MaxRa @ 2022-02-03T10:49 (+12)

Thanks, those are good points, especially when the focus is on making progress on issues that might be affected by group-think. Relatedly, I also like your idea of getting outside experts to scrutinize EA ideas. I've seen Open Phil pay for expert feedback on at least one occasion, which seems pretty useful.

We were thinking about writing a question post along something like "Which ideas, assumptions, programmes, interventions, priorities, etc. would you like to see a red-teaming effort for?". What do you think about the idea, and would you add something to the question to make it more useful?

And I think what your comment neglects is the value of:

  • having this fellowship serve only as a first stepping-stone for bigger projects in the future (by instilling habits & skills and highlighting the value of similar investigations)
  • having fellows work on a more serious research project together, which will build stronger ties among them relative to discussion groups and, I expect, lead to deeper engagement with the ideas

vaidehi_agarwalla @ 2022-02-03T01:14 (+13)

I really like this idea and would love to organise and participate in such a fellowship!  

To address this concern:

  "Similarly, this might make EA look unwelcoming and uncooperative from the outside."

It might be better to avoid calling it "red-teaming". According to Wikipedia, red teams are used in "cybersecurity, airport security, the military, and intelligence agencies", so the connotations of the word are probably not great.

AppliedDivinityStudies @ 2022-02-03T10:02 (+6)

I agree that it's important to ask the meta questions about which pieces of information even have high moral value to begin with. OP gives, as an example, the moral welfare of shrimps. But who cares? EA already puts so little money and effort into this, even on the assumption that they probably are valuable. Even if you demonstrated that they weren't, or forced an update in that direction, the overall amount of funding shifted would be fairly small.

You might worry that all the important questions are already so heavily scrutinized as to leave little low-hanging fruit, but I don't think that's true. EAs are easily nerd-sniped, and there isn't any kind of "efficient market" for prioritizing high-impact questions. There's also a bit of intimidation here: it feels wrong to challenge someone like MacAskill or Bostrom on really critical philosophical questions. But that's precisely where we should be focusing more attention.

MichaelA @ 2022-02-04T13:03 (+7)

Other things I imagine the authors or some readers might find interesting:

MichaelA @ 2022-02-04T13:01 (+7)

Thanks for this post!

Quick thoughts:

Aaron Gertler @ 2022-02-09T09:46 (+6)

I was surprised that this post didn't define "red-teaming" in the introduction or summary. The concept isn't especially well-known, and many of the examples that come up in an online search specifically involve software engineering or issues of physical security (rather than something like "critical reading").

Might be good to add a definition to this, perhaps based on the one Linch gave in the Shortform you link to.

MaxRa @ 2022-02-09T14:06 (+4)

Thanks, good catch, I added the definition from the red-teaming tag.

Aaron_Scher @ 2022-02-03T20:07 (+4)

Thanks for writing this up. It seems like a good idea, and you address what I view as the main risks. I think that (contingent on a program like this going well) there is a pretty good chance that it would generate useful insights (Why #3). This seems particularly important to me for a couple of reasons.

  1. Having better ideas and quality scrutiny = good
  2. Relatively new EAs who do a project like this and have their work be received as meaningful/valuable would probably feel much more accepted/wanted in the community 

I would therefore add what I think is helpful structure, with the goal of increasing the chances that a project like this generates useful insights. In your Desiderata, you mention:

“Red-teaming targets should ideally be actual problems from EA researchers who would like to have an idea/approach/model/conclusion/… red-teamed against.” 

I propose a stronger view here: topics are chosen in conjunction with EA researchers or community members who want a specific idea/approach/model/conclusion/… red-teamed against and who agree to provide feedback at the end. Setting up this relationship from the beginning seems important if you actually want the right people to read your report. With a less structured format, I'm worried that folks might construct decent arguments or concerns in their red-team write-up, but that nobody, or not the right people, would read them, making the effort useless.

Note 1: Maybe researchers are really busy, so this is actually "I will provide feedback on a 2-page summary".

Note 2: Asking people what they want red-teamed is maybe a little ironic when a goal is good epistemic norms. This makes me quite uncertain that this is a useful approach, though it might be that researchers are okay providing feedback on anything. Either way, it seems like one way of increasing the chances that projects like this have actual impact.

This idea makes me really excited because I would love to do this!

I agree that this gets around most of the issues with paying program participants.