Air-gapping evaluation and support

By Ryan Kidd @ 2022-12-26T22:52 (+22)

calebp @ 2022-12-27T00:13 (+8)

I feel like, surprisingly often within EA, the evaluation of people/orgs is not adversarial. I've heard of lots of cases of people being very transparent with hiring managers (as they are very keen for the manager to make a good decision), or being very transparent with funders because the applicant wants to know whether their project is worthwhile on the fund's view.

I am not sure how cruxy this is for the claim that support should be air-gapped by default, but it seemed like the fact that people want the air gap most of the time was, on your view, fairly important to the key argument of the post.

Linda Linsefors @ 2022-12-28T11:37 (+6)

I used to do this, i.e. try to be super open about everything. Not anymore. The reason is the information bottleneck. There is no way I can possibly transmit all relevant information, and my experience with funding evaluation (and some second-hand anecdotes) is that I don't get a chance to clear up any misconceptions. So if I have some personal issue that someone might think would interfere with my job, but which I have strong reason to think would not be a problem for complicated reasons, then I would just keep quiet about it to funders and other evaluators.

Sure, in a perfect world where there were no information constraints (and also assuming everyone is aligned), revealing everything is an optimal policy. But this is not the world we live in.

Ryan Kidd @ 2022-12-27T01:31 (+2)

Yeah, I think that EA is far better at encouraging and supporting disclosure to evaluators than, for example, private industry. I also think EAs are more likely to genuinely report their failures (and I take pride in doing this myself, to the extent I'm able). However, I feel that there is still room for more support in the EA community that is decoupled from evaluation, for individuals who might benefit from this.

Linda Linsefors @ 2022-12-28T11:32 (+5)

"I think the EA and AI safety communities could benefit from more confidential support roles, like the CEA community health team"

They are not air-gapped!

https://forum.effectivealtruism.org/posts/NbkxLDECvdGuB95gW/the-community-health-team-s-work-on-interpersonal-harm-in?commentId=vBxnPpQ9jydv5KEmB

 

On the other hand, Shay (AI Safety Support - Health Coach) is air-gapped.

I'm also pretty sure AISS's job coaching is air-gapped too, but I'm only 90% sure. I'll ping JJ to ask.

calebp @ 2022-12-27T00:09 (+4)

I mostly agree with this post, though I can think of some concrete cases where I'm more confused.

I think in most of the cases that come to mind, the support services are already pretty air-gapped from the evaluators. Can you point to some examples where you think the gap is insufficient, or is this mostly pointing at a feature you'd like to see in support services?

Linch @ 2022-12-27T02:54 (+4)

I think of "management responsibilities" as very much putting on both hats. Though this seems like a very normal arrangement in broader society, so presumably not what Ryan is pointing towards.

Ryan Kidd @ 2022-12-27T03:01 (+1)

In my management role, I have to juggle these responsibilities. I think an HR department should generally exist, even if management is really fair and only wants the best for the world, we promise (not bad faith, just humour).

D0TheMath @ 2022-12-27T18:07 (+1)

It would not surprise me if most HR departments were set up as the result of political pressure from various special interests within orgs, and if they were mostly useless at their "support" role.

With more confidence, I’d guess a smart person could think of a far better way to do support that looks nothing like an HR department.

I think MATS would be far better served by ignoring the HR frame and just trying to rederive the properties of an org that does support well. The above post looks like a good start, but it'd be a shame if you all just went with a human resources department. Traditional companies do not, in fact, seem like they would be good at the thing you are talking about here.

Unless there are some weird incentives I know nothing about, effective community support is the kind of thing you should expect to do better than all of civilization at, if you are willing to think about it from first principles for 10 minutes.

Ryan Kidd @ 2022-12-27T18:28 (+2)

I'm not advocating a stock HR department with my comment. I used "HR" as a shorthand for "community health agent who is focused on support over evaluation." This is why I didn't refer to HR departments in my post. Corporate HR seems flawed in obvious ways, though I think it's probably usually better than nothing, at least for tail risks.

Ryan Kidd @ 2022-12-27T01:34 (+1)

This post is mainly explaining part of what I'm currently thinking about regarding community health in EA and at MATS. If I think of concrete, shareable examples of concerns regarding insufficient air-gapping in EA or AI safety, I'll share them here.

NunoSempere @ 2022-12-26T23:03 (+2)

I think that what you are pointing to is a real distinction. I'd also point at:

In principle, evaluations of the first kind could also be super invasive, and it might be in the interest of the candidate for them to be so.

Just wanted to add that, since your post doesn't quite contemplate evaluations of the first kind.

Ryan Kidd @ 2022-12-26T23:10 (+3)

I think the distinction you make is real. In the language of this post, I consider the first type of evaluation you mention a form of "support." Whether someone desires comfort or criticism, they might prefer this to be decoupled from evaluation that might disadvantage them.