Who Aligns the Alignment Researchers?
By ben.smith @ 2023-03-05T23:22 (+23)
This is a crosspost, probably from LessWrong. Try viewing it there.
more better @ 2023-03-06T03:51 (+4)
ben.smith @ 2023-03-06T05:08 (+6)
Can you describe exactly how much you think the average person, or average AI researcher, is willing to sacrifice on a personal level for a small chance at saving humanity? Are they willing to halve their income for the next ten years? Reduce by 90%?
I think in a world where there was a top-down societal effort to reduce alignment risk, you might see different behavior. In the current world, I think the "personal choice" framework really is how it works because (for better or worse) there are not (yet) strong moral or social values attached to capability vs. safety work.
NickLaing @ 2023-03-06T05:30 (+2)
What makes you think that the average person rates saving humanity highly enough to make alignment research worth doing rather than capabilities work? That seems like a pretty optimistic assumption, in my experience. Most people I know would definitely take a small-to-moderate amount of extra money over doing more valuable work for humanity. Building something could also feel more rewarding than safety work.
Maybe I'm missing something; what assumptions do you think that statement makes?
more better @ 2023-03-06T11:49 (+7)
Maybe my comment is off base, since your article is specifically about AI alignment vs. capabilities research and I was taking the single sentence I quoted out of context. Will remove.