Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility

This is a crosspost, probably from LessWrong. Try viewing it there.

Emrik @ 2022-11-23T14:56 (+5)

Just wanted to say, I love the 500-word limit. A contest that doesn't goodhart on effort moralization!