Model evals for dangerous capabilities

By Zach Stein-Perlman @ 2024-09-23T11:00

This is a crosspost, probably from LessWrong. Try viewing it there.
