Demonstrating specification gaming in reasoning models
By Matrice Jacobine @ 2025-02-20T19:26 (+9)
This is a linkpost to https://arxiv.org/pdf/2502.13295
This is a crosspost, probably from LessWrong. Try viewing it there.