Neel Nanda
I lead the DeepMind mechanistic interpretability team
Posts
Concrete open problems in mechanistic interpretability: a technical overview
by Neel Nanda @ 2023-07-06 | +27 | 0 comments
Concrete Steps to Get Started in Transformer Mechanistic Interpretability
by Neel Nanda @ 2022-12-26 | +18 | 0 comments
A Barebones Guide to Mechanistic Interpretability Prerequisites
by Neel Nanda @ 2022-11-29 | +54 | 0 comments
An Extremely Opinionated Annotated List of My Favourite Mechanistic...
by Neel Nanda @ 2022-10-18 | +19 | 0 comments
My Overview of the AI Alignment Landscape: A Bird’s Eye View
by Neel Nanda @ 2021-12-15 | +45 | 0 comments
Optimisation-focused introduction to EA podcast episode
by Neel Nanda @ 2021-01-15 | +8 | 0 comments