Neel Nanda

I lead the DeepMind mechanistic interpretability team.

Posts

Concrete open problems in mechanistic interpretability: a technical overview
by Neel Nanda @ 2023-07-06 | +27 | 0 comments
Concrete Steps to Get Started in Transformer Mechanistic Interpretability
by Neel Nanda @ 2022-12-26 | +18 | 0 comments
A Barebones Guide to Mechanistic Interpretability Prerequisites
by Neel Nanda @ 2022-11-29 | +54 | 0 comments
An Extremely Opinionated Annotated List of My Favourite Mechanistic...
by Neel Nanda @ 2022-10-18 | +19 | 0 comments
Concrete Advice for Forming Inside Views on AI Safety
by Neel Nanda @ 2022-08-17 | +58 | 0 comments
Things That Make Me Enjoy Giving Career Advice
by Neel Nanda @ 2022-06-17 | +34 | 0 comments
How I Formed My Own Views About AI Safety
by Neel Nanda @ 2022-02-27 | +134 | 0 comments
Simplify EA Pitches to "Holy Shit, X-Risk"
by Neel Nanda @ 2022-02-11 | +185 | 0 comments
My Overview of the AI Alignment Landscape: A Bird’s Eye View
by Neel Nanda @ 2021-12-15 | +45 | 0 comments
Optimisation-focused introduction to EA podcast episode
by Neel Nanda @ 2021-01-15 | +8 | 0 comments
Retrospective on Teaching Rationality Workshops
by Neel Nanda @ 2021-01-03 | +42 | 0 comments
Local Group Event Idea: EA Community Talks
by Neel Nanda @ 2020-12-20 | +26 | 0 comments
Make a Public Commitment to Writing EA Forum Posts
by Neel Nanda @ 2020-11-18 | +21 | 0 comments
Helping each other become more effective
by Neel Nanda @ 2020-10-30 | +10 | 0 comments
What altruism means to me
by Neel Nanda @ 2020-08-15 | +14 | 0 comments
The world is full of wasted motion
by Neel Nanda @ 2020-08-05 | +21 | 0 comments