
Intelligence Briefing
Leads the mechanistic interpretability team at Google DeepMind, working to reverse-engineer the algorithms learned by neural networks. Named to MIT Technology Review's list of innovators, which called mechanistic interpretability a "breakthrough technology for 2026." Published Gemma Scope, a collection of 400+ sparse autoencoders for analyzing Gemma models. His perspective has evolved from "low chance of incredibly big deal" to "high chance of medium big deal"; he now warns that the most ambitious vision of mechanistic interpretability may be unattainable, and advocates "pragmatic interpretability" over theoretical purity. Previously worked at Anthropic under Chris Olah.
BA, Pure Mathematics — University of Cambridge
Operational History
Named Innovator by MIT Technology Review
Recognized for contributions to mechanistic interpretability as a breakthrough technology.
Published Gemma Scope
Released a collection of 400+ sparse autoencoders for analyzing Gemma models.
Joined Google DeepMind
Became the Mechanistic Interpretability Team Lead.
Shifted Focus to Pragmatic Interpretability
Evolved perspective on mechanistic interpretability, advocating for practical applications.
Worked at Anthropic
Conducted research on language model interpretability under Chris Olah.
Interned at DeepMind
Gained experience in AI research and development.
Interned at Centre for Human-Compatible AI
Focused on AI safety and alignment research.
Interned at Future of Humanity Institute
Engaged in research on the long-term impacts of AI.
AGI Position Assessment
Unknown
Committed to AI safety through interpretability research. Has become more measured about what mechanistic interpretability can achieve, pivoting toward practical safety applications rather than full theoretical understanding of models.
Intercepted Communications
“Mechanistic interpretability is a breakthrough technology for 2026.”
“The most ambitious vision of mechanistic interpretability may be unattainable.”
“I advocate for pragmatic interpretability over theoretical purity.”
“My perspective has evolved from a low chance of incredibly big deal to a high chance of medium big deal.”
“AI safety must be grounded in practical applications.”
Known Associates
Chris Olah
Mentored Neel Nanda during his time at Anthropic.
Anthropic Team
Collaborated with the team on language model interpretability research.
DeepMind Team
Worked with various researchers at DeepMind during his internship.
Centre for Human-Compatible AI Team
Collaborated on AI safety research during his internship.
Organizational Affiliations
Current
Google DeepMind
Mechanistic Interpretability Team Lead
2024–Present
Former
Anthropic
Language Model Interpretability Researcher
2022-2024
DeepMind
Research Intern
2021
Source Material
Dossier last updated: 2026-03-04