
Intelligence Briefing
Leads the mechanistic interpretability team at Google DeepMind, working to reverse-engineer the algorithms learned by neural networks. Named to MIT Technology Review's list of innovators, which called mechanistic interpretability a "breakthrough technology for 2026." Published Gemma Scope, a collection of 400+ sparse autoencoders for analyzing Gemma models. His perspective has evolved from "low chance of incredibly big deal" to "high chance of medium big deal"; he now warns that the most ambitious vision of mechanistic interpretability may be unattainable, and advocates "pragmatic interpretability" over theoretical purity. Previously worked at Anthropic under Chris Olah.
BA, Pure Mathematics — University of Cambridge
Operational History
Named Innovator by MIT Technology Review
Recognized for contributions to mechanistic interpretability as a breakthrough technology.
Published Gemma Scope
Released a collection of 400+ sparse autoencoders for analyzing Gemma models.
Joined Google DeepMind
Became the Mechanistic Interpretability Team Lead.
Shifted Focus to Pragmatic Interpretability
Evolved perspective on mechanistic interpretability, advocating for practical applications.
Worked at Anthropic
Conducted research on language model interpretability under Chris Olah.
Interned at DeepMind
Gained experience in AI research and development.
Interned at Centre for Human-Compatible AI
Focused on AI safety and alignment research.
Interned at Future of Humanity Institute
Engaged in research on the long-term impacts of AI.
AGI Position Assessment
Unknown
Committed to AI safety through interpretability research. Has become more measured about what mechanistic interpretability can achieve, pivoting toward practical safety applications rather than full theoretical understanding of models.
Intercepted Communications
“Mechanistic interpretability is a breakthrough technology for 2026.”
“The most ambitious vision of mechanistic interpretability may be unattainable.”
“I advocate for pragmatic interpretability over theoretical purity.”
“My perspective has evolved from a low chance of incredibly big deal to a high chance of medium big deal.”
“AI safety must be grounded in practical applications.”
Known Associates
Chris Olah
Mentored Neel Nanda during his time at Anthropic.
Anthropic Team
Collaborated with the team on language model interpretability research.
DeepMind Team
Worked with various researchers at DeepMind during his internship.
Centre for Human-Compatible AI Team
Collaborated on AI safety research during his internship.
Organizational Affiliations
Current
Google DeepMind
Mechanistic Interpretability Team Lead
2024–Present
Former
Anthropic
Language Model Interpretability Researcher
2022-2024
DeepMind
Research Intern
2021
Source Material
Dossier last updated: 2026-03-04