
Intelligence Briefing
Co-founded Anthropic and pioneered the field of neural network interpretability. Left university at 18 without a degree and received a Thiel Fellowship. Despite lacking formal credentials, his blog posts (colah.github.io) are assigned reading at MIT, Stanford, and other top universities, and his "Understanding LSTM Networks" post is among the most-read technical AI pieces ever written. At Anthropic, his team discovered interpretable "features" inside Claude models and demonstrated that they can be selectively activated or deactivated (e.g., the famous "Golden Gate Bridge" feature). Anthropic CEO Dario Amodei has set a goal of reliably detecting most AI model problems by 2027, driven largely by Olah's interpretability research.
Attended (no degree), Computer Science — University of Toronto
Operational History
Continued AI Safety Advocacy (career)
Continues to advocate for AI safety through interpretability and transparency in AI systems.
AI Safety Goals (policy)
Anthropic set a goal to reliably detect most AI model problems by 2027, influenced by Olah's research.
Mechanistic Interpretability Research (research)
Published research on the mechanistic interpretability of AI models, contributing to the understanding of AI decision-making.
Golden Gate Bridge Neuron Discovery (research)
Led a team that discovered a specific neuron in Claude models that activates for images of the Golden Gate Bridge.
Co-founder of Anthropic (founding)
Co-founded Anthropic, an AI safety and research company focused on developing reliable AI systems.
Research Scientist at Google Brain (career)
Joined Google Brain as a research scientist, focusing on neural network interpretability.
Understanding LSTM Networks (research)
Published a widely-read blog post explaining LSTM networks that became a key resource in AI education.
Thiel Fellowship (career)
Received the Thiel Fellowship, which supports young people pursuing scientific research and entrepreneurship.
AGI Position Assessment
Unknown
Deeply committed to AI safety through interpretability. Believes understanding what happens inside neural networks is critical for making AI safe. His work is the foundation of Anthropic's safety research agenda.
Intercepted Communications
“Understanding what happens inside neural networks is critical for making AI safe.”
“The goal of our research is to make AI systems that are interpretable and reliable.”
“The discovery of the Golden Gate Bridge neuron shows how specific features can be activated in neural networks.”
“We need to ensure that AI systems align with human values and intentions.”
“Interpretability is not just a nice-to-have; it's essential for the future of AI.”
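The selective feature activation referenced in the quotes above can be illustrated with a toy sketch. This is a hypothetical illustration, not Anthropic's actual method or code: the hidden size, the steering coefficient, and the `steer` helper are all invented, and the "feature direction" is a random stand-in for one recovered by dictionary learning. The core idea is simply adding (or subtracting) a scaled feature direction inside a model's hidden activations.

```python
import numpy as np

# Toy illustration of feature steering (hypothetical; not Anthropic's code).
# A "feature" here is a direction in activation space, e.g. one recovered
# by a sparse autoencoder. Amplifying it biases the model toward that
# concept; projecting it out suppresses the concept.

rng = np.random.default_rng(0)
d_model = 16  # invented hidden size for the sketch

# Pretend this unit vector is the learned "Golden Gate Bridge" direction.
feature = rng.normal(size=d_model)
feature /= np.linalg.norm(feature)

def steer(hidden: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Add a scaled feature direction to a hidden activation vector."""
    return hidden + strength * direction

hidden = rng.normal(size=d_model)               # some intermediate activation
boosted = steer(hidden, feature, 5.0)           # selectively activate the feature
muted = steer(hidden, feature, -hidden @ feature)  # project the feature out

print(f"projection before:      {hidden @ feature:+.3f}")
print(f"projection after boost: {boosted @ feature:+.3f}")
print(f"projection after mute:  {muted @ feature:+.3f}")  # ~0
```

In the published demo ("Golden Gate Claude"), the analogous intervention made the model fixate on the Golden Gate Bridge across unrelated prompts, while clamping the feature the other way suppressed the concept.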
Research Output
Understanding AI Decisions (2023)
Focuses on the need for transparency in AI decision-making.
Towards Safer AI Systems (2023, Anthropic Research)
Research paper on developing safer AI systems.
AI Safety through Interpretability (2022, Anthropic Blog)
Discusses the importance of interpretability in AI safety.
Mechanistic Interpretability of AI Models (2021, arXiv)
Explores the mechanistic interpretability of AI models.
The Golden Gate Bridge Neuron (2021)
Describes the discovery of a neuron that activates for images of the Golden Gate Bridge.
Circuits in Neural Networks (2020, arXiv)
Research on understanding the internal circuits of neural networks.
Neural Network Feature Visualization (2016, arXiv)
Pioneering work in visualizing features learned by neural networks.
Understanding LSTM Networks (2015)
Highly influential blog post that explains LSTM networks.
Known Associates
Dario Amodei (co-founder)
Co-founder of Anthropic and collaborator on AI safety research.
Matthew Zeiler (collaborator)
Collaborated on neural network feature visualization research.
Jack Clark (collaborator)
Worked together on understanding circuits in neural networks.
Demis Hassabis (rival)
CEO of DeepMind, competing in the AI research space.
Organizational Affiliations
Current
Anthropic
Co-founder, Anthropic
2021 - Present
Former
Google Brain
Research Scientist
2016 - 2020
Google DeepMind
Researcher
2015 - 2016
Source Material
Dossier last updated: 2026-03-04