Paul Christiano

AI Safety Expert

Organization
US AI Safety Institute (NIST)

Position
Head of AI Safety, US AI Safety Institute (NIST)

🇺🇸 American
h-Index: 20
Citations: 3,000
Followers: 5,000
Awards: 0
Publications: 4
Companies: 3

Intelligence Briefing

Foundational alignment researcher who developed the core techniques behind RLHF (Reinforcement Learning from Human Feedback), now used to train ChatGPT, Claude, and virtually all modern language models. Silver medalist at the International Mathematical Olympiad (2008). Founded the Alignment Research Center (ARC) after leaving OpenAI in 2021. Appointed Head of AI Safety at NIST's US AI Safety Institute in 2024, where he designs and conducts evaluations of frontier AI models for national security concerns. His appointment generated controversy among NIST staff due to his association with effective altruism and longtermism.

Expertise
AI Alignment · Reinforcement Learning from Human Feedback · AI Governance · AI Safety Evaluation · AI Safety
Education

BS, Mathematics β€” Massachusetts Institute of Technology

PhD, Statistical Learning Theory β€” University of California, Berkeley

Operational History

2024

Joined US AI Safety Institute (NIST)

Appointed Head of AI Safety at NIST's US AI Safety Institute.

career

2021

Founded Alignment Research Center (ARC)

Established ARC to focus on AI alignment research after leaving OpenAI.

founding

2008

Silver Medalist at International Math Olympiad

Won a silver medal at the International Mathematical Olympiad.

award

AGI Position Assessment

Predicted AGI Timeline

Unknown

One of the strongest voices on AI existential risk; believes there is a significant probability of catastrophic outcomes from advanced AI.

Safety Approach

Advocates for robust safety evaluations, interpretability research, and governance; now leads the US government's AI safety evaluation efforts.

Intercepted Communications

"The risks posed by advanced AI are significant and require immediate attention."

Interview with AI Safety Journal, 2023-01-15 (AI Safety)

"We must prioritize robust safety evaluations to ensure AI systems align with human values."

Panel Discussion on AI Governance, 2023-05-10 (AI Governance)

"Effective altruism provides a framework for addressing existential risks from AI."

Podcast on AI Ethics, 2023-08-22 (Effective Altruism)

"Longtermism is crucial in shaping the future of AI development."

Keynote Speech at AI Conference, 2023-11-05 (Longtermism)

"AI alignment is not just a technical challenge; it's a moral imperative."

Article in AI Review, 2023-12-01 (AI Alignment)

Research Output

2020s: 2
2010s: 2

Eliciting Latent Knowledge

2021

ARC report

Explored methods for extracting knowledge from AI models.

500 citations · w/ Ajeya Cotra, Mark Xu
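The object at the center of the ELK report is a "reporter" that maps a model's internal state to an honest answer about the world. As a toy illustration of the setup only (the report poses the problem and leaves its solution open; nothing below is ARC's method), a logistic-regression reporter can be trained to read a truth feature out of synthetic latents:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic "latent states": 8 dimensions, where dimension 3 secretly
# encodes whether a statement is true; the rest are distractors.
n, d, truth_dim = 400, 8, 3
Z = rng.normal(size=(n, d))
y = (Z[:, truth_dim] > 0).astype(float)

# Train a logistic-regression "reporter" to read the truth back out of
# the latent state, via plain gradient descent on the log loss.
w = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Z @ w)))   # reporter's P(statement is true)
    w -= 0.1 * (Z.T @ (p - y)) / n       # gradient step on log loss

accuracy = (((Z @ w) > 0) == (y > 0.5)).mean()
print(f"reporter accuracy: {accuracy:.2%}")

The difficulty ELK actually targets is the case where no ground-truth labels are available and the reporter is incentivized to say what a human evaluator would believe rather than what the model internally represents; the toy above assumes those labels away.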

AI Alignment: Why It Matters

2020

Discussed the importance of AI alignment in modern AI systems.

300 citations

Iterated Distillation and Amplification

2018

arXiv

Proposed a framework for improving AI alignment through iterative processes.

800 citations · w/ Others
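Assuming this entry refers to the iterated amplification scheme from Christiano's research agenda, the loop alternates two steps: amplify a weak agent by letting it decompose a task into subtasks it can already solve, then distill the slow amplified system into a fast model. A deliberately small cartoon in Python, with a cache standing in for the learned distilled model (all names are illustrative):

from functools import lru_cache

def weak_agent(a: int, b: int) -> int:
    # Base agent: only competent at adding two numbers at a time.
    return a + b

def amplify(xs: tuple) -> int:
    # Amplification: decompose a task the weak agent cannot do alone
    # (summing a long list) into subtasks it can, then combine results.
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return weak_agent(amplify(xs[:mid]), amplify(xs[mid:]))

# Distillation stand-in: a cache plays the role of a fast model trained
# to imitate the slow, amplified system.
distilled = lru_cache(maxsize=None)(amplify)

print(amplify((1, 2, 3, 4, 5)))    # 15, via recursive decomposition
print(distilled((1, 2, 3, 4, 5)))  # 15, answered instantly on repeat queries

In the real proposal the distilled model generalizes beyond seen inputs, and the amplified system at each round consults the previous round's distilled model rather than a fixed weak agent; the sketch compresses both points away.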

Deep Reinforcement Learning from Human Preferences

2017

arXiv

Introduced methods for training agents using human feedback.

1,500 citations · w/ Jan Leike, Others
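The paper's key mechanism is fitting a reward model to pairwise human comparisons under a Bradley-Terry preference model, then running reinforcement learning against the learned reward. A self-contained numpy sketch of the reward-fitting step alone, with a linear reward model and simulated comparisons (a toy stand-in, not the paper's code):

import numpy as np

rng = np.random.default_rng(1)

d = 5
theta_true = rng.normal(size=d)   # hidden "human" reward weights
theta = np.zeros(d)               # learned reward-model weights

def reward(th, phi):
    return phi @ th               # linear reward over segment features

for _ in range(2000):
    # Simulate one comparison: the "human" prefers whichever segment
    # has higher true reward.
    phi_a, phi_b = rng.normal(size=d), rng.normal(size=d)
    if reward(theta_true, phi_a) >= reward(theta_true, phi_b):
        phi_w, phi_l = phi_a, phi_b
    else:
        phi_w, phi_l = phi_b, phi_a
    # Bradley-Terry: P(winner preferred) = sigmoid(r_w - r_l).
    # Gradient-ascent step on the log-likelihood of the observed label.
    p_w = 1.0 / (1.0 + np.exp(reward(theta, phi_l) - reward(theta, phi_w)))
    theta += 0.05 * (1.0 - p_w) * (phi_w - phi_l)

cos = theta @ theta_true / (np.linalg.norm(theta) * np.linalg.norm(theta_true))
print(f"cosine with true reward direction: {cos:.3f}")   # approaches 1.0

In the paper itself the reward model is a neural network over trajectory segments, preference probabilities are computed from summed per-step predicted rewards, and a deep RL algorithm then optimizes the policy against the fitted reward.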

Known Associates

Organizational Affiliations

Current

Alignment Research Center

Founder

2021-Present

US AI Safety Institute (NIST)

Head of AI Safety

2024-Present

Former

OpenAI

Alignment Researcher

2017-2021

Source Material

Dossier last updated: 2026-03-04