
Intelligence Briefing
Foundational alignment researcher who developed the core techniques behind RLHF (Reinforcement Learning from Human Feedback), now used to train ChatGPT, Claude, and virtually all modern language models. Silver medalist at the International Math Olympiad (2008). Founded the Alignment Research Center (ARC) after leaving OpenAI in 2021. Appointed Head of AI Safety at NIST's US AI Safety Institute, where he designs and conducts national-security evaluations of frontier AI models. His appointment drew controversy among NIST staff over his association with effective altruism and longtermism.
BS, Mathematics – Massachusetts Institute of Technology
PhD, Statistical Learning Theory – University of California, Berkeley
Operational History
Founded Alignment Research Center (ARC) (founding)
Established ARC to focus on AI alignment research after leaving OpenAI.
Joined US AI Safety Institute (NIST) (career)
Appointed as Head of AI Safety at NIST's US AI Safety Institute.
Silver Medalist at International Math Olympiad (award)
Won a silver medal at the prestigious International Math Olympiad.
AGI Position Assessment
Unknown
One of the strongest voices for AI existential risk. Believes there is a significant probability of catastrophic outcomes from advanced AI. Advocates for robust safety evaluations, interpretability, and governance. Now leads US government AI safety evaluation efforts.
Intercepted Communications
"The risks posed by advanced AI are significant and require immediate attention."
"We must prioritize robust safety evaluations to ensure AI systems align with human values."
"Effective altruism provides a framework for addressing existential risks from AI."
"Longtermism is crucial in shaping the future of AI development."
"AI alignment is not just a technical challenge; it's a moral imperative."
Research Output
Eliciting Latent Knowledge
2021 · arXiv
Explored methods for extracting knowledge from AI models.
AI Alignment: Why It Matters
2020
Discussed the importance of AI alignment in modern AI systems.
Iterated Distillation and Amplification
2018 · arXiv
Proposed a framework for improving AI alignment through iterative processes.
Deep Reinforcement Learning from Human Preferences
2017 · arXiv
Introduced methods for training agents using human feedback.
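The core technique in that line of work can be sketched as fitting a reward model to pairwise human preferences: a human compares two trajectory segments, and the model is trained so the preferred segment gets higher predicted reward, via a Bradley-Terry (logistic) loss. The toy example below is only an illustration under invented assumptions — a linear per-step reward, random feature vectors, and synthetic "human" labels generated from a known true reward — not the paper's actual implementation.

```python
# Toy sketch of reward learning from pairwise preferences (Bradley-Terry).
# All data and the linear reward model here are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

def predicted_reward(w, segment):
    """Reward of a trajectory segment = sum of per-step linear rewards."""
    return sum(w @ s for s in segment)

def preference_loss(w, pairs):
    """Average negative log-likelihood of the human's choices, where
    P(prefer A over B) = sigmoid(r(A) - r(B))."""
    loss = 0.0
    for seg_a, seg_b, prefers_a in pairs:
        diff = predicted_reward(w, seg_a) - predicted_reward(w, seg_b)
        p_a = 1.0 / (1.0 + np.exp(-diff))
        loss += -np.log(p_a if prefers_a else 1.0 - p_a)
    return loss / len(pairs)

# Synthetic data: the "true" reward cares only about the first feature,
# and the simulated human labels each pair accordingly.
true_w = np.array([1.0, 0.0])
def make_segment():
    return [rng.normal(size=2) for _ in range(5)]
pairs = []
for _ in range(200):
    a, b = make_segment(), make_segment()
    pairs.append((a, b, predicted_reward(true_w, a) > predicted_reward(true_w, b)))

# Fit w with finite-difference gradient descent (keeps the sketch dependency-free).
w = np.zeros(2)
for _ in range(100):
    grad = np.zeros(2)
    for i in range(2):
        e = np.zeros(2)
        e[i] = 1e-5
        grad[i] = (preference_loss(w + e, pairs) - preference_loss(w - e, pairs)) / 2e-5
    w -= 0.5 * grad

print(w)  # the first component should dominate, recovering the true reward's direction
```

The same pairwise-comparison loss underlies the reward models used in modern RLHF pipelines; production systems differ mainly in scale and in using a neural network instead of this linear stand-in.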
Known Associates
Eliezer Yudkowsky (collaborator)
Collaborated on AI alignment research.
Jascha Stiennon (collaborator)
Worked together on RLHF projects.
OpenAI Team (colleague)
Former team member at OpenAI.
Robert Long (collaborator)
Collaborated on AI safety evaluations.
Organizational Affiliations
Current
Alignment Research Center
Founder
2021-Present
US AI Safety Institute (NIST)
Head of AI Safety
2024-Present
Former
OpenAI
Alignment Researcher
2018-2021
Source Material
Dossier last updated: 2026-03-04