← Back to Intelligence Dossier
Victoria Krakovna

Victoria Krakovna

AI Safety Researcher

Organization
Google DeepMind

Position
Research Scientist, Google DeepMind

🇷🇺🇨🇦Russian-Canadian
h-Index--
Citations--
Followers5000
Awards0
Publications8
Companies3

Intelligence Briefing

AI safety researcher at Google DeepMind and co-founder of the Future of Life Institute (FLI), the organization behind the famous open letters on AI risk. Her PhD at Harvard focused on building interpretable models. At DeepMind, she works on AI alignment including deceptive alignment detection, dangerous capability evaluations, specification gaming, goal misgeneralization, and avoiding side effects. Maintains a widely-referenced list of real-world examples of specification gaming in AI systems.

Expertise
AI SafetyAI AlignmentSpecification GamingSide Effects Avoidance
Education

BS, Statistics and MathematicsUniversity of Toronto

MS, StatisticsUniversity of Toronto

PhD, Statistics and Machine LearningHarvard University

Operational History

2025

Advocacy for AI Safety

Continued advocacy for AI safety through public speaking and writing.

policy
2024

Avoiding Side Effects in RL

Contributed to research on avoiding side effects in reinforcement learning agents.

research
2023

Research on Deceptive Alignment

Published research on detecting deceptive alignment in AI systems.

research
2022

Public Engagement on AI Safety

Participated in various public discussions and panels on AI safety and alignment.

policy
2021

Specification Gaming Database

Launched a widely-referenced database of real-world examples of specification gaming in AI systems.

research
2020

Research Scientist at Google DeepMind

Joined Google DeepMind as a Research Scientist focusing on AI safety and alignment.

career
2018

PhD Completion

Completed PhD in Statistics and Machine Learning at Harvard University.

career
2015

Co-founder of Future of Life Institute

Co-founded the Future of Life Institute to address existential risks from advanced technologies.

founding

AGI Position Assessment

Risk Level
LOW
MODERATE
HIGH
CRITICAL
Predicted AGI Timeline

Unknown

Strong advocate for AI safety research. Co-founded the Future of Life Institute to mitigate existential risks from advanced technology. Works on technical alignment to ensure AI systems behave as intended.

Safety Approach

Strong advocate for AI safety research. Co-founded the Future of Life Institute to mitigate existential risks from advanced technology. Works on technical alignment to ensure AI systems behave as intended.

Intercepted Communications

AI safety is not just a technical challenge; it's a moral imperative.

Public Interview2023-05-15AI Safety

We must ensure that AI systems align with human values to prevent unintended consequences.

Conference Keynote2024-09-10AI Alignment

Specification gaming is a critical area of research that can help us understand AI behavior.

Research Paper2022-11-01Specification Gaming

The future of AI depends on our ability to manage its risks effectively.

Podcast Interview2025-02-20AI Risk

Collaboration across disciplines is essential for advancing AI safety research.

Panel Discussion2023-08-30Collaboration

Research Output

2020s8

Technical Alignment in AI

2025

Journal of Machine Learning Research

Explores technical approaches to AI alignment.

The Role of Human Values in AI Systems

2024

Ethics in AI

Discusses the integration of human values in AI design.

Deceptive Alignment in AI

2023

Journal of AI Research

Discusses methods for detecting deceptive alignment in AI systems.

AI Safety and Alignment: A Review

2023

AI Safety Journal

Reviews current trends and challenges in AI safety and alignment.

Avoiding Side Effects in Reinforcement Learning

2022

Proceedings of the AAAI Conference

Explores strategies for minimizing side effects in RL agents.

Understanding Specification Gaming in AI Systems

2021

arXiv

Introduces a framework for understanding specification gaming.

Specification Gaming: Real-World Examples

2021

AI Research Conference

Presents a database of real-world examples of specification gaming.

Goal Misgeneralization in AI

2020

NeurIPS

Analyzes the phenomenon of goal misgeneralization in AI systems.

Known Associates

Organizational Affiliations

Current

Google DeepMind

AI Safety Researcher

2020-Present

Future of Life Institute

Co-founder

2015-Present

Former

University of Toronto

Teaching Assistant

2016-2018

Source Material

Dossier last updated: 2026-03-04