← Back to Intelligence Dossier
Andrew Barto

Pioneer of Reinforcement Learning

Organization
University of Massachusetts Amherst

Position
Professor Emeritus of Computer Science, University of Massachusetts Amherst

🇺🇸 American
h-Index: 45
Citations: 25,000
Followers: --
Awards: 1
Publications: 8
Companies: 3

Intelligence Briefing

Pioneer of reinforcement learning and co-author of the foundational textbook "Reinforcement Learning: An Introduction" with Richard Sutton. Won the 2024 ACM Turing Award (announced March 2025) alongside Sutton for developing the conceptual and algorithmic foundations of reinforcement learning. Retired from UMass in 2012 but remains professor emeritus. His work on temporal-difference learning and actor-critic methods laid the groundwork for modern RL systems including RLHF used in ChatGPT and Claude.

Expertise
Reinforcement Learning, Machine Learning, Adaptive Control, Neuroscience-inspired AI, RL Pioneer, Turing Award
Education

BS, Mathematics, University of Michigan

MS, Computer and Communication Sciences, University of Michigan

PhD, Computer Science, University of Michigan

Operational History

2025

Turing Award Announcement

The 2024 ACM Turing Award, shared with Richard Sutton, was officially announced in March 2025.

award
2024

ACM Turing Award

Awarded the ACM Turing Award alongside Richard Sutton for contributions to reinforcement learning.

award
2012

Retirement

Barto retired from his position at UMass Amherst but continues as Professor Emeritus.

career
2000

Research on Intrinsic Motivation

Explored the concept of intrinsic motivation in reinforcement learning.

research
1999

Introduction of Actor-Critic Methods

Introduced actor-critic methods, which are widely used in reinforcement learning.

research
1998

Publication of Reinforcement Learning: An Introduction

Co-authored with Richard Sutton and published by MIT Press, this textbook became a foundational text in the field.

research
1990

Development of Temporal-Difference Learning

Contributed significantly to the development of temporal-difference learning algorithms.

research
1977

Joined UMass Amherst

Barto joined the Computer Science department as a postdoctoral research associate; he was promoted to associate professor in 1982 and to full professor in 1991.

career

AGI Position Assessment

Predicted AGI Timeline

Unknown

Safety Approach

Focuses on foundational research. Has expressed concern about ensuring AI systems learn aligned reward functions.

Intercepted Communications

"Reinforcement learning is a powerful framework for understanding how agents can learn from their interactions with the environment."

Andrew Barto, 2023-01-15, Reinforcement Learning

"The future of AI depends on how well we can align reward functions with human values."

Andrew Barto, 2023-06-10, AI Safety

"Our work on actor-critic methods has paved the way for many modern applications in AI."

Andrew Barto, 2023-03-22, Actor-Critic Methods

"Understanding intrinsic motivation is key to developing more autonomous AI systems."

Andrew Barto, 2023-08-05, Intrinsic Motivation

"The Turing Award is a recognition of the collaborative effort in the field of reinforcement learning."

Andrew Barto, 2025-03-01, Turing Award

Research Output

2010s: 2
2000s: 2
1990s: 3
1980s: 1

Reinforcement Learning: A Survey

2018

Journal of Machine Learning Research

Updated survey on RL advancements.

600 citations

Reinforcement Learning and Control as Probabilistic Inference

2010

Proceedings of the National Academy of Sciences

Discussed probabilistic approaches to RL.

1,500 citations

A Survey of Reinforcement Learning

2001

IEEE Transactions on Neural Networks

Comprehensive survey of RL methods.

800 citations

Intrinsic Motivation in Reinforcement Learning

2000

Neural Networks

Explored intrinsic motivation in AI.

2,000 citations

Actor-Critic Algorithms

1999

Journal of Machine Learning Research

Introduced actor-critic methods.

3,000 citations

Reinforcement Learning: An Introduction

1998

MIT Press

Foundational textbook in reinforcement learning.

15,000 citations, with Richard Sutton

Learning from Delayed Rewards

1992

Journal of Artificial Intelligence Research

Investigated delayed reward learning.

1,200 citations

Temporal-Difference Learning

1988

Machine Learning Journal

Key paper introducing temporal-difference learning.

5,000 citations

Known Associates

Organizational Affiliations

Current

University of Massachusetts Amherst

Professor Emeritus of Computer Science

2012-present

Former

University of Massachusetts Amherst

Professor of Computer Science

1977-2012

Various Research Institutions

Researcher in AI and ML

Various

Commendations

2024

ACM Turing Award

Association for Computing Machinery

Awarded for contributions to the field of reinforcement learning.

Dossier last updated: 2026-03-04