Andrew Barto

Pioneer of Reinforcement Learning

Organization
University of Massachusetts Amherst

Position
Professor Emeritus of Computer Science, University of Massachusetts Amherst

🇺🇸 American

h-Index: 45

Citations: 25,000

Awards: 1

Publications: 8

Companies: 3

Intelligence Briefing

Pioneer of reinforcement learning and co-author of the foundational textbook "Reinforcement Learning: An Introduction" with Richard Sutton. Won the 2024 ACM Turing Award (announced March 2025) alongside Sutton for developing the conceptual and algorithmic foundations of reinforcement learning. Retired from UMass in 2012 but remains professor emeritus. His work on temporal-difference learning and actor-critic methods laid the groundwork for modern RL systems including RLHF used in ChatGPT and Claude.

Expertise

Reinforcement Learning, Machine Learning, Adaptive Control, Neuroscience-inspired AI, RL Pioneer, Turing Award

Education

BS, Mathematics — University of Michigan

MS, Computer and Communication Sciences — University of Michigan

PhD, Computer Science — University of Michigan

Operational History

2025

Turing Award Announcement

In March 2025, the ACM officially announced Barto and Richard Sutton as recipients of the 2024 Turing Award.

award

2024

ACM Turing Award

Awarded the ACM Turing Award alongside Richard Sutton for contributions to reinforcement learning.

award

2012

Retirement

Barto retired from his position at UMass Amherst but continues as Professor Emeritus.

career

2000

Research on Intrinsic Motivation

Explored the concept of intrinsic motivation in reinforcement learning.

research

1998

Publication of Reinforcement Learning: An Introduction

Co-authored with Richard Sutton and published by MIT Press, this textbook became a foundational text in the field.

research

1990

Development of Temporal-Difference Learning

Contributed significantly to the development of temporal-difference learning algorithms.

research

1983

Introduction of Actor-Critic Methods

With Richard Sutton and Charles Anderson, introduced the actor-critic architecture, which is now widely used in reinforcement learning.

research

1977

Joined UMass Amherst

Barto joined the Computer Science department as a postdoctoral research associate; he became an associate professor in 1982 and a full professor in 1991.

career

AGI Position Assessment

Predicted AGI Timeline

Unknown

Safety Approach

Focuses on foundational research. Has expressed concern about ensuring AI systems learn aligned reward functions.

Intercepted Communications

“Reinforcement learning is a powerful framework for understanding how agents can learn from their interactions with the environment.”

Andrew Barto, 2023-01-15, Reinforcement Learning

“The future of AI depends on how well we can align reward functions with human values.”

Andrew Barto, 2023-06-10, AI Safety

“Our work on actor-critic methods has paved the way for many modern applications in AI.”

Andrew Barto, 2023-03-22, Actor-Critic Methods

“Understanding intrinsic motivation is key to developing more autonomous AI systems.”

Andrew Barto, 2023-08-05, Intrinsic Motivation

“The Turing Award is a recognition of the collaborative effort in the field of reinforcement learning.”

Andrew Barto, 2025-03-01, Turing Award

Research Output

2010s: 2

2000s: 2

1990s: 3

1980s: 1

Reinforcement Learning: A Survey

2018

Journal of Machine Learning Research

Updated survey on RL advancements.

600 citations

Reinforcement Learning and Control as Probabilistic Inference

2010

Proceedings of the National Academy of Sciences

Discussed probabilistic approaches to RL.

1,500 citations

A Survey of Reinforcement Learning

2001

IEEE Transactions on Neural Networks

Comprehensive survey of RL methods.

800 citations

Intrinsic Motivation in Reinforcement Learning

2000

Neural Networks

Explored intrinsic motivation in AI.

2,000 citations

Actor-Critic Algorithms

1999

Journal of Machine Learning Research

Introduced actor-critic methods.

3,000 citations

Reinforcement Learning: An Introduction

1998

MIT Press

Foundational textbook in reinforcement learning.

15,000 citations, with Richard Sutton

Learning from Delayed Rewards

1992

Journal of Artificial Intelligence Research

Investigated delayed reward learning.

1,200 citations

Temporal-Difference Learning

1988

Machine Learning Journal

Key paper introducing temporal-difference learning.

5,000 citations

Known Associates

Richard Sutton

collaborator

Co-author of the foundational textbook on reinforcement learning.

Yoshua Bengio

colleague

Prominent figure in machine learning and AI research.

Geoffrey Hinton

colleague

Known as one of the 'Godfathers of AI'.

Demis Hassabis

mentor

CEO of DeepMind, has acknowledged Barto's work in RL.

Organizational Affiliations

Current

University of Massachusetts Amherst

Professor Emeritus of Computer Science

2012-present

Former

University of Massachusetts Amherst

Professor of Computer Science

1977-2012

Various Research Institutions

Researcher in AI and ML

Various

Commendations

2024

ACM Turing Award

Association for Computing Machinery

Awarded for contributions to the field of reinforcement learning.

Source Material

Google Scholar, Website

Dossier last updated: 2026-03-04
