Dopamine, reward learning, and active inference

Full description

Temporal difference learning models propose phasic dopamine signaling encodes reward prediction errors that drive learning. This is supported by studies where optogenetic stimulation of dopamine neurons can stand in lieu of actual reward. Nevertheless, a large body of data also shows that dopamine is not necessary for learning, and that dopamine depletion primarily affects task performance. We offer a resolution to this paradox based on an hypothesis that dopamine encodes the precision of beliefs about alternative actions, and thus controls the outcome-sensitivity of behavior. We extend an active inference scheme for solving Markov decision processes to include learning, and show that simulated dopamine dynamics strongly resemble those actually observed during instrumental conditioning. Furthermore, simulated dopamine depletion impairs performance but spares learning, while simulated excitation of dopamine neurons drives reward learning, through aberrant inference about outcome states. Our formal approach provides a novel and parsimonious reconciliation of apparently divergent experimental findings.
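
The temporal difference account summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the update rule below is a standard single-cue TD(0)/Rescorla–Wagner update, and the function name and parameter values are illustrative assumptions.

```python
import random

def td_learning(episodes=2000, alpha=0.1, reward_prob=0.8):
    """Minimal single-cue TD(0) sketch (illustrative, not the paper's scheme).

    Under the temporal difference account, the reward prediction error
    (delta) is the quantity that phasic dopamine is proposed to encode.
    """
    value = 0.0  # learned value of the cue
    for _ in range(episodes):
        reward = 1.0 if random.random() < reward_prob else 0.0
        delta = reward - value   # reward prediction error
        value += alpha * delta   # error-driven update
    return value
```

With these settings the learned value fluctuates around the expected reward (~0.8 here); the prediction error shrinks on average as learning proceeds, which is the signature the TD account attributes to phasic dopamine.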

Bibliographic Details
Main Authors: FitzGerald, Thomas H. B.; Dolan, Raymond J.; Friston, Karl
Format: Online
Language: English
Published: Frontiers Media S.A., 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631836/
Published Online: 2015-11-04
DOI: http://dx.doi.org/10.3389/fncom.2015.00136
PMID: 26581305

Copyright © 2015 FitzGerald, Dolan and Friston (http://creativecommons.org/licenses/by/4.0/). This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
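
The precision hypothesis in the abstract — that dopamine scales the outcome-sensitivity of behavior rather than carrying the learning signal — can be sketched as a precision-weighted softmax over action values. This is a simplified illustration under assumed values, not the paper's active inference scheme; the function name and numbers are hypothetical.

```python
import math

def action_probabilities(q_values, precision):
    """Softmax over expected outcomes with a dopamine-like precision term.

    High precision makes behavior sharply outcome-sensitive; low precision
    (simulating dopamine depletion) makes choice near-random while leaving
    the underlying values -- and hence learning -- intact.
    """
    exps = [math.exp(precision * q) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

values = [1.0, 0.2]  # learned outcome values (unchanged by precision)
print(action_probabilities(values, precision=4.0))  # decisive choice
print(action_probabilities(values, precision=0.2))  # near-uniform choice
```

The same learned values produce either decisive or indifferent behavior depending on precision alone, which is how the hypothesis separates performance (dopamine-dependent) from learning (dopamine-independent).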