research-article

Brain-inspired meta-reinforcement learning cognitive control in conflictual inhibition decision-making task for artificial agents

Published: 01 October 2022

Abstract

Conflicting cues and unexpected changes in real-world human scenarios may disrupt task execution by artificial agents and degrade their performance. Meta-learning applied to reinforcement learning may improve the design of control algorithms: an outer learning system progressively adjusts the operation of an inner learning system, with practical benefits for the learning schema. Here, we developed a brain-inspired meta-learning framework for inhibition cognitive control that (i) exploits the meta-learning principles in the neuromodulation theory proposed by Doya, (ii) relies on a well-established neural architecture of distributed learning systems in the human brain, and (iii) proposes optimization rules for the meta-learning hyperparameters that mimic the dynamics of the major neurotransmitters in the brain. We tested an artificial agent's ability to inhibit the action command in two well-known tasks described in the literature: the NoGo and Stop-Signal paradigms. After a short learning phase, the artificial agent learned to react to the hold signal, and hence to successfully inhibit the motor command in both tasks, via the continuous adjustment of the learning hyperparameters. We found a significant increase in global accuracy and right inhibition, and a reduction in the latency required to cancel the action process, i.e., the stop-signal reaction time. We also performed a sensitivity analysis to evaluate the behavioral effects of the meta-parameters, focusing on the serotonergic modulation of dopamine release. We demonstrated that brain-inspired principles can be integrated into artificial agents to achieve more flexible behavior when conflicting inhibitory signals are present in the environment.
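The outer/inner arrangement described in the abstract can be illustrated with a toy sketch. The abstract does not give the paper's optimization rules, so nothing below reproduces the authors' model: the task encoding, the reward values, the function names (`softmax_choice`, `run_go_nogo`), and the rule that ties the inverse temperature `beta` to a running average of reward are all assumptions made for illustration. The inner learner is a tabular value learner driven by a reward prediction error; the outer loop adjusts one hyperparameter online, in the spirit of treating the exploration-exploitation balance as a modulated meta-parameter.

```python
import math
import random


def softmax_choice(q_values, beta, rng):
    """Sample an action index with probability proportional to exp(beta * Q)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for action, w in enumerate(weights):
        cumulative += w
        if threshold <= cumulative:
            return action
    return len(q_values) - 1


def run_go_nogo(n_trials=2000, p_nogo=0.3, seed=0):
    """Toy Go/NoGo task (hypothetical encoding, not the paper's setup).

    cue: 0 = go, 1 = nogo; action: 0 = withhold, 1 = respond.
    Reward is +1 for a correct choice and -1 otherwise (assumed values).
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]   # q[cue][action]
    alpha = 0.1                    # learning rate (fixed in this sketch)
    beta = 1.0                     # inverse temperature, adapted by the outer loop
    beta_min, beta_max = 1.0, 10.0
    avg_reward = 0.0               # slow running average used by the outer loop
    tau = 0.02                     # averaging rate for avg_reward
    correct = []
    for _ in range(n_trials):
        cue = 1 if rng.random() < p_nogo else 0
        action = softmax_choice(q[cue], beta, rng)
        is_correct = (action == 1) if cue == 0 else (action == 0)
        reward = 1.0 if is_correct else -1.0

        # Inner learner: one-step reward prediction error update.
        delta = reward - q[cue][action]
        q[cue][action] += alpha * delta

        # Outer learner (assumed heuristic): exploit more as average reward rises.
        avg_reward += tau * (reward - avg_reward)
        beta = beta_min + (beta_max - beta_min) * max(0.0, avg_reward)

        correct.append(is_correct)
    return q, beta, correct


if __name__ == "__main__":
    q, beta, correct = run_go_nogo()
    print("accuracy over last 200 trials:", sum(correct[-200:]) / 200)
    print("final inverse temperature:", round(beta, 2))
```

With rewards this simple, the agent starts out choosing almost at random and becomes close to deterministic once withholding on the nogo cue is reliably rewarded. A Stop-Signal variant would additionally need a delay between the go cue and the hold signal, which this sketch omits.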

Cited By

  • (2023) Cross-Domain Feature learning and data augmentation for few-shot proxy development in oil industry, Applied Soft Computing 149:PA, doi:10.1016/j.asoc.2023.110972. Online publication date: 1-Dec-2023

Published In

Neural Networks, Volume 154, Issue C
Oct 2022
576 pages

Publisher

Elsevier Science Ltd.
United Kingdom

Author Tags

1. Meta-learning
2. Brain-inspired modeling
3. Inhibition cognitive control
4. Basal ganglia
5. Prefrontal cortex
