Rare neural correlations implement robotic conditioning with reward delays and disturbances

Soltoggio, Andrea; Lemme, Andre; Reinhart, Felix; Steil, Jochen J.

Soltoggio A, Lemme A, Reinhart F, Steil JJ (2013)
Frontiers in Neurorobotics 7: 6.

Journal article | Published | English

DOI

https://doi.org/10.3389/fnbot.2013.00006

URN

urn:nbn:de:0070-pub-25478952

Author

Soltoggio, Andrea (UniBi); Lemme, Andre (UniBi); Reinhart, Felix (UniBi); Steil, Jochen J. (UniBi)

Department

Research Institute for Cognition and Robotics
Center of Excellence - Cognitive Interaction Technology CITEC
Technische Fakultät

Abstract / Note

Neural conditioning associates cues and actions with following rewards. The environments in which robots operate, however, are pervaded by a variety of disturbing stimuli and uncertain timing. In particular, variable reward delays make it difficult to reconstruct which previous actions are responsible for following rewards. Such an uncertainty is handled by biological neural networks, but represents a challenge for computational models, suggesting the lack of a satisfactory theory for robotic neural conditioning. The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions. Rare correlations are functional in selecting sparse synapses to be eligible for later weight updates if a reward occurs. The repetition of this process singles out the associating and reward-triggering pathways, and thereby copes with distal rewards. The neural network displays macro-level classical and operant conditioning, which is demonstrated in an interactive real-life human-robot interaction. The proposed mechanism models realistic conditioning in humans and animals and implements similar behaviors in neuro-robotic platforms.
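The mechanism summarized in the abstract (rare correlations tag a sparse set of synapses as eligible, and a later reward updates only the tagged ones) can be illustrated with a toy sketch. This is not the authors' model: the threshold `theta`, the trace decay `decay`, the learning rate `lr`, and the two-by-two network are all assumptions chosen for illustration.

```python
def rare_correlations(pre, post, theta):
    """Tag synapse (i, j) with 1.0 only when post[i] * pre[j] exceeds theta.

    A high theta (an assumed value here) makes tagging a rare event.
    """
    return [[1.0 if post_i * pre_j > theta else 0.0 for pre_j in pre]
            for post_i in post]


def step(weights, traces, pre, post, reward, theta=0.8, decay=0.9, lr=0.1):
    """One time step: decay eligibility traces, add new tags, apply reward."""
    tags = rare_correlations(pre, post, theta)
    n_post, n_pre = len(post), len(pre)
    # Traces fade over time, so older correlations earn less credit.
    new_traces = [[decay * traces[i][j] + tags[i][j] for j in range(n_pre)]
                  for i in range(n_post)]
    # Weights change only where a trace is still alive and a reward arrives.
    new_weights = [[weights[i][j] + lr * reward * new_traces[i][j]
                    for j in range(n_pre)]
                   for i in range(n_post)]
    return new_weights, new_traces


def demo():
    """A single strong correlation, then a reward four steps later."""
    weights = [[0.0, 0.0], [0.0, 0.0]]
    traces = [[0.0, 0.0], [0.0, 0.0]]
    # Only input 0 and output 1 are strongly active together.
    weights, traces = step(weights, traces, [1.0, 0.1], [0.1, 1.0], reward=0.0)
    silent = [0.0, 0.0]
    for _ in range(3):  # delay with no activity and no reward
        weights, traces = step(weights, traces, silent, silent, reward=0.0)
    # The delayed reward reaches only the synapse that was tagged earlier.
    weights, traces = step(weights, traces, silent, silent, reward=1.0)
    return weights


if __name__ == "__main__":
    print(demo())
```

In this sketch only the synapse from input 0 to output 1 changes, because it is the only one whose trace survives until the reward; weak, common correlations never cross the threshold and therefore never become eligible.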

Keywords

CoR-Lab Publication

Publication year

2013

Journal title

Frontiers in Neurorobotics

Volume

7

Page(s)

6

ISSN

1662-5218

eISSN

1662-5218

Funding information

Open-access publication costs were funded by the Deutsche Forschungsgemeinschaft and Universität Bielefeld.

Page URI

https://pub.uni-bielefeld.de/record/2547895

Cite

Soltoggio A, Lemme A, Reinhart F, Steil JJ. Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics. 2013;7:6.

Soltoggio, A., Lemme, A., Reinhart, F., & Steil, J. J. (2013). Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics, 7, 6. doi:10.3389/fnbot.2013.00006

Soltoggio, Andrea, Lemme, Andre, Reinhart, Felix, and Steil, Jochen J. 2013. “Rare neural correlations implement robotic conditioning with reward delays and disturbances”. Frontiers in Neurorobotics 7: 6.

Soltoggio, A., Lemme, A., Reinhart, F., and Steil, J. J. (2013). Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics 7, 6.

Soltoggio, A., et al., 2013. Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics, 7, p 6.

A. Soltoggio, et al., “Rare neural correlations implement robotic conditioning with reward delays and disturbances”, Frontiers in Neurorobotics, vol. 7, 2013, pp. 6.

Soltoggio, A., Lemme, A., Reinhart, F., Steil, J.J.: Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics. 7, 6 (2013).

Soltoggio, Andrea, Lemme, Andre, Reinhart, Felix, and Steil, Jochen J. “Rare neural correlations implement robotic conditioning with reward delays and disturbances”. Frontiers in Neurorobotics 7 (2013): 6.

All files are available under the following license(s):

Copyright Statement:

This object is protected by copyright and/or related rights. [...]

Full text(s)

Name

soltoggio.fnbot-07-00006.pdf

Access Level

Open Access

Last uploaded

2019-09-06T09:18:08Z

MD5 checksum

9670861895a54af2b5c68913593f242e

Data provided by European Bioinformatics Institute (EBI)

9 citations in Europe PMC

Data provided by Europe PubMed Central.

Hebbian learning for online prediction, neural recall and classical conditioning of anthropomimetic robot arm motions.
Feldotto B, Walter F, Röhrbein F, Knoll A., Bioinspir Biomim 13(6), 2018
PMID: 30221625

RM-SORN: a reward-modulated self-organizing recurrent neural network.
Aswolinskiy W, Pipa G., Front Comput Neurosci 9, 2015
PMID: 25852533

Self-organizing neural integration of pose-motion features for human action recognition.
Parisi GI, Weber C, Wermter S., Front Neurorobot 9, 2015
PMID: 26106323

Learning touch preferences with a tactile robot using dopamine modulated STDP in a model of insular cortex.
Chou TS, Bucci LD, Krichmar JL., Front Neurorobot 9, 2015
PMID: 26257639

Reward-Modulated Hebbian Plasticity as Leverage for Partially Embodied Control in Compliant Robotics.
Burms J, Caluwaerts K, Dambre J., Front Neurorobot 9, 2015
PMID: 26347645

Editorial: Neural plasticity for rich and uncertain robotic information streams.
Soltoggio A, van der Velde F., Front Neurorobot 9, 2015
PMID: 26578947

Operant conditioning: a minimal components requirement in artificial spiking neurons designed for bio-inspired robot's controller.
Cyr A, Boukadoum M, Thériault F., Front Neurorobot 8, 2014
PMID: 25120464

Neuromodulatory adaptive combination of correlation-based learning in cerebellum and reward-based learning in basal ganglia for goal-directed behavior control.
Dasgupta S, Wörgötter F, Manoonpong P., Front Neural Circuits 8, 2014
PMID: 25389391

Value and reward based learning in neurorobots.
Krichmar JL, Röhrbein F., Front Neurorobot 7, 2013
PMID: 24062683


PMID: 23565092