Rare neural correlations implement robotic conditioning with reward delays and disturbances

Soltoggio A, Lemme A, Reinhart F, Steil JJ (2013)
Frontiers in Neurorobotics 7: 6.

Journal article | Published | English
Full text available for this record
Abstract / Note
Neural conditioning associates cues and actions with following rewards. The environments in which robots operate, however, are pervaded by a variety of disturbing stimuli and uncertain timing. In particular, variable reward delays make it difficult to reconstruct which previous actions are responsible for following rewards. Such an uncertainty is handled by biological neural networks, but represents a challenge for computational models, suggesting the lack of a satisfactory theory for robotic neural conditioning. The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions. Rare correlations are functional in selecting sparse synapses to be eligible for later weight updates if a reward occurs. The repetition of this process singles out the associating and reward-triggering pathways, and thereby copes with distal rewards. The neural network displays macro-level classical and operant conditioning, which is demonstrated in an interactive real-life human-robot interaction. The proposed mechanism models realistic conditioning in humans and animals and implements similar behaviors in neuro-robotic platforms.
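The mechanism summarized in the abstract (rare correlations tag a few synapses as eligible, and a later reward updates only those) can be illustrated with a small toy simulation. This is a minimal sketch under stated assumptions, not the authors' model: the function name `simulate_conditioning`, the uniform noise standing in for disturbances, the threshold `theta`, the exponential trace decay, and all parameter values are hypothetical choices made for illustration only.

```python
import random


def simulate_conditioning(n_synapses=20, causal=3, steps=5000,
                          theta=0.99, trace_decay=0.9, learning_rate=0.1,
                          cue_prob=0.02, delay_range=(5, 15), seed=0):
    """Toy model (assumed, not from the paper): rare correlations tag
    synapses as eligible; delayed rewards update the tagged synapses."""
    rng = random.Random(seed)
    weights = [0.0] * n_synapses
    traces = [0.0] * n_synapses      # eligibility of each synapse
    reward_times = set()             # steps at which a delayed reward arrives

    for t in range(steps):
        # Disturbances: every synapse sees a random correlation in [0, 1).
        correlations = [rng.random() for _ in range(n_synapses)]

        # Occasionally the causal pathway is strongly co-active, and a
        # reward is scheduled after a variable delay.
        if rng.random() < cue_prob:
            correlations[causal] = 1.0
            reward_times.add(t + rng.randint(*delay_range))

        for i in range(n_synapses):
            traces[i] *= trace_decay          # eligibility fades over time
            if correlations[i] > theta:       # only rare correlations tag
                traces[i] = 1.0

        if t in reward_times:
            # The reward reinforces whichever synapses are still eligible.
            for i in range(n_synapses):
                weights[i] += learning_rate * traces[i]

    return weights


weights = simulate_conditioning()
strongest = max(range(len(weights)), key=weights.__getitem__)
print("strongest synapse:", strongest)
```

In this sketch the causal synapse is always tagged when its cue occurs, so its trace at reward time is at least `trace_decay ** delay`, whereas a noise synapse is tagged on only about 1% of steps and its trace is usually close to zero when a reward arrives. Lowering `theta` makes tagging common and blurs that difference, which is the intuition behind keeping correlations rare.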
Journal title
Frontiers in Neurorobotics
Volume
7
Page
6
Funding information
Article Processing Charge funded by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.

Cite

Soltoggio A, Lemme A, Reinhart F, Steil JJ. Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics. 2013;7:6.
Soltoggio, A., Lemme, A., Reinhart, F., & Steil, J. J. (2013). Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics, 7, 6. doi:10.3389/fnbot.2013.00006
Soltoggio, A., Lemme, A., Reinhart, F., and Steil, J. J. (2013). Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics 7, 6.
Soltoggio, A., et al., 2013. Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics, 7, p 6.
A. Soltoggio, et al., “Rare neural correlations implement robotic conditioning with reward delays and disturbances”, Frontiers in Neurorobotics, vol. 7, 2013, pp. 6.
Soltoggio, A., Lemme, A., Reinhart, F., Steil, J.J.: Rare neural correlations implement robotic conditioning with reward delays and disturbances. Frontiers in Neurorobotics. 7, 6 (2013).
Soltoggio, Andrea, Lemme, Andre, Reinhart, Felix, and Steil, Jochen J. “Rare neural correlations implement robotic conditioning with reward delays and disturbances”. Frontiers in Neurorobotics 7 (2013): 6.
All files are available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2013-04-16T11:47:40Z

9 citations in Europe PMC

Data provided by Europe PubMed Central.

RM-SORN: a reward-modulated self-organizing recurrent neural network.
Aswolinskiy W, Pipa G., Front Comput Neurosci 9, 2015
PMID: 25852533
Self-organizing neural integration of pose-motion features for human action recognition.
Parisi GI, Weber C, Wermter S., Front Neurorobot 9, 2015
PMID: 26106323
Editorial: Neural plasticity for rich and uncertain robotic information streams.
Soltoggio A, van der Velde F., Front Neurorobot 9, 2015
PMID: 26578947
Value and reward based learning in neurorobots.
Krichmar JL, Röhrbein F., Front Neurorobot 7, 2013
PMID: 24062683


Sources

PMID: 23565092