Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding

Tan, Xiaoyu; Qu, Chao; Xiong, Junwu; Zhang, James; Qiu, Xihe; Jin, Yaochu

Tan X, Qu C, Xiong J, Zhang J, Qiu X, Jin Y (2024)
IEEE Transactions on Emerging Topics in Computational Intelligence: 1-13.

Journal article | Published | English

Download

No files have been uploaded. Publication record only.

DOI

https://doi.org/10.1109/TETCI.2024.3369636

Author

Tan, Xiaoyu; Qu, Chao; Xiong, Junwu; Zhang, James; Qiu, Xihe; Jin, Yaochu (UniBi)

Department

Faculty of Technology > Research Group Nature Inspired Computing and Engineering

Abstract / Note

Model-based reinforcement learning (MBRL) has shown its advantages in sample efficiency over model-free reinforcement learning (MFRL) by leveraging control-based domain knowledge. Despite the impressive results it achieves, MBRL is still outperformed by MFRL due to the lack of unlimited interactions with the environment. While imaginary data can be generated by imagining the trajectories of future states, a trade-off between the usage of data generation and the influence of model bias remains to be resolved. In this paper, we propose a simple and elegant off-policy model-based deep reinforcement learning algorithm with a model embedded in the framework of probabilistic reinforcement learning, called MEMB. To balance the sample-efficiency and model bias, we exploit both real and imaginary data in training. In particular, we embed the model in the policy update and learn value functions from the real data set. We also provide a theoretical analysis of MEMB with the Lipschitz continuity assumption on the model and policy, proving the reliability of the short-term imaginary rollout. Finally, we evaluate MEMB on several benchmarks and demonstrate that our algorithm can achieve state-of-the-art performance.
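
As a rough illustration of why short imaginary rollouts are the reliable ones under a Lipschitz assumption, the sketch below computes a generic compounding-error bound. It is not taken from the paper: the function name `rollout_error_bound`, the one-step model error `eps`, and the Lipschitz constant `lipschitz` are assumptions introduced here, and the recursion e_{t+1} <= K * e_t + eps is the standard textbook argument for a deterministic learned model, not the MEMB analysis itself.

```python
def rollout_error_bound(eps: float, lipschitz: float, horizon: int) -> float:
    """Generic bound on state error after `horizon` imagined steps.

    Assumes (hypothetically) a learned model that is `lipschitz`-Lipschitz
    and whose one-step prediction error is at most `eps`. Starting from an
    exact state (error 0), each step satisfies e_next <= lipschitz * e + eps.
    """
    error = 0.0
    for _ in range(horizon):
        # Existing error is stretched by the Lipschitz constant,
        # then one more step of model error is added on top.
        error = lipschitz * error + eps
    return error


if __name__ == "__main__":
    # Compare a short and a long rollout for the same assumed model quality.
    for h in (1, 3, 10):
        print(h, rollout_error_bound(eps=0.05, lipschitz=1.5, horizon=h))
```

When the Lipschitz constant exceeds 1 the bound grows geometrically with the horizon, which is consistent with the abstract's emphasis on short-term rollouts.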

Keywords

Model-based; reinforcement learning; deep reinforcement learning; machine learning

Publication year

2024

Journal title

IEEE Transactions on Emerging Topics in Computational Intelligence

Page(s)

1-13

eISSN

2471-285X

Page URI

https://pub.uni-bielefeld.de/record/2987810

Cite

Tan X, Qu C, Xiong J, Zhang J, Qiu X, Jin Y. Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence. 2024:1-13.

Tan, X., Qu, C., Xiong, J., Zhang, J., Qiu, X., & Jin, Y. (2024). Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence, 1-13. https://doi.org/10.1109/TETCI.2024.3369636

Tan, Xiaoyu, Qu, Chao, Xiong, Junwu, Zhang, James, Qiu, Xihe, and Jin, Yaochu. 2024. “Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding”. IEEE Transactions on Emerging Topics in Computational Intelligence, 1-13.

Tan, X., Qu, C., Xiong, J., Zhang, J., Qiu, X., and Jin, Y. (2024). Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence, 1-13.

Tan, X., et al., 2024. Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence, p 1-13.

X. Tan, et al., “Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding”, IEEE Transactions on Emerging Topics in Computational Intelligence, 2024, pp. 1-13.

Tan, X., Qu, C., Xiong, J., Zhang, J., Qiu, X., Jin, Y.: Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding. IEEE Transactions on Emerging Topics in Computational Intelligence. 1-13 (2024).

Tan, Xiaoyu, Qu, Chao, Xiong, Junwu, Zhang, James, Qiu, Xihe, and Jin, Yaochu. “Model-Based Off-Policy Deep Reinforcement Learning With Model-Embedding”. IEEE Transactions on Emerging Topics in Computational Intelligence (2024): 1-13.
