Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers
Kenneweg P, Schulz A, Schroeder S, Hammer B (2022)
In: Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings. Yin H, Camacho D, Tino P (Eds); Lecture Notes in Computer Science, 13756. Cham: Springer International Publishing: 252-261.
Conference paper
| Published | English
Authors
Kenneweg, Philip;
Schulz, Alexander;
Schroeder, Sarah;
Hammer, Barbara
Editors
Yin, Hujun;
Camacho, David;
Tino, Peter
Abstract / Note
Pretraining language models on large text corpora is a common practice in natural language processing. Fine-tuning of these models is then performed to achieve the best results on a variety of tasks. In this paper, we investigate the problem of catastrophic forgetting in transformer neural networks and question the common practice of fine-tuning with a flat learning rate for the entire network in this context. We perform a hyperparameter optimization process to find learning rate distributions that are better than a flat learning rate. We combine the learning rate distributions thus found and show that they generalize to better performance with respect to the problem of catastrophic forgetting. We validate these learning rate distributions with a variety of NLP benchmarks from the GLUE dataset. The source code is open-source and free software, available at https://github.com/TheMody/NAS-CatastrophicForgetting.
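To illustrate the difference between a flat learning rate and a learning rate distribution, the following is a minimal sketch in plain Python, not the authors' implementation. It sorts parameter names into groups by name prefix and gives each group its own learning rate. The function `layerwise_lr_groups`, the parameter names, the prefixes, and all numeric rates are hypothetical and chosen only for illustration; they are not the distributions reported in the paper.

```python
from typing import Dict, List


def layerwise_lr_groups(
    param_names: List[str],
    group_lrs: Dict[str, float],
    default_lr: float,
) -> List[dict]:
    """Group parameter names by prefix and attach one learning rate per group.

    Names that match no prefix fall back to `default_lr`, which plays the
    role of the flat learning rate.
    """
    grouped: Dict[str, List[str]] = {prefix: [] for prefix in group_lrs}
    unmatched: List[str] = []
    for name in param_names:
        for prefix in group_lrs:
            if name.startswith(prefix):
                grouped[prefix].append(name)
                break
        else:
            # No prefix matched: keep the parameter on the default rate.
            unmatched.append(name)

    groups = [
        {"params": names, "lr": group_lrs[prefix]}
        for prefix, names in grouped.items()
        if names
    ]
    if unmatched:
        groups.append({"params": unmatched, "lr": default_lr})
    return groups


# Hypothetical parameter names for a two-layer transformer encoder.
param_names = [
    "embeddings.word.weight",
    "encoder.layer.0.attention.weight",
    "encoder.layer.1.attention.weight",
    "classifier.weight",
]
# Hypothetical distribution: smaller steps for layers closer to the input.
group_lrs = {
    "embeddings.": 1e-6,
    "encoder.layer.0.": 1e-5,
    "encoder.layer.1.": 2e-5,
}
groups = layerwise_lr_groups(param_names, group_lrs, default_lr=5e-5)
for group in groups:
    print(group["lr"], group["params"])
```

The list-of-dicts shape with `params` and `lr` keys mirrors the per-parameter options that PyTorch optimizers accept, where `params` would hold the parameter tensors rather than their names. A flat learning rate corresponds to the special case in which `group_lrs` is empty and every parameter uses `default_lr`.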
Year of publication
2022
Title of conference proceedings
Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings
Series or journal title
Lecture Notes in Computer Science
Volume
13756
Page(s)
252-261
Conference
23rd International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2022)
Conference location
Manchester, UK
Conference date
2022-11-24 – 2022-11-26
ISBN
978-3-031-21752-4
eISBN
978-3-031-21753-1
Page URI
https://pub.uni-bielefeld.de/record/2967096
Cite
Kenneweg P, Schulz A, Schroeder S, Hammer B. Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers. In: Yin H, Camacho D, Tino P, eds. Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings. Lecture Notes in Computer Science. Vol 13756. Cham: Springer International Publishing; 2022: 252-261.
Kenneweg, P., Schulz, A., Schroeder, S., & Hammer, B. (2022). Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers. In H. Yin, D. Camacho, & P. Tino (Eds.), Lecture Notes in Computer Science: Vol. 13756. Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings (pp. 252-261). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-031-21753-1_25
Kenneweg, Philip, Schulz, Alexander, Schroeder, Sarah, and Hammer, Barbara. 2022. “Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers”. In Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings, ed. Hujun Yin, David Camacho, and Peter Tino, 13756:252-261. Lecture Notes in Computer Science. Cham: Springer International Publishing.
Kenneweg, P., Schulz, A., Schroeder, S., and Hammer, B. (2022). “Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers” in Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings, Yin, H., Camacho, D., and Tino, P. eds. Lecture Notes in Computer Science, vol. 13756, (Cham: Springer International Publishing), 252-261.
Kenneweg, P., et al., 2022. Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers. In H. Yin, D. Camacho, & P. Tino, eds. Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings. Lecture Notes in Computer Science. no. 13756. Cham: Springer International Publishing, pp. 252-261.
P. Kenneweg, et al., “Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers”, Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings, H. Yin, D. Camacho, and P. Tino, eds., Lecture Notes in Computer Science, vol. 13756, Cham: Springer International Publishing, 2022, pp. 252-261.
Kenneweg, P., Schulz, A., Schroeder, S., Hammer, B.: Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers. In: Yin, H., Camacho, D., and Tino, P. (eds.) Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings. Lecture Notes in Computer Science. 13756, p. 252-261. Springer International Publishing, Cham (2022).
Kenneweg, Philip, Schulz, Alexander, Schroeder, Sarah, and Hammer, Barbara. “Intelligent Learning Rate Distribution to Reduce Catastrophic Forgetting in Transformers”. Intelligent Data Engineering and Automated Learning – IDEAL 2022. 23rd International Conference, IDEAL 2022, Manchester, UK, November 24–26, 2022, Proceedings. Ed. Hujun Yin, David Camacho, and Peter Tino. Cham: Springer International Publishing, 2022. Vol. 13756. Lecture Notes in Computer Science. 252-261.