Model Interpretability and Rationale Extraction by Input Mask Optimization

Brinner MF, Zarrieß S (2023)
In: Findings of the Association for Computational Linguistics: ACL 2023. Rogers A, Boyd-Graber J, Okazaki N (Eds); Toronto, Canada: Association for Computational Linguistics: 13722-13744.

Conference paper | English
 
Download
No files have been uploaded. Publication record only!
Editor(s)
Rogers, Anna; Boyd-Graber, Jordan; Okazaki, Naoaki
Abstract / Note
Concurrent with the rapid progress in neural network-based models in NLP, the need for creating explanations for the predictions of these black-box models has risen steadily. Yet, especially for complex inputs like texts or images, existing interpretability methods still struggle with deriving easily interpretable explanations that also accurately represent the basis for the model's decision. To this end, we propose a new, model-agnostic method to generate extractive explanations for predictions made by neural networks, that is based on masking parts of the input which the model does not consider to be indicative of the respective class. The masking is done using gradient-based optimization combined with a new regularization scheme that enforces sufficiency, comprehensiveness, and compactness of the generated explanation. Our method achieves state-of-the-art results in a challenging paragraph-level rationale extraction task, showing that this task can be performed without training a specialized model. We further apply our method to image inputs and obtain high-quality explanations for image classifications, which indicates that the objectives for optimizing explanation masks in text generalize to inputs of other modalities.
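Illustrative note (not part of the publication record): the abstract describes optimizing an input mask by gradient descent under sufficiency, comprehensiveness, and compactness objectives. The following is only a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy logistic "model" with fixed per-token contributions, the function name `optimize_mask`, the particular loss terms, and a plain sparsity penalty standing in for the paper's compactness regularizer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def optimize_mask(contributions, bias, sparsity_weight=0.05, lr=0.5, steps=500):
    """Toy mask optimization for a logistic classifier (hypothetical setup).

    contributions[i] is the amount token i adds to the class logit when it is
    fully visible. The mask m = sigmoid(theta) is optimized to minimize
        -log p(class | kept tokens)           (sufficiency)
        -log(1 - p(class | removed tokens))   (comprehensiveness)
        + sparsity_weight * sum(m)            (stand-in for compactness)
    """
    theta = [0.0] * len(contributions)  # mask logits; sigmoid(0) = 0.5
    for _ in range(steps):
        mask = [sigmoid(t) for t in theta]
        # Class logit when only the masked-in part of the input is visible.
        kept = bias + sum(c * m for c, m in zip(contributions, mask))
        # Class logit when only the complement of the mask is visible.
        removed = bias + sum(c * (1.0 - m) for c, m in zip(contributions, mask))
        p_kept, p_removed = sigmoid(kept), sigmoid(removed)
        for i, (c, m) in enumerate(zip(contributions, mask)):
            # Analytic gradient of the loss with respect to mask entry m_i.
            grad_m = -(1.0 - p_kept) * c - p_removed * c + sparsity_weight
            # Chain rule through the sigmoid parameterization of the mask.
            theta[i] -= lr * grad_m * m * (1.0 - m)
    return [sigmoid(t) for t in theta]


# Hypothetical example: tokens 2 and 3 carry the evidence for the class.
contributions = [0.0, -0.2, 3.0, 2.5, 0.0, -0.1]
mask = optimize_mask(contributions, bias=-2.0)
rationale = [i for i, m in enumerate(mask) if m > 0.5]
print(rationale)
```

In this toy setting, tokens that raise the class logit are pushed toward a mask value of 1 and all others toward 0, and thresholding the mask yields an extractive rationale. A real application would replace the hand-derived gradient with automatic differentiation through the actual network.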
Year of publication
2023
Title of the conference proceedings
Findings of the Association for Computational Linguistics: ACL 2023
Page(s)
13722-13744
Page URI
https://pub.uni-bielefeld.de/record/2984123

Cite

Brinner MF, Zarrieß S. Model Interpretability and Rationale Extraction by Input Mask Optimization. In: Rogers A, Boyd-Graber J, Okazaki N, eds. Findings of the Association for Computational Linguistics: ACL 2023. Toronto, Canada: Association for Computational Linguistics; 2023: 13722-13744.
Brinner, M. F., & Zarrieß, S. (2023). Model Interpretability and Rationale Extraction by Input Mask Optimization. In A. Rogers, J. Boyd-Graber, & N. Okazaki (Eds.), Findings of the Association for Computational Linguistics: ACL 2023 (pp. 13722-13744). Toronto, Canada: Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.867
Brinner, Marc Felix, and Zarrieß, Sina. 2023. “Model Interpretability and Rationale Extraction by Input Mask Optimization”. In Findings of the Association for Computational Linguistics: ACL 2023, ed. Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, 13722-13744. Toronto, Canada: Association for Computational Linguistics.
Brinner, M. F., and Zarrieß, S. (2023). “Model Interpretability and Rationale Extraction by Input Mask Optimization” in Findings of the Association for Computational Linguistics: ACL 2023, Rogers, A., Boyd-Graber, J., and Okazaki, N. eds. (Toronto, Canada: Association for Computational Linguistics), 13722-13744.
Brinner, M.F., & Zarrieß, S., 2023. Model Interpretability and Rationale Extraction by Input Mask Optimization. In A. Rogers, J. Boyd-Graber, & N. Okazaki, eds. Findings of the Association for Computational Linguistics: ACL 2023. Toronto, Canada: Association for Computational Linguistics, pp. 13722-13744.
M.F. Brinner and S. Zarrieß, “Model Interpretability and Rationale Extraction by Input Mask Optimization”, Findings of the Association for Computational Linguistics: ACL 2023, A. Rogers, J. Boyd-Graber, and N. Okazaki, eds., Toronto, Canada: Association for Computational Linguistics, 2023, pp. 13722-13744.
Brinner, M.F., Zarrieß, S.: Model Interpretability and Rationale Extraction by Input Mask Optimization. In: Rogers, A., Boyd-Graber, J., and Okazaki, N. (eds.) Findings of the Association for Computational Linguistics: ACL 2023. p. 13722-13744. Association for Computational Linguistics, Toronto, Canada (2023).
Brinner, Marc Felix, and Zarrieß, Sina. “Model Interpretability and Rationale Extraction by Input Mask Optimization”. Findings of the Association for Computational Linguistics: ACL 2023. Ed. Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki. Toronto, Canada: Association for Computational Linguistics, 2023. 13722-13744.

Link(s) to full text(s)
Access Level
OA Open Access
