Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach
Attari N, Schlangen D, Heckmann M, Wersing H, Zarrieß S (2022)
In: Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics: 110-120.
Conference paper
| Published | English
Download
No files have been uploaded. Publication record only.
Authors
Attari, Nazia (UniBi);
Schlangen, David;
Heckmann, Martin;
Wersing, Heiko;
Zarrieß, Sina (UniBi)
Institution
Abstract / Note
State-of-the-art image captioning models achieve very good performance in generating descriptions for instances of visual categories and reasoning about them, e.g. imposing distinctiveness of the description in the context of distractors. In this work, we propose an inference mechanism that extends an instance-level captioning model to generate coherent and informative descriptions for groups of visual objects from the same or different categories. We test our model in the domain of bird descriptions. We show that group-level descriptions generated by our method are (i) coherent, pulling together properties that are true for all or a majority of the group's instances, and (ii) informative, as they allow an external BERT-based text classifier to identify the target category more accurately in comparison to single-instance captions and are preferred by human evaluators.
Publication year
2022
Proceedings title
Proceedings of the 15th International Conference on Natural Language Generation
Page(s)
110-120
Conference
International Natural Language Generation Conference (INLG 2022)
Conference location
Waterville, Maine, USA and virtual meeting
Conference date
2022-07-18 – 2022-07-22
ISBN
978-1-955917-57-5
Page URI
https://pub.uni-bielefeld.de/record/2967312
Cite
Attari N, Schlangen D, Heckmann M, Wersing H, Zarrieß S. Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach. In: Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics; 2022: 110-120.
Attari, N., Schlangen, D., Heckmann, M., Wersing, H., & Zarrieß, S. (2022). Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach. Proceedings of the 15th International Conference on Natural Language Generation, 110-120. Stroudsburg, PA: Association for Computational Linguistics.
Attari, Nazia, Schlangen, David, Heckmann, Martin, Wersing, Heiko, and Zarrieß, Sina. 2022. “Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach”. In Proceedings of the 15th International Conference on Natural Language Generation, 110-120. Stroudsburg, PA: Association for Computational Linguistics.
Attari, N., Schlangen, D., Heckmann, M., Wersing, H., and Zarrieß, S. (2022). “Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach” in Proceedings of the 15th International Conference on Natural Language Generation (Stroudsburg, PA: Association for Computational Linguistics), 110-120.
Attari, N., et al., 2022. Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach. In Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics, pp. 110-120.
N. Attari, et al., “Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach”, Proceedings of the 15th International Conference on Natural Language Generation, Stroudsburg, PA: Association for Computational Linguistics, 2022, pp. 110-120.
Attari, N., Schlangen, D., Heckmann, M., Wersing, H., Zarrieß, S.: Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach. Proceedings of the 15th International Conference on Natural Language Generation. p. 110-120. Association for Computational Linguistics, Stroudsburg, PA (2022).
Attari, Nazia, Schlangen, David, Heckmann, Martin, Wersing, Heiko, and Zarrieß, Sina. “Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach”. Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics, 2022. 110-120.