Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach

Attari N, Schlangen D, Heckmann M, Wersing H, Zarrieß S (2022)
In: Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics: 110-120.

Conference paper | Published | English
 
Authors
Attari, Nazia (UniBi); Schlangen, David; Heckmann, Martin; Wersing, Heiko; Zarrieß, Sina (UniBi)
Abstract / Note
State-of-the-art image captioning models achieve very good performance in generating descriptions for instances of visual categories and reasoning about them, e.g. imposing distinctiveness of the description in the context of distractors. In this work, we propose an inference mechanism that extends an instance-level captioning model to generate coherent and informative descriptions for groups of visual objects from the same or different categories. We test our model in the domain of bird descriptions. We show that group-level descriptions generated by our method are (i) coherent, pulling together properties that are true for all or the majority of the group's instances, and (ii) informative, as they allow an external BERT-based text classifier to identify the target category more accurately than single-instance captions do and are preferred by human evaluators.
Year of publication
2022
Title of the proceedings
Proceedings of the 15th International Conference on Natural Language Generation
Page(s)
110-120
Conference
International Natural Language Generation Conference (INLG 2022)
Conference location
Waterville, Maine, USA and virtual meeting
Conference date
2022-07-18 – 2022-07-22
ISBN
978-1-955917-57-5
Page URI
https://pub.uni-bielefeld.de/record/2967312


Access level (full text)
Open Access
