Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach

Attari, Nazia; Schlangen, David; Heckmann, Martin; Wersing, Heiko; Zarrieß, Sina

Attari N, Schlangen D, Heckmann M, Wersing H, Zarrieß S (2022)
In: Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics: 110-120.

Conference paper | Published | English

URL

https://aclanthology.org/2022.inlg-main.9/

Authors

Attari, Nazia (UniBi); Schlangen, David; Heckmann, Martin; Wersing, Heiko; Zarrieß, Sina (UniBi)

Department

Faculty of Linguistics and Literary Studies > Applied Computational Linguistics Group

Abstract / Note

State-of-the-art image captioning models achieve very good performance in generating descriptions for instances of visual categories and reasoning about them, e.g. imposing distinctiveness of the description in the context of distractors. In this work, we propose an inference mechanism that extends an instance-level captioning model to generate coherent and informative descriptions for groups of visual objects from the same or different categories. We test our model in the domain of bird descriptions. We show that group-level descriptions generated by our method are (i) coherent, pulling together properties that are true for all or the majority of the group's instances, and (ii) informative, as they allow an external BERT-based text classifier to identify the target category more accurately than single-instance captions do and are preferred by human evaluators.

Year of publication

2022

Proceedings title

Proceedings of the 15th International Conference on Natural Language Generation

Page(s)

110-120

Conference

International Natural Language Generation Conference (INLG 2022)

Conference location

Waterville, Maine, USA and virtual meeting

Conference date

2022-07-18 – 2022-07-22

ISBN

978-1-955917-57-5

Page URI

https://pub.uni-bielefeld.de/record/2967312

Cite

Attari N, Schlangen D, Heckmann M, Wersing H, Zarrieß S. Generating Coherent and Informative Descriptions for Groups of Visual Objects and Categories: A Simple Decoding Approach. In: Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics; 2022: 110-120.

Access Level

Open Access
