Generating Landmark-based Manipulation Instructions from Image Pairs

Zarrieß, Sina; Voigt, Henrik; Schlangen, David; Sadler, Philipp

Zarrieß S, Voigt H, Schlangen D, Sadler P (2022)
In: Proceedings of the 15th International Conference on Natural Language Generation. Shaikh S, Ferreira T, Stent A (Eds); Stroudsburg, PA: Association for Computational Linguistics: 203-211.

Conference paper | Published | English

Download

No files have been uploaded. Publication record only.

URL

https://aclanthology.org/2022.inlg-main.16.pdf

Author(s)

Zarrieß, Sina (UniBi); Voigt, Henrik (UniBi); Schlangen, David; Sadler, Philipp

Editor(s)

Shaikh, Samira; Ferreira, Thiago; Stent, Amanda

Department

Fakultät für Linguistik und Literaturwissenschaft > Arbeitsgruppe Angewandte Computerlinguistik

Abstract / Note

We investigate the problem of generating landmark-based manipulation instructions (e.g. "move the blue block so that it touches the red block on the right") from image pairs showing a before and an after state in a visual scene. We present a transformer model with difference attention heads that learns to attend to target and landmark objects in consecutive images via a difference key. Our model outperforms the state of the art for instruction generation on the BLOCKS dataset and particularly improves the accuracy of generated target and landmark references. Furthermore, our model outperforms state-of-the-art models on a difference spotting dataset.

Year of publication

2022

Title of the proceedings

Proceedings of the 15th International Conference on Natural Language Generation

Page(s)

203-211

Conference

International Natural Language Generation Conference (INLG 2022)

Conference location

Waterville, Maine, USA

Conference date

2022-07-18 – 2022-07-22

ISBN

978-1-955917-57-5

Page URI

https://pub.uni-bielefeld.de/record/2967313

Cite

Zarrieß S, Voigt H, Schlangen D, Sadler P. Generating Landmark-based Manipulation Instructions from Image Pairs. In: Shaikh S, Ferreira T, Stent A, eds. Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics; 2022: 203-211.

Zarrieß, S., Voigt, H., Schlangen, D., & Sadler, P. (2022). Generating Landmark-based Manipulation Instructions from Image Pairs. In S. Shaikh, T. Ferreira, & A. Stent (Eds.), Proceedings of the 15th International Conference on Natural Language Generation (pp. 203-211). Stroudsburg, PA: Association for Computational Linguistics.

Zarrieß, Sina, Voigt, Henrik, Schlangen, David, and Sadler, Philipp. 2022. “Generating Landmark-based Manipulation Instructions from Image Pairs”. In Proceedings of the 15th International Conference on Natural Language Generation, ed. Samira Shaikh, Thiago Ferreira, and Amanda Stent, 203-211. Stroudsburg, PA: Association for Computational Linguistics.

Zarrieß, S., Voigt, H., Schlangen, D., and Sadler, P. (2022). “Generating Landmark-based Manipulation Instructions from Image Pairs” in Proceedings of the 15th International Conference on Natural Language Generation, Shaikh, S., Ferreira, T., and Stent, A. eds. (Stroudsburg, PA: Association for Computational Linguistics), 203-211.

Zarrieß, S., et al., 2022. Generating Landmark-based Manipulation Instructions from Image Pairs. In S. Shaikh, T. Ferreira, & A. Stent, eds. Proceedings of the 15th International Conference on Natural Language Generation. Stroudsburg, PA: Association for Computational Linguistics, pp. 203-211.

S. Zarrieß, et al., “Generating Landmark-based Manipulation Instructions from Image Pairs”, Proceedings of the 15th International Conference on Natural Language Generation, S. Shaikh, T. Ferreira, and A. Stent, eds., Stroudsburg, PA: Association for Computational Linguistics, 2022, pp. 203-211.

Zarrieß, S., Voigt, H., Schlangen, D., Sadler, P.: Generating Landmark-based Manipulation Instructions from Image Pairs. In: Shaikh, S., Ferreira, T., and Stent, A. (eds.) Proceedings of the 15th International Conference on Natural Language Generation. p. 203-211. Association for Computational Linguistics, Stroudsburg, PA (2022).

Zarrieß, Sina, Voigt, Henrik, Schlangen, David, and Sadler, Philipp. “Generating Landmark-based Manipulation Instructions from Image Pairs”. Proceedings of the 15th International Conference on Natural Language Generation. Ed. Samira Shaikh, Thiago Ferreira, and Amanda Stent. Stroudsburg, PA: Association for Computational Linguistics, 2022. 203-211.

Link(s) to full text(s)

URL

https://aclanthology.org/2022.inlg-main.16.pdf

Access Level

Open Access
