From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain

Rojowiec R, Götze J, Sadler P, Voigt H, Zarrieß S, Schlangen D (2020)
In: Proceedings of the 13th International Conference on Natural Language Generation. Dublin, Ireland: Association for Computational Linguistics: 316-326.

Conference paper | English
 
Download
No files have been uploaded. Publication record only.
Author(s)
Rojowiec, Robin; Götze, Jana; Sadler, Philipp; Voigt, Henrik; Zarrieß, Sina; Schlangen, David
Abstract / Note
While certain types of instructions can be compactly expressed via images, there are situations where one might want to verbalise them, for example when directing someone. We investigate the task of Instruction Generation from Before/After Image Pairs, which is to derive from images an instruction for effecting the implied change. For this, we make use of prior work on instruction following in a visual environment. We take an existing dataset, the BLOCKS data collected by Bisk et al. (2016), and investigate whether it is suitable for training an instruction generator as well. We find that it is, and investigate several simple baselines, taking these from the related task of image captioning. Through a series of experiments that simplify the task (by making image processing easier or completely side-stepping it; and by creating template-based targeted instructions), we investigate areas for improvement. We find that captioning models get some way towards solving the task, but have some difficulty with it, and future improvements must lie in the way the change is detected in the instruction.
Year of publication
2020
Title of the conference proceedings
Proceedings of the 13th International Conference on Natural Language Generation
Page(s)
316-326
Page URI
https://pub.uni-bielefeld.de/record/2955806

Cite

Rojowiec R, Götze J, Sadler P, Voigt H, Zarrieß S, Schlangen D. From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain. In: Proceedings of the 13th International Conference on Natural Language Generation. Dublin, Ireland: Association for Computational Linguistics; 2020: 316-326.
Rojowiec, R., Götze, J., Sadler, P., Voigt, H., Zarrieß, S., & Schlangen, D. (2020). From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain. Proceedings of the 13th International Conference on Natural Language Generation, 316-326. Dublin, Ireland: Association for Computational Linguistics.
Rojowiec, Robin, Götze, Jana, Sadler, Philipp, Voigt, Henrik, Zarrieß, Sina, and Schlangen, David. 2020. “From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain”. In Proceedings of the 13th International Conference on Natural Language Generation, 316-326. Dublin, Ireland: Association for Computational Linguistics.
Rojowiec, R., Götze, J., Sadler, P., Voigt, H., Zarrieß, S., and Schlangen, D. (2020). “From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain” in Proceedings of the 13th International Conference on Natural Language Generation (Dublin, Ireland: Association for Computational Linguistics), 316-326.
Rojowiec, R., et al., 2020. From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain. In Proceedings of the 13th International Conference on Natural Language Generation. Dublin, Ireland: Association for Computational Linguistics, pp. 316-326.
R. Rojowiec, et al., “From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain”, Proceedings of the 13th International Conference on Natural Language Generation, Dublin, Ireland: Association for Computational Linguistics, 2020, pp. 316-326.
Rojowiec, R., Götze, J., Sadler, P., Voigt, H., Zarrieß, S., Schlangen, D.: From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain. Proceedings of the 13th International Conference on Natural Language Generation. p. 316-326. Association for Computational Linguistics, Dublin, Ireland (2020).
Rojowiec, Robin, Götze, Jana, Sadler, Philipp, Voigt, Henrik, Zarrieß, Sina, and Schlangen, David. “From "Before" to "After": Generating Natural Language Instructions from Image Pairs in a Simple Visual Domain”. Proceedings of the 13th International Conference on Natural Language Generation. Dublin, Ireland: Association for Computational Linguistics, 2020. 316-326.

Link(s) to full text(s)
Access Level
OA Open Access
