A multimodal, multilabel approach to recognize emotions in oral history interviews
Viswanath A, Gref M, Hassan T, Schmidt C (2024)
In: Proceedings of the 12th International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE: 291–299.
Conference paper
| Published | English
Author(s)
Viswanath, Anargh;
Gref, Michael;
Hassan, Teena;
Schmidt, Christoph
Abstract / Notes
In this paper, we present a multilabel approach for multimodal emotion recognition in the challenging but unexplored use case of analyzing oral history interviews. Oral history is a methodological tool where historical research into the past is conducted through recorded video interviews that reflect the narrator’s personal experiences. The analysis of emotional content in oral history interviews is helpful in studies to understand trauma and historical memories of individuals. The emotions present in these interviews are subtle, natural, and complex. The lack of self-reported labels necessitates the use of observer-reported emotion labels. However, the complexity of the emotions as well as the diversity in perception of emotions makes it difficult to describe these emotions using single categories. Furthermore, unimodal analysis relying on facial expressions, vocal cues, or spoken words alone is inadequate given the multimodal nature of the narration. To address these challenges, this paper proposes a multilabel, multimodal approach to perform emotion recognition on the novel HdG dataset, consisting of German oral history interviews. The proposed approach utilizes the state-of-the-art Multimodal Transformer for the fusion of audio, textual, and visual features and extends the work to perform multilabel classification on the six Ekman emotion classes. Our approach achieves a mean AUC score of 0.74 and a mean balanced accuracy (balAcc) score of 0.70, significantly outperforming previous unimodal multiclass methods and setting a benchmark for future multimodal emotion recognition research in this domain.
Keywords
dililab
Year of publication
2024
Title of the conference proceedings
Proceedings of the 12th International Conference on Affective Computing and Intelligent Interaction (ACII)
Page(s)
291–299
Copyright / Licenses
Conference
12th International Conference on Affective Computing and Intelligent Interaction (ACII 2024)
Conference location
Glasgow, UK
Conference date
2024-09-15 – 2024-09-18
Page URI
https://pub.uni-bielefeld.de/record/2992507
Cite
Viswanath A, Gref M, Hassan T, Schmidt C. A multimodal, multilabel approach to recognize emotions in oral history interviews. In: Proceedings of the 12th International Conference on Affective Computing and Intelligent Interaction (ACII). IEEE; 2024: 291–299.
All files are available under the following license(s):
Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)
Full text(s)
Name
viswanath-etal-2024-ACII.pdf
247.33 KB
Access Level
Open Access
Last uploaded
2024-09-26T14:45:52Z
MD5 checksum
5a1a6a7b1be8f82e7403fd75a9c326b4