Mapping the Audio Landscape for Innovative Music Sample Generation
Limberg C, Zhang Z, Gurrin C, Kongkachandra R, Schoeffmann K, Dang-Nguyen D-T, Rossetto L, Satoh S, Zhou L (2024)
In: Proceedings of the 2024 International Conference on Multimedia Retrieval. New York, NY, USA: ACM: 1207-1213.
Conference paper
| Published | English
Author(s)
Limberg, Christian (UniBi);
Zhang, Zhe;
Gurrin, Cathal;
Kongkachandra, Rachada;
Schoeffmann, Klaus;
Dang-Nguyen, Duc-Tien;
Rossetto, Luca;
Satoh, Shin'ichi;
Zhou, Liting
Abstract / Note
This paper introduces the Generative Sample Map (GESAM), a novel two-stage unsupervised learning framework capable of generating high-quality and expressive audio samples for music production. Recent generative approaches based on language models rely on text prompts as conditions. However, the fine nuances of musical audio samples can hardly be described in text. To address this shortcoming, we propose learning a highly descriptive 2D latent audio map with a Variational Autoencoder (VAE), which is then used to condition a Transformer model. We demonstrate the Transformer model's ability to achieve high generation quality and compare its performance against two baseline models. Because the map compresses the manifold of the audio training set into 2D, users can condition generation simply by selecting points on it, which enables a more natural interaction with the model. We showcase this capability through an interactive demo interface, accessible on our website at https://limchr.github.io/gesam/
Publication year
2024
Proceedings title
Proceedings of the 2024 International Conference on Multimedia Retrieval
Page(s)
1207-1213
Conference
ICMR '24: International Conference on Multimedia Retrieval
Conference location
Phuket, Thailand
Conference date
2024-06-10 – 2024-06-14
ISBN
9798400706196
Page URI
https://pub.uni-bielefeld.de/record/2990342
Cite
Limberg C, Zhang Z, Gurrin C, et al. Mapping the Audio Landscape for Innovative Music Sample Generation. In: Proceedings of the 2024 International Conference on Multimedia Retrieval. New York, NY, USA: ACM; 2024: 1207-1213.
Limberg, C., Zhang, Z., Gurrin, C., Kongkachandra, R., Schoeffmann, K., Dang-Nguyen, D.-T., Rossetto, L., et al. (2024). Mapping the Audio Landscape for Innovative Music Sample Generation. Proceedings of the 2024 International Conference on Multimedia Retrieval, 1207-1213. New York, NY, USA: ACM. https://doi.org/10.1145/3652583.3657586
Limberg, Christian, Zhang, Zhe, Gurrin, Cathal, Kongkachandra, Rachada, Schoeffmann, Klaus, Dang-Nguyen, Duc-Tien, Rossetto, Luca, Satoh, Shin'ichi, and Zhou, Liting. 2024. “Mapping the Audio Landscape for Innovative Music Sample Generation”. In Proceedings of the 2024 International Conference on Multimedia Retrieval, 1207-1213. New York, NY, USA: ACM.
Limberg, C., Zhang, Z., Gurrin, C., Kongkachandra, R., Schoeffmann, K., Dang-Nguyen, D.-T., Rossetto, L., Satoh, S., and Zhou, L. (2024). “Mapping the Audio Landscape for Innovative Music Sample Generation” in Proceedings of the 2024 International Conference on Multimedia Retrieval (New York, NY, USA: ACM), 1207-1213.
Limberg, C., et al., 2024. Mapping the Audio Landscape for Innovative Music Sample Generation. In Proceedings of the 2024 International Conference on Multimedia Retrieval. New York, NY, USA: ACM, pp. 1207-1213.
C. Limberg, et al., “Mapping the Audio Landscape for Innovative Music Sample Generation”, Proceedings of the 2024 International Conference on Multimedia Retrieval, New York, NY, USA: ACM, 2024, pp. 1207-1213.
Limberg, C., Zhang, Z., Gurrin, C., Kongkachandra, R., Schoeffmann, K., Dang-Nguyen, D.-T., Rossetto, L., Satoh, S., Zhou, L.: Mapping the Audio Landscape for Innovative Music Sample Generation. Proceedings of the 2024 International Conference on Multimedia Retrieval. p. 1207-1213. ACM, New York, NY, USA (2024).
Limberg, Christian, Zhang, Zhe, Gurrin, Cathal, Kongkachandra, Rachada, Schoeffmann, Klaus, Dang-Nguyen, Duc-Tien, Rossetto, Luca, Satoh, Shin'ichi, and Zhou, Liting. “Mapping the Audio Landscape for Innovative Music Sample Generation”. Proceedings of the 2024 International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2024. 1207-1213.