Sound

Authors and titles for recent submissions, skipping first 20

[ total of 48 entries: 1-10 | 11-20 | 21-30 | 31-40 | 41-48 ]

Wed, 3 Dec 2025 (continued, showing last 7 of 12 entries)

[21] arXiv:2512.02432: Title: Continual Learning for Singing Voice Separation with Human in the Loop Adaptation

Authors: Ankur Gupta, Anshul Rai, Archit Bansal, Vipul Arora

Comments: Proceedings of the 26th International Symposium on Frontiers of Research in Speech and Music, 2021

Subjects: Sound (cs.SD)

[22] arXiv:2512.02192: Title: Story2MIDI: Emotionally Aligned Music Generation from Text

Authors: Mohammad Shokri, Alexandra C. Salem, Gabriel Levine, Johanna Devaney, Sarah Ita Levitan

Comments: 8 pages (6 pages of main text + 2 pages of references and appendices), 4 figures, 1 table. Presented at IEEE Big Data 2025 3rd Workshop on AI Music Generation (AIMG 2025)

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

[23] arXiv:2512.02759 (cross-list from eess.AS): Title: Towards Language-Independent Face-Voice Association with Multimodal Foundation Models

Authors: Aref Farhadipour, Teodora Vukovic, Volker Dellwo

Comments: This paper presents the system description of the UZH-CL team for the FAME2026 Challenge at ICASSP 2026. Our model achieved second place in the final ranking

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)

[24] arXiv:2512.02650 (cross-list from cs.CV): Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation

Authors: Junwon Lee, Juhan Nam, Jiyoung Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)

[25] arXiv:2512.02593 (cross-list from cs.CL): Title: Spoken Conversational Agents with Large Language Models

Authors: Chao-Han Huck Yang, Andreas Stolcke, Larry Heck

Comments: Accepted to EMNLP 2025 Tutorial

Subjects: Computation and Language (cs.CL); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)

[26] arXiv:2512.02206 (cross-list from cs.LG): Title: WhAM: Towards A Translative Model of Sperm Whale Vocalization

Authors: Orr Paradise, Pranav Muralikrishnan, Liangyuan Chen, Hugo Flores García, Bryan Pardo, Roee Diamant, David F. Gruber, Shane Gero, Shafi Goldwasser

Comments: NeurIPS 2025

Subjects: Machine Learning (cs.LG); Sound (cs.SD)

[27] arXiv:2512.02074 (cross-list from cs.CL): Title: Dialect Identification Using Resource-Efficient Fine-Tuning Approaches

Authors: Zirui Lin, Haris Gulzar, Monnika Roslianna Busto, Akiko Masaki, Takeharu Eda, Kazuhiro Nakadai

Comments: Published in APSIPA ASC 2025

Subjects: Computation and Language (cs.CL); Sound (cs.SD)

Tue, 2 Dec 2025 (showing first 3 of 12 entries)

[28] arXiv:2512.01626: Title: Parallel Delayed Memory Units for Enhanced Temporal Modeling in Biomedical and Bioacoustic Signal Analysis

Authors: Pengfei Sun, Wenyu Jiang, Paul Devos, Dick Botteldooren

Comments: Accepted for publication in IEEE Transactions on Audio, Speech and Language Processing, 2025

Journal-ref: IEEE Transactions on Audio, Speech and Language Processing, 2025

Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE)

[29] arXiv:2512.01559: Title: LLM2Fx-Tools: Tool Calling For Music Post-Production

Authors: Seungheon Doh, Junghyun Koo, Marco A. Martínez-Ramírez, Woosung Choi, Wei-Hsiang Liao, Qiyu Wu, Juhan Nam, Yuki Mitsufuji

Subjects: Sound (cs.SD)

[30] arXiv:2512.01537: Title: Q2D2: A Geometry-Aware Audio Codec Leveraging Two-Dimensional Quantization

Authors: Tal Shuster, Eliya Nachmani

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP)
