Sound

Authors and titles for recent submissions, skipping first 15

[ total of 48 entries: 1-10 | 6-15 | 16-25 | 26-35 | 36-45 | 46-48 ]
[ showing 10 entries per page: fewer | more | all ]

Wed, 3 Dec 2025 (showing first 10 of 12 entries)

[16]  arXiv:2512.02783 [pdf, ps, other]
Title: Exploring Definitions of Quality and Diversity in Sonic Measurement Spaces
Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE)
[17]  arXiv:2512.02669 [pdf, ps, other]
Title: SAND Challenge: Four Approaches for Dysartria Severity Classification
Comments: 7 pages, 5 figures
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[18]  arXiv:2512.02652 [pdf, ps, other]
Title: Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[19]  arXiv:2512.02523 [pdf, ps, other]
Title: Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation
Comments: 16 pages, 5 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[20]  arXiv:2512.02515 [pdf, ps, other]
Title: VibOmni: Towards Scalable Bone-conduction Speech Enhancement on Earables
Comments: Submitted to TMC
Subjects: Sound (cs.SD)
[21]  arXiv:2512.02432 [pdf, ps, other]
Title: Continual Learning for Singing Voice Separation with Human in the Loop Adaptation
Comments: Proceedings of the 26th International Symposium on Frontiers of Research in Speech and Music, 2021
Subjects: Sound (cs.SD)
[22]  arXiv:2512.02192 [pdf, ps, other]
Title: Story2MIDI: Emotionally Aligned Music Generation from Text
Comments: 8 pages (6 pages of main text + 2 pages of references and appendices), 4 figures, 1 table. Presented at IEEE Big Data 2025 3rd Workshop on AI Music Generation (AIMG 2025)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[23]  arXiv:2512.02759 (cross-list from eess.AS) [pdf, ps, other]
Title: Towards Language-Independent Face-Voice Association with Multimodal Foundation Models
Comments: This paper presents the system description of the UZH-CL team for the FAME2026 Challenge at ICASSP 2026. Our model achieved second place in the final ranking
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)
[24]  arXiv:2512.02650 (cross-list from cs.CV) [pdf, ps, other]
Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[25]  arXiv:2512.02593 (cross-list from cs.CL) [pdf, ps, other]
Title: Spoken Conversational Agents with Large Language Models
Comments: Accepted to EMNLP 2025 Tutorial
Subjects: Computation and Language (cs.CL); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)