
Sound

Authors and titles for recent submissions, skipping first 17

[ total of 51 entries: 1-10 | 8-17 | 18-27 | 28-37 | 38-47 | 48-51 ]

Tue, 9 Dec 2025 (continued, showing last 2 of 19 entries)

[18]  arXiv:2512.06304 (cross-list from eess.AS)
Title: Degrading Voice: A Comprehensive Overview of Robust Voice Conversion Through Input Manipulation
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Sound (cs.SD)
[19]  arXiv:2512.05994 (cross-list from eess.AS)
Title: KidSpeak: A General Multi-purpose LLM for Kids' Speech Recognition and Screening
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Sound (cs.SD)

Mon, 8 Dec 2025

[20]  arXiv:2512.05592
Title: The T12 System for AudioMOS Challenge 2025: Audio Aesthetics Score Prediction System Using KAN- and VERSA-based Models
Comments: Accepted by IEEE ASRU 2025
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[21]  arXiv:2512.05508
Title: Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction
Comments: 8 pages
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[22]  arXiv:2512.05528 (cross-list from q-bio.NC)
Title: Decoding Selective Auditory Attention to Musical Elements in Ecologically Valid Music Listening
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[23]  arXiv:2512.05201 (cross-list from cs.NI)
Title: MuMeNet: A Network Simulator for Musical Metaverse Communications
Comments: To appear in 2025 IEEE 6th International Symposium on the Internet of Sounds (IS2) proceedings
Subjects: Networking and Internet Architecture (cs.NI); Sound (cs.SD)
[24]  arXiv:2512.05126 (cross-list from eess.AS)
Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025 (showing first 3 of 10 entries)

[25]  arXiv:2512.04847
Title: Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[26]  arXiv:2512.04827
Title: Contract-Driven QoE Auditing for Speech and Singing Services: From MOS Regression to Service Graphs
Authors: Wenzhang Du
Comments: 11 pages, 3 figures
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[27]  arXiv:2512.04814
Title: Shared Multi-modal Embedding Space for Face-Voice Association
Comments: Ranked 1st in Fame 2026 Challenge, ICASSP
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
