Sound

Authors and titles for recent submissions

[ total of 97 entries: 1-25 | 26-50 | 51-75 | 76-97 ]

Wed, 4 Feb 2026

[1]  arXiv:2602.03817 [pdf, ps, other]
Title: Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[2]  arXiv:2602.03549 [pdf, ps, other]
Title: EarResp-ANS: Audio-Based On-Device Respiration Rate Estimation on Earphones with Adaptive Noise Suppression
Comments: 31 pages, 11 figures
Subjects: Sound (cs.SD); Human-Computer Interaction (cs.HC)
[3]  arXiv:2602.03523 [pdf, ps, other]
Title: D3PIA: A Discrete Denoising Diffusion Model for Piano Accompaniment Generation From Lead sheet
Comments: Accepted at 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[4]  arXiv:2602.03420 [pdf, ps, other]
Title: CoCoEmo: Composable and Controllable Human-Like Emotional TTS via Activation Steering
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[5]  arXiv:2602.03355 [pdf, ps, other]
Title: PACE: Pretrained Audio Continual Learning
Comments: Accepted at ICLR 2026
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[6]  arXiv:2602.03307 [src]
Title: GRAM: Spatial general-purpose audio representations for real-world environments
Comments: I have accidentally uploaded a revised version of my old paper. I meant to revise arXiv:2506.00934 rather than upload a new version
Subjects: Sound (cs.SD)
[7]  arXiv:2602.03023 [pdf, ps, other]
Title: Rethinking Music Captioning with Music Metadata LLMs
Comments: Accepted to ICASSP 2026
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[8]  arXiv:2602.02955 [pdf, ps, other]
Title: Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation
Comments: 5 pages, 1 figure
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[9]  arXiv:2602.02738 [pdf, ps, other]
Title: When Noise Lowers The Loss: Rethinking Likelihood-Based Evaluation in Music Large Language Models
Comments: Accepted by IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2026
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[10]  arXiv:2602.02591 [pdf, ps, other]
Title: VividVoice: A Unified Framework for Scene-Aware Visually-Driven Speech Synthesis
Comments: Accepted by ICASSP 2026
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[11]  arXiv:2602.03624 (cross-list from eess.SP) [pdf, ps, other]
Title: A Multi-decoder Neural Tracking Method for Accurately Predicting Speech Intelligibility
Subjects: Signal Processing (eess.SP); Sound (cs.SD)
[12]  arXiv:2602.02725 (cross-list from cs.LG) [pdf, ps, other]
Title: Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
Comments: Accepted to 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[13]  arXiv:2602.02557 (cross-list from cs.LG) [pdf, ps, other]
Title: The Alignment Curse: Cross-Modality Jailbreak Transfer in Omni-Models
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Sound (cs.SD)

Tue, 3 Feb 2026 (showing first 12 of 29 entries)

[14]  arXiv:2602.02413 [pdf, ps, other]
Title: Masked Autoencoders as Universal Speech Enhancer
Subjects: Sound (cs.SD); Machine Learning (cs.LG)
[15]  arXiv:2602.02286 [pdf, ps, other]
Title: DFKI-Speech System for WildSpoof Challenge: A robust framework for SASV In-the-Wild
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16]  arXiv:2602.01908 [pdf, ps, other]
Title: LipSody: Lip-to-Speech Synthesis with Enhanced Prosody Consistency
Comments: This paper has been accepted to ICASSP 2026
Subjects: Sound (cs.SD)
[17]  arXiv:2602.01879 [pdf, ps, other]
Title: Speaking Without Sound: Multi-speaker Silent Speech Voicing with Facial Inputs Only
Comments: This paper was presented at ICASSP 2025
Subjects: Sound (cs.SD)
[18]  arXiv:2602.01793 [pdf, ps, other]
Title: ParaGSE: Parallel Generative Speech Enhancement with Group-Vector-Quantization-based Neural Speech Codec
Authors: Fei Liu, Yang Ai
Comments: Accepted by ICASSP 2026
Subjects: Sound (cs.SD)
[19]  arXiv:2602.01727 [pdf, ps, other]
Title: Voting-based Pitch Estimation with Temporal and Frequential Alignment and Correlation Aware Selection
Comments: Accepted for ICASSP 2026
Subjects: Sound (cs.SD)
[20]  arXiv:2602.01645 [pdf, ps, other]
Title: Membership Inference Attack Against Music Diffusion Models via Generative Manifold Perturbation
Subjects: Sound (cs.SD)
[21]  arXiv:2602.01547 [pdf, ps, other]
Title: Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition
Comments: Accepted to 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:2602.01363 [pdf, ps, other]
Title: Causally Disentangled Contrastive Learning for Multilingual Speaker Embeddings
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[23]  arXiv:2602.01060 [pdf, ps, other]
Title: TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection
Comments: Accepted by ICASSP 2026
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[24]  arXiv:2602.01032 [pdf, ps, other]
Title: HierCon: Hierarchical Contrastive Attention for Audio Deepfake Detection
Comments: Proceedings of The Web Conference 2026 (WWW'26), short track
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[25]  arXiv:2602.00744 [pdf, ps, other]
Title: ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation
Subjects: Sound (cs.SD)