Audio and Speech Processing

Authors and titles for recent submissions

[ total of 78 entries: 1-25 | 26-50 | 51-75 | 76-78 ]

Wed, 4 Feb 2026

[1]  arXiv:2602.03762 [pdf, ps, other]
Title: Conditional Flow Matching for Visually-Guided Acoustic Highlighting
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[2]  arXiv:2602.03398 [pdf, ps, other]
Title: A Unified SVD-Modal Solution for Sparse Sound Field Reconstruction with Hybrid Spherical-Linear Microphone Arrays
Comments: Accepted by ICASSP 2026
Subjects: Audio and Speech Processing (eess.AS)
[3]  arXiv:2602.03245 [pdf, ps, other]
Title: Mići Princ -- A Little Boy Teaching Speech Technologies the Chakavian Dialect
Comments: 2 figures, 14 pages, accepted and presented at JTDH 2024
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL)
[4]  arXiv:2602.02980 [pdf, ps, other]
Title: WST-X Series: Wavelet Scattering Transform for Interpretable Speech Deepfake Detection
Comments: Submitted to IEEE Signal Processing Letters
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Signal Processing (eess.SP)
[5]  arXiv:2602.02734 [pdf, ps, other]
[6]  arXiv:2602.02725 (cross-list from cs.LG) [pdf, ps, other]
Title: Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
Comments: Accepted to 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)
Subjects: Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)

Tue, 3 Feb 2026 (showing first 19 of 24 entries)

[7]  arXiv:2602.01861 [pdf, ps, other]
Title: RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses
Comments: Accepted to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2026. Equal contribution: Shaoheng Xu and Chunyi Sun
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)
[8]  arXiv:2602.01758 [pdf, ps, other]
Title: Short-wave admittance correction for a time-domain cochlear transmission line model
Comments: 22 pages, 7 figures
Subjects: Audio and Speech Processing (eess.AS); Biological Physics (physics.bio-ph)
[9]  arXiv:2602.01722 [pdf, ps, other]
Title: Joint Optimization of ASV and CM tasks: BTUEF Team's Submission for WildSpoof Challenge
Subjects: Audio and Speech Processing (eess.AS)
[10]  arXiv:2602.01634 [pdf, ps, other]
Title: HuPER: A Human-Inspired Framework for Phonetic Perception
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI)
[11]  arXiv:2602.01394 [pdf, ps, other]
Title: SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[12]  arXiv:2602.01008 [pdf, ps, other]
Title: Adapting Where It Matters: Depth-Aware Adaptation for Efficient Multilingual Speech Recognition in Low-Resource Languages
Comments: 13 pages
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[13]  arXiv:2602.00652 [pdf, ps, other]
Title: Solving Room Impulse Response Inverse Problems Using Flow Matching with Analytic Wiener Denoiser
Comments: Submitted to the Journal of the Acoustical Society of America (JASA)
Subjects: Audio and Speech Processing (eess.AS)
[14]  arXiv:2602.00648 [pdf, ps, other]
Title: High-Fidelity Generative Audio Compression at 0.275kbps
Comments: Technical Report
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[15]  arXiv:2602.02198 (cross-list from cs.CR) [pdf, ps, other]
Title: QuietPrint: Protecting 3D Printers Against Acoustic Side-Channel Attacks
Subjects: Cryptography and Security (cs.CR); Audio and Speech Processing (eess.AS)
[16]  arXiv:2602.01547 (cross-list from cs.SD) [pdf, ps, other]
Title: Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition
Comments: Accepted to 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[17]  arXiv:2602.01363 (cross-list from cs.SD) [pdf, ps, other]
Title: Causally Disentangled Contrastive Learning for Multilingual Speaker Embeddings
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[18]  arXiv:2602.01249 (cross-list from eess.SP) [pdf, ps, other]
Title: Generative AI in Signal Processing Education: An Audio Foundation Model Based Approach
Comments: Accepted at IEEE EDUCON 2026
Subjects: Signal Processing (eess.SP); Audio and Speech Processing (eess.AS)
[19]  arXiv:2602.01060 (cross-list from cs.SD) [pdf, ps, other]
Title: TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection
Comments: Accepted by ICASSP 2026
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[20]  arXiv:2602.01032 (cross-list from cs.SD) [pdf, ps, other]
Title: HierCon: Hierarchical Contrastive Attention for Audio Deepfake Detection
Comments: Proceedings of The Web Conference 2026 (WWW'26), short track
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[21]  arXiv:2602.01030 (cross-list from cs.CL) [pdf, ps, other]
Title: Bias in the Ear of the Listener: Assessing Sensitivity in Audio Language Models Across Linguistic, Demographic, and Positional Variations
Comments: Accepted as a long findings paper at EACL 2026
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[22]  arXiv:2602.00914 (cross-list from cs.CL) [pdf, ps, other]
Title: A Baseline Multimodal Approach to Emotion Recognition in Conversations
Comments: 10 pages
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[23]  arXiv:2602.00604 (cross-list from cs.SD) [pdf, ps, other]
Title: The TMU System for the XACLE Challenge: Training Large Audio Language Models with CLAP Pseudo-Labels
Comments: 3 pages; 2 figures; 2 tables; Accepted at ICASSP 2026 Workshop (SP Grand Challenges, GC-12: XACLE)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[24]  arXiv:2602.00594 (cross-list from cs.CL) [pdf, ps, other]
Title: Kanade: A Simple Disentangled Tokenizer for Spoken Language Modeling
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[25]  arXiv:2602.00568 (cross-list from cs.SD) [pdf, ps, other]
Title: Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)