
Audio and Speech Processing

Authors and titles for recent submissions, skipping first 9

Total of 23 entries; this page lists entries 10-19.

Wed, 3 Dec 2025

[10]  arXiv:2512.02891 [pdf, ps, other]
Title: Perceptual evaluation of Acoustic Level of Detail in Virtual Acoustic Environments
Comments: This work has been submitted to Acoustics for possible publication. Template provided by MDPI
Subjects: Audio and Speech Processing (eess.AS)
[11]  arXiv:2512.02759 [pdf, ps, other]
Title: Towards Language-Independent Face-Voice Association with Multimodal Foundation Models
Comments: This paper presents the system description of the UZH-CL team for the FAME2026 Challenge at ICASSP 2026. Our model achieved second place in the final ranking
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Image and Video Processing (eess.IV)
[12]  arXiv:2512.02027 [pdf, ps, other]
Title: On the Difficulty of Token-Level Modeling of Dysfluency and Fluency Shaping Artifacts
Comments: 6 pages, 1 figure. Accepted to ASRU 2025. This is the arXiv preprint of the accepted paper
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[13]  arXiv:2512.02650 (cross-list from cs.CV) [pdf, ps, other]
Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:2512.02593 (cross-list from cs.CL) [pdf, ps, other]
Title: Spoken Conversational Agents with Large Language Models
Comments: Accepted to EMNLP 2025 Tutorial
Subjects: Computation and Language (cs.CL); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Tue, 2 Dec 2025

[15]  arXiv:2512.01466 [pdf, ps, other]
Title: Identifiability Conditions for Acoustic Feedback Cancellation with the Two-Channel Adaptive Feedback Canceller Algorithm
Comments: Accepted for publication in IEEE Open Journal of Signal Processing (OJSP)
Subjects: Audio and Speech Processing (eess.AS)
[16]  arXiv:2512.00937 [pdf, ps, other]
Title: Arabic TTS with FastPitch: Reproducible Baselines, Adversarial Training, and Oversmoothing Analysis
Authors: Lars Nippert
Subjects: Audio and Speech Processing (eess.AS)
[17]  arXiv:2512.00511 [pdf, ps, other]
Title: A Low-Complexity Speech Codec Using Parametric Dithering for ASR
Comments: 10 pages, 8 figures, Accepted 2026 Data Compression Conference
Subjects: Audio and Speech Processing (eess.AS)
[18]  arXiv:2512.00482 [pdf, ps, other]
Title: Beyond Performance: Probing Representation Dynamics In Speech Enhancement Models
Subjects: Audio and Speech Processing (eess.AS)

Mon, 1 Dec 2025 (showing first 1 of 5 entries)

[19]  arXiv:2511.23098 [pdf, ps, other]
Title: Group-Aware Partial Model Merging for Children's Automatic Speech Recognition
Comments: IEEE ASRU 2025 Workshop AI4CSL
Subjects: Audio and Speech Processing (eess.AS)
