Sound

Authors and titles for recent submissions, skipping first 32

[ total of 44 entries: 1-50 | 33-44 ]

Tue, 2 Dec 2025

[33]  arXiv:2512.01626 [pdf, ps, other]
Title: Parallel Delayed Memory Units for Enhanced Temporal Modeling in Biomedical and Bioacoustic Signal Analysis
Comments: Accepted for publication in IEEE Transactions on Audio, Speech and Language Processing, 2025
Journal-ref: IEEE Transactions on Audio, Speech and Language Processing, 2025
Subjects: Sound (cs.SD); Neural and Evolutionary Computing (cs.NE)
[34]  arXiv:2512.01559 [pdf, ps, other]
Title: LLM2Fx-Tools: Tool Calling For Music Post-Production
Subjects: Sound (cs.SD)
[35]  arXiv:2512.01537 [pdf, ps, other]
Title: Q2D2: A Geometry-Aware Audio Codec Leveraging Two-Dimensional Quantization
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP)
[36]  arXiv:2512.00621 [pdf, ps, other]
Title: Melody or Machine: Detecting Synthetic Music with Dual-Stream Contrastive Learning
Comments: Accepted at Transactions on Machine Learning Research (TMLR)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[37]  arXiv:2512.00563 [pdf, ps, other]
Title: Explainable Multi-Modal Deep Learning for Automatic Detection of Lung Diseases from Respiratory Audio Signals
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[38]  arXiv:2512.00451 [pdf, ps, other]
Title: STCTS: Generative Semantic Compression for Ultra-Low Bitrate Speech via Explicit Text-Prosody-Timbre Decomposition
Comments: The complete source code and online speech reconstruction demo is publicly available at this https URL
Subjects: Sound (cs.SD); Multimedia (cs.MM)
[39]  arXiv:2512.00120 [pdf, ps, other]
Title: Art2Music: Generating Music for Art Images with Multi-modal Feeling Alignment
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[40]  arXiv:2512.00115 [pdf, ps, other]
Title: MoLT: Mixture of Layer-Wise Tokens for Efficient Audio-Visual Learning
Comments: 10 pages, 5 figures
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[41]  arXiv:2512.01443 (cross-list from cs.CL) [pdf, ps, other]
Title: MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification
Comments: 10 pages, 5 figures, 4 tables, LibriBrain Workshop, NeurIPS 2025
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD)
[42]  arXiv:2512.01428 (cross-list from eess.SP) [pdf, ps, other]
Title: Masked Symbol Modeling for Demodulation of Oversampled Baseband Communication Signals in Impulsive Noise-Dominated Channels
Authors: Oguz Bedir (1), Nurullah Sevim (1), Mostafa Ibrahim (2), Sabit Ekin (2 and 1) ((1) Electrical & Computer Engineering, Texas A&M University, USA, (2) Engineering Technology & Industrial Distribution, Texas A&M University, USA)
Comments: Accepted to the 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop on AI and ML for Next-Generation Wireless Communications and Networking (AI4NextG), non-archival
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD)
[43]  arXiv:2512.01267 (cross-list from cs.MM) [pdf, ps, other]
Title: ZO-ASR: Zeroth-Order Fine-Tuning of Speech Foundation Models without Back-Propagation
Comments: 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Subjects: Multimedia (cs.MM); Sound (cs.SD)
[44]  arXiv:2512.00883 (cross-list from cs.MM) [pdf, ps, other]
Title: Audio-Visual World Models: Towards Multisensory Imagination in Sight and Sound
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)