Sound

Authors and titles for recent submissions, skipping first 37

[ total of 44 entries: 1-5 | ... | 23-27 | 28-32 | 33-37 | 38-42 | 43-44 ]

Tue, 2 Dec 2025 (continued, showing 5 of 12 entries)

[38] arXiv:2512.00451

Title: STCTS: Generative Semantic Compression for Ultra-Low Bitrate Speech via Explicit Text-Prosody-Timbre Decomposition

Authors: Siyu Wang, Haitao Li, Donglai Zhu

Comments: The complete source code and an online speech reconstruction demo are publicly available at this https URL

Subjects: Sound (cs.SD); Multimedia (cs.MM)

[39] arXiv:2512.00120

Title: Art2Music: Generating Music for Art Images with Multi-modal Feeling Alignment

Authors: Jiaying Hong, Ting Zhu, Thanet Markchom, Huizhi Liang

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)

[40] arXiv:2512.00115

Title: MoLT: Mixture of Layer-Wise Tokens for Efficient Audio-Visual Learning

Authors: Kyeongha Rho, Hyeongkeun Lee, Jae Won Cho, Joon Son Chung

Comments: 10 pages, 5 figures

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)

[41] arXiv:2512.01443 (cross-list from cs.CL)

Title: MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification

Authors: Xabier de Zuazo, Ibon Saratxaga, Eva Navas

Comments: 10 pages, 5 figures, 4 tables, LibriBrain Workshop, NeurIPS 2025

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD)

[42] arXiv:2512.01428 (cross-list from eess.SP)

Title: Masked Symbol Modeling for Demodulation of Oversampled Baseband Communication Signals in Impulsive Noise-Dominated Channels

Authors: Oguz Bedir (1), Nurullah Sevim (1), Mostafa Ibrahim (2), Sabit Ekin (2 and 1) ((1) Electrical & Computer Engineering, Texas A&M University, USA, (2) Engineering Technology & Industrial Distribution, Texas A&M University, USA)

Comments: Accepted to the 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop on AI and ML for Next-Generation Wireless Communications and Networking (AI4NextG), non-archival

Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Sound (cs.SD)
