Sound

Authors and titles for recent submissions, skipping first 7

[ total of 51 entries: 1-10 | 8-17 | 18-27 | 28-37 | 38-47 | 48-51 ]
[ showing 10 entries per page: fewer | more | all ]

Tue, 9 Dec 2025 (continued, showing 10 of 19 entries)

[8] arXiv:2512.06380

Title: Protecting Bystander Privacy via Selective Hearing in LALMs

Authors: Xiao Zhan, Guangzhi Sun, Jose Such, Phil Woodland

Comments: Dataset: this https URL

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[9] arXiv:2512.06259

Title: Who Will Top the Charts? Multimodal Music Popularity Prediction via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling

Authors: Yash Choudhary, Preeti Rao, Pushpak Bhattacharyya

Comments: 8 pages

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[10] arXiv:2512.06041

Title: Technical Report of Nomi Team in the Environmental Sound Deepfake Detection Challenge 2026

Authors: Candy Olivia Mawalim, Haotian Zhang, Shogo Okada

Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11] arXiv:2512.06040

Title: Physics-Guided Deepfake Detection for Voice Authentication Systems

Authors: Alireza Mohammadi, Keshav Sood, Dhananjay Thiruvady, Asef Nazari

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[12] arXiv:2512.06022

Title: DreamFoley: Scalable VLMs for High-Fidelity Video-to-Audio Generation

Authors: Fu Li, Weichao Zhao, You Li, Zhichao Zhou, Dongliang He

Comments: 10 pages; Bytedance

Subjects: Sound (cs.SD); Multimedia (cs.MM)
[13] arXiv:2512.07741 (cross-list from cs.LG)

Title: A multimodal Bayesian Network for symptom-level depression and anxiety prediction from voice and speech data

Authors: Agnes Norbury, George Fairs, Alexandra L. Georgescu, Matthew M. Nour, Emilia Molimpakis, Stefano Goria

Subjects: Machine Learning (cs.LG); Sound (cs.SD)
[14] arXiv:2512.07351 (cross-list from cs.CV)

Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection

Authors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami Azam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[15] arXiv:2512.07226 (cross-list from eess.AS)

Title: Unsupervised Single-Channel Audio Separation with Diffusion Source Priors

Authors: Runwu Shi, Chang Li, Jiang Wang, Rui Zhang, Nabeela Khan, Benjamin Yen, Takeshi Ashizawa, Kazuhiro Nakadai

Comments: 15 pages, 31 figures, accepted by The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[16] arXiv:2512.07209 (cross-list from cs.MM)

Title: Coherent Audio-Visual Editing via Conditional Audio Generation Following Video Edits

Authors: Masato Ishii, Akio Hayakawa, Takashi Shibuya, Yuki Mitsufuji

Subjects: Multimedia (cs.MM); Machine Learning (cs.LG); Sound (cs.SD)
[17] arXiv:2512.06417 (cross-list from cs.LG)

Title: Hankel-FNO: Fast Underwater Acoustic Charting Via Physics-Encoded Fourier Neural Operator

Authors: Yifan Sun (1), Lei Cheng (1), Jianlong Li (1), Peter Gerstoft (2) ((1) College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China, (2) Scripps Institution of Oceanography, University of California San Diego, La Jolla, USA)

Subjects: Machine Learning (cs.LG); Sound (cs.SD)


