We gratefully acknowledge support from
the Simons Foundation and member institutions.

Sound

Authors and titles for recent submissions, skipping first 4

[ total of 47 entries: 1-10 | 5-14 | 15-24 | 25-34 | 35-44 | 45-47 ]
[ showing 10 entries per page: fewer | more | all ]

Wed, 10 Dec 2025 (continued, showing last 4 of 8 entries)

[5]  arXiv:2512.08006 [pdf, ps, other]
Title: Beyond Unified Models: A Service-Oriented Approach to Low Latency, Context Aware Phonemization for Real Time TTS
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[6]  arXiv:2512.07872 [pdf, ps, other]
Title: LocaGen: Sub-Sample Time-Delay Learning for Beam Localization
Comments: 7 pages
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[7]  arXiv:2512.07845 [pdf, ps, other]
Title: AudioScene: Integrating Object-Event Audio into 3D Scenes
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
[8]  arXiv:2512.08282 (cross-list from cs.CV) [pdf, ps, other]
Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Tue, 9 Dec 2025 (showing first 6 of 19 entries)

[9]  arXiv:2512.07627 [pdf, ps, other]
Title: Incorporating Structure and Chord Constraints in Symbolic Transformer-based Melodic Harmonization
Comments: Proceedings of the 6th Conference on AI Music Creativity (AIMC 2025), Brussels, Belgium, September 10th-12th
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Symbolic Computation (cs.SC)
[10]  arXiv:2512.07352 [pdf, ps, other]
Title: MultiAPI Spoof: A Multi-API Dataset and Local-Attention Network for Speech Anti-spoofing Detection
Subjects: Sound (cs.SD)
[11]  arXiv:2512.07168 [pdf, ps, other]
Title: JEPA as a Neural Tokenizer: Learning Robust Speech Representations with Density Adaptive Attention
Comments: UniReps: Unifying Representations in Neural Models (NeurIPS 2025 Workshop)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[12]  arXiv:2512.07005 [pdf, ps, other]
Title: Multi-Accent Mandarin Dry-Vocal Singing Dataset: Benchmark for Singing Accent Recognition
Comments: Accepted by ACMMM 2025
Journal-ref: Proceedings of the 33rd ACM International Conference on Multimedia (ACMMM 2025), Pages 12714-12721, October 27, 2025. Dublin, Ireland
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[13]  arXiv:2512.06999 [pdf, ps, other]
Title: Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model
Comments: Accepted to ACMMM 2025 oral
Journal-ref: Proceedings of the 33rd ACM International Conference on Multimedia (ACMMM 2025), Pages 12227-12236
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI)
[14]  arXiv:2512.06890 [pdf, ps, other]
Title: What Needs to be Known in Order to Perform a Meaningful Scientific Comparison Between Animal Communications and Human Spoken Language
Authors: Roger K. Moore
Comments: 5 pages, 1 figure, Proc. Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR-24), Kos, Greece, 6 Sept. 2024
Journal-ref: Proc. Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR-24), pp 22-26, Kos, Greece, 6 Sept. 2024
Subjects: Sound (cs.SD)
[ total of 47 entries: 1-10 | 5-14 | 15-24 | 25-34 | 35-44 | 45-47 ]
[ showing 10 entries per page: fewer | more | all ]

Disable MathJax (What is MathJax?)

Links to: arXiv, form interface, find, cs, new, 2512, contact, help  (Access key information)