Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 425

[ total of 778 entries: 1-10 | ... | 396-405 | 406-415 | 416-425 | 426-435 | 436-445 | 446-455 | 456-465 | ... | 776-778 ]
[ showing 10 entries per page: fewer | more | all ]

Wed, 3 Dec 2025 (continued, showing 10 of 141 entries)

[426]  arXiv:2512.02576 [pdf, ps, other]
Title: Co-speech Gesture Video Generation via Motion-Based Graph Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427]  arXiv:2512.02566 [pdf, ps, other]
Title: From Panel to Pixel: Zoom-In Vision-Language Pretraining from Biomedical Scientific Literature
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[428]  arXiv:2512.02554 [pdf, ps, other]
Title: OmniPerson: Unified Identity-Preserving Pedestrian Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429]  arXiv:2512.02541 [pdf, ps, other]
Title: AVGGT: Rethinking Global Attention for Accelerating VGGT
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430]  arXiv:2512.02536 [pdf, ps, other]
Title: WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[431]  arXiv:2512.02520 [pdf, ps, other]
Title: On the Problem of Consistent Anomalies in Zero-Shot Anomaly Detection
Authors: Tai Le-Gia
Comments: PhD Dissertation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[432]  arXiv:2512.02517 [pdf, ps, other]
Title: SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[433]  arXiv:2512.02512 [pdf, ps, other]
Title: Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling
Comments: Accepted as a Tiny Paper at the 13th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2025), IIT Mandi, India. 3 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[434]  arXiv:2512.02505 [pdf, ps, other]
Title: GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435]  arXiv:2512.02498 [pdf, ps, other]
Title: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)