Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 524

Wed, 24 Dec 2025 (continued, showing 50 of 86 entries)

[525]  arXiv:2512.20557 [pdf, ps, other]
Title: Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526]  arXiv:2512.20556 [pdf, ps, other]
Title: Multi-Grained Text-Guided Image Fusion for Multi-Exposure and Multi-Focus Scenarios
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527]  arXiv:2512.20538 [pdf, ps, other]
Title: AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment
Comments: 18 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528]  arXiv:2512.20531 [pdf, ps, other]
Title: SirenPose: Dynamic Scene Reconstruction via Geometric Supervision
Comments: Under submission
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529]  arXiv:2512.20501 [pdf, ps, other]
Title: Bridging Modalities and Transferring Knowledge: Enhanced Multimodal Understanding and Recognition
Authors: Gorjan Radevski
Comments: Ph.D. manuscript; Supervisors/Mentors: Marie-Francine Moens and Tinne Tuytelaars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530]  arXiv:2512.20487 [pdf, ps, other]
Title: Multi-temporal Adaptive Red-Green-Blue and Long-Wave Infrared Fusion for You Only Look Once-Based Landmine Detection from Unmanned Aerial Systems
Comments: 21 pages with 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531]  arXiv:2512.20479 [pdf, ps, other]
Title: UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images
Comments: 22 pages, 25 figures, SIGGRAPH Asia 2025, Conference Paper
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532]  arXiv:2512.20451 [pdf, ps, other]
Title: Beyond Motion Pattern: An Empirical Study of Physical Forces for Human Motion Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533]  arXiv:2512.20432 [pdf, ps, other]
Title: High Dimensional Data Decomposition for Anomaly Detection of Textured Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[534]  arXiv:2512.20431 [pdf, ps, other]
Title: Skin Lesion Classification Using a Soft Voting Ensemble of Convolutional Neural Networks
Comments: Authors' version of the paper published in proceedings of ECCE, DOI: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[535]  arXiv:2512.20417 [pdf, ps, other]
Title: Chain-of-Anomaly Thoughts with Large Vision-Language Models
Comments: 2 pages, 3 figures, 1 table. Accepted for RECPAD 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[536]  arXiv:2512.20409 [pdf, ps, other]
Title: DETACH: Decomposed Spatio-Temporal Alignment for Exocentric Video and Ambient Sensors with Staged Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[537]  arXiv:2512.20377 [pdf, ps, other]
Title: SmartSplat: Feature-Smart Gaussians for Scalable Compression of Ultra-High-Resolution Images
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538]  arXiv:2512.20376 [pdf, ps, other]
Title: Linking Faces and Voices Across Languages: Insights from the FAME 2026 Challenge
Comments: Accepted at ICASSP 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539]  arXiv:2512.20362 [pdf, ps, other]
Title: CRAFT: Continuous Reasoning and Agentic Feedback Tuning for Multimodal Text-to-Image Generation
Comments: 37 pages, 42 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540]  arXiv:2512.20340 [pdf, ps, other]
Title: The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[541]  arXiv:2512.20296 [pdf, ps, other]
Title: TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[542]  arXiv:2512.20288 [pdf, ps, other]
Title: UbiQVision: Quantifying Uncertainty in XAI for Image Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[543]  arXiv:2512.20260 [pdf, ps, other]
Title: $\text{D}^{3}$ETOR: Debate-Enhanced Pseudo Labeling and Frequency-Aware Progressive Debiasing for Weakly-Supervised Camouflaged Object Detection with Scribble Annotations
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[544]  arXiv:2512.20257 [pdf, ps, other]
Title: LADLE-MM: Limited Annotation based Detector with Learned Ensembles for Multimodal Misinformation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545]  arXiv:2512.20255 [pdf, ps, other]
Title: BiCoR-Seg: Bidirectional Co-Refinement Framework for High-Resolution Remote Sensing Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546]  arXiv:2512.20251 [pdf, ps, other]
Title: Degradation-Aware Metric Prompting for Hyperspectral Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[547]  arXiv:2512.20236 [pdf, ps, other]
Title: IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing
Comments: Accepted in ICDAR 2025 (Oral Presentation) - Best Student Paper Runner-Up Award
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548]  arXiv:2512.20217 [pdf, ps, other]
Title: LiteFusion: Taming 3D Object Detectors from Vision-Based to Multi-Modal with Minimal Adaptation
Comments: 13 pages, 9 figures, 8 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549]  arXiv:2512.20213 [pdf, ps, other]
Title: JDPNet: A Network Based on Joint Degradation Processing for Underwater Image Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550]  arXiv:2512.20194 [pdf, ps, other]
Title: Generative Latent Coding for Ultra-Low Bitrate Image Compression
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[551]  arXiv:2512.20174 [pdf, ps, other]
Title: Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark
Comments: CVPR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Information Retrieval (cs.IR)
[552]  arXiv:2512.20157 [pdf, ps, other]
Title: AMoE: Agglomerative Mixture-of-Experts Vision Foundation Model
Comments: 17 pages, 8 figures, 11 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553]  arXiv:2512.20153 [pdf, ps, other]
Title: CoDi -- an exemplar-conditioned diffusion model for low-shot counting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554]  arXiv:2512.20148 [pdf, ps, other]
Title: Enhancing annotations for 5D apple pose estimation through 3D Gaussian Splatting (3DGS)
Comments: 33 pages, excluding appendices. 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[555]  arXiv:2512.20128 [pdf, ps, other]
Title: milliMamba: Specular-Aware Human Pose Estimation via Dual mmWave Radar with Multi-Frame Mamba Fusion
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556]  arXiv:2512.20120 [pdf, ps, other]
Title: HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[557]  arXiv:2512.20117 [pdf, ps, other]
Title: DDAVS: Disentangled Audio Semantics and Delayed Bidirectional Alignment for Audio-Visual Segmentation
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[558]  arXiv:2512.20113 [src]
Title: Multi Modal Attention Networks with Uncertainty Quantification for Automated Concrete Bridge Deck Delamination Detection
Comments: the authors are going to substantially edit the paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[559]  arXiv:2512.20107 [pdf, ps, other]
Title: UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis
Comments: Accepted to NeurIPS 2025. The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[560]  arXiv:2512.20105 [pdf, ps, other]
Title: LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561]  arXiv:2512.20104 [pdf, ps, other]
Title: Effect of Activation Function and Model Optimizer on the Performance of Human Activity Recognition System Using Various Deep Learning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562]  arXiv:2512.20088 [pdf, ps, other]
Title: Item Region-based Style Classification Network (IRSN): A Fashion Style Classifier Based on Domain Knowledge of Fashion Experts
Comments: This is a pre-print of an article published in Applied Intelligence. The final authenticated version is available online at: this https URL
Journal-ref: Applied Intelligence, Vol. 54, pp. 6197-6209 (2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[563]  arXiv:2512.20070 [pdf, ps, other]
Title: Progressive Learned Image Compression for Machine Perception
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[564]  arXiv:2512.20042 [pdf, ps, other]
Title: Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval
Comments: 7 pages, 5 figures. System description for the EVENTA Grand Challenge (Track 1) at ACM MM'25
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[565]  arXiv:2512.20033 [pdf, ps, other]
Title: FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566]  arXiv:2512.20032 [pdf, ps, other]
Title: VALLR-Pin: Uncertainty-Factorized Visual Speech Recognition for Mandarin with Pinyin Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567]  arXiv:2512.20029 [pdf, ps, other]
Title: $\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568]  arXiv:2512.20026 [pdf, ps, other]
Title: MAPI-GNN: Multi-Activation Plane Interaction Graph Neural Network for Multimodal Medical Diagnosis
Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence 40 (AAAI-26)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569]  arXiv:2512.20025 [pdf, ps, other]
Title: A Contextual Analysis of Driver-Facing and Dual-View Video Inputs for Distraction Detection in Naturalistic Driving Environments
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570]  arXiv:2512.20013 [pdf, ps, other]
Title: SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571]  arXiv:2512.20011 [pdf, ps, other]
Title: PaveSync: A Unified and Comprehensive Dataset for Pavement Distress Analysis and Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572]  arXiv:2512.20000 [pdf, ps, other]
Title: Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models
Comments: GitHub page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[573]  arXiv:2512.19990 [pdf, ps, other]
Title: A Dual-Branch Local-Global Framework for Cross-Resolution Land Cover Mapping
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574]  arXiv:2512.19989 [pdf, ps, other]
Title: A Novel CNN Gradient Boosting Ensemble for Guava Disease Detection
Comments: Accepted at IEEE ICCIT 2025. This is the author accepted manuscript
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)