Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 553

[ total of 603 entries: 1-50 | ... | 404-453 | 454-503 | 504-553 | 554-603 ]

Wed, 24 Dec 2025 (continued, showing last 50 of 86 entries)

[554]  arXiv:2512.20148 [pdf, ps, other]
Title: Enhancing annotations for 5D apple pose estimation through 3D Gaussian Splatting (3DGS)
Comments: 33 pages, excluding appendices. 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[555]  arXiv:2512.20128 [pdf, ps, other]
Title: milliMamba: Specular-Aware Human Pose Estimation via Dual mmWave Radar with Multi-Frame Mamba Fusion
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556]  arXiv:2512.20120 [pdf, ps, other]
Title: HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[557]  arXiv:2512.20117 [pdf, ps, other]
Title: DDAVS: Disentangled Audio Semantics and Delayed Bidirectional Alignment for Audio-Visual Segmentation
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[558]  arXiv:2512.20113 [src]
Title: Multi Modal Attention Networks with Uncertainty Quantification for Automated Concrete Bridge Deck Delamination Detection
Comments: the authors are going to substantially edit the paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[559]  arXiv:2512.20107 [pdf, ps, other]
Title: UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis
Comments: Accepted to NeurIPS 2025. The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[560]  arXiv:2512.20105 [pdf, ps, other]
Title: LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561]  arXiv:2512.20104 [pdf, ps, other]
Title: Effect of Activation Function and Model Optimizer on the Performance of Human Activity Recognition System Using Various Deep Learning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562]  arXiv:2512.20088 [pdf, ps, other]
Title: Item Region-based Style Classification Network (IRSN): A Fashion Style Classifier Based on Domain Knowledge of Fashion Experts
Comments: This is a pre-print of an article published in Applied Intelligence. The final authenticated version is available online at: this https URL
Journal-ref: Applied Intelligence, Vol. 54, pp. 6197-6209 (2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[563]  arXiv:2512.20070 [pdf, ps, other]
Title: Progressive Learned Image Compression for Machine Perception
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[564]  arXiv:2512.20042 [pdf, ps, other]
Title: Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval
Comments: 7 pages, 5 figures. System description for the EVENTA Grand Challenge (Track 1) at ACM MM'25
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[565]  arXiv:2512.20033 [pdf, ps, other]
Title: FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566]  arXiv:2512.20032 [pdf, ps, other]
Title: VALLR-Pin: Uncertainty-Factorized Visual Speech Recognition for Mandarin with Pinyin Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567]  arXiv:2512.20029 [pdf, ps, other]
Title: $\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568]  arXiv:2512.20026 [pdf, ps, other]
Title: MAPI-GNN: Multi-Activation Plane Interaction Graph Neural Network for Multimodal Medical Diagnosis
Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence 40 (AAAI-26)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569]  arXiv:2512.20025 [pdf, ps, other]
Title: A Contextual Analysis of Driver-Facing and Dual-View Video Inputs for Distraction Detection in Naturalistic Driving Environments
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570]  arXiv:2512.20013 [pdf, ps, other]
Title: SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571]  arXiv:2512.20011 [pdf, ps, other]
Title: PaveSync: A Unified and Comprehensive Dataset for Pavement Distress Analysis and Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572]  arXiv:2512.20000 [pdf, ps, other]
Title: Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models
Comments: GitHub page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[573]  arXiv:2512.19990 [pdf, ps, other]
Title: A Dual-Branch Local-Global Framework for Cross-Resolution Land Cover Mapping
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574]  arXiv:2512.19989 [pdf, ps, other]
Title: A Novel CNN Gradient Boosting Ensemble for Guava Disease Detection
Comments: Accepted at IEEE ICCIT 2025. This is the author accepted manuscript
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[575]  arXiv:2512.19982 [pdf, ps, other]
Title: WSD-MIL: Window Scale Decay Multiple Instance Learning for Whole Slide Image Classification
Authors: Le Feng, Li Xiao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576]  arXiv:2512.19954 [pdf, ps, other]
Title: HistoWAS: A Pathomics Framework for Large-Scale Feature-Wide Association Studies of Tissue Topology and Patient Outcomes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[577]  arXiv:2512.19949 [pdf, ps, other]
Title: How Much 3D Do Video Foundation Models Encode?
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[578]  arXiv:2512.19943 [pdf, ps, other]
Title: SE360: Semantic Edit in 360$^\circ$ Panoramas via Hierarchical Data Construction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[579]  arXiv:2512.19941 [pdf, ps, other]
Title: Block-Recurrent Dynamics in Vision Transformers
Comments: 25 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[580]  arXiv:2512.19934 [pdf, ps, other]
Title: Vehicle-centric Perception via Multimodal Structured Pre-training
Comments: Journal extension of VehicleMAE (AAAI 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[581]  arXiv:2512.19928 [pdf, ps, other]
Title: Unified Brain Surface and Volume Registration
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582]  arXiv:2512.19918 [pdf, ps, other]
Title: Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583]  arXiv:2512.19871 [pdf, ps, other]
Title: HyGE-Occ: Hybrid View-Transformation with 3D Gaussian and Edge Priors for 3D Panoptic Occupancy Prediction
Comments: 11 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[584]  arXiv:2512.19850 [pdf, ps, other]
Title: RANSAC Scoring Functions: Analysis and Reality Check
Authors: A. Shekhovtsov
Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[585]  arXiv:2512.19823 [pdf, ps, other]
Title: Learning to Refocus with Video Diffusion Models
Comments: Code and data are available at this https URL. SIGGRAPH Asia 2025, Dec. 2025
Journal-ref: Proceedings of the SIGGRAPH Asia 2025, pp. 1-11, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[586]  arXiv:2512.19817 [pdf, ps, other]
Title: Generating the Past, Present and Future from a Motion-Blurred Image
Comments: Code and data are available at this https URL
Journal-ref: ACM Trans. Graph. (SIGGRAPH Asia 2025), vol. 44, no. 6, pp. 1-15, Dec. 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[587]  arXiv:2512.19711 [pdf, ps, other]
Title: PHANTOM: PHysical ANamorphic Threats Obstructing Connected Vehicle Mobility
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[588]  arXiv:2512.20618 (cross-list from cs.AI) [pdf, ps, other]
Title: LongVideoAgent: Multi-Agent Reasoning with Long Videos
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
[589]  arXiv:2512.20595 (cross-list from cs.CL) [pdf, ps, other]
Title: Cube Bench: A Benchmark for Spatial Visual Reasoning in MLLMs
Comments: 27 pages, 5 figures, 9 tables. Cube available at this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[590]  arXiv:2512.20464 (cross-list from physics.optics) [pdf, ps, other]
Title: Snapshot 3D image projection using a diffractive decoder
Comments: 22 Pages, 8 Figures
Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Applied Physics (physics.app-ph)
[591]  arXiv:2512.20436 (cross-list from eess.IV) [pdf, ps, other]
Title: Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[592]  arXiv:2512.20420 (cross-list from cs.LG) [pdf, ps, other]
Title: Simplifying Multi-Task Architectures Through Task-Specific Normalization
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[593]  arXiv:2512.20387 (cross-list from cs.AI) [pdf, ps, other]
Title: Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems
Comments: 10 pages, 9 figures
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[594]  arXiv:2512.20374 (cross-list from eess.IV) [pdf, ps, other]
Title: CLIP Based Region-Aware Feature Fusion for Automated BBPS Scoring in Colonoscopy Images
Comments: 12 pages, 9 figures, BMVC 2025 submission
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[595]  arXiv:2512.20350 (cross-list from cs.LG) [pdf, ps, other]
Title: Field-Space Attention for Structure-Preserving Earth System Transformers
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Mathematical Physics (math-ph)
[596]  arXiv:2512.20299 (cross-list from cs.RO) [pdf, ps, other]
Title: KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[597]  arXiv:2512.20249 (cross-list from cs.LG) [pdf, ps, other]
Title: Unified Multimodal Brain Decoding via Cross-Subject Soft-ROI Fusion
Authors: Xuanyu Hu
Comments: 15 pages, 2 figures, 4 tables. Submitted to ICPR 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[598]  arXiv:2512.20233 (cross-list from cs.LG) [pdf, ps, other]
Title: How I Met Your Bias: Investigating Bias Amplification in Diffusion Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[599]  arXiv:2512.20145 (cross-list from cs.CL) [pdf, ps, other]
Title: Retrieval-augmented Prompt Learning for Pre-trained Foundation Models
Comments: IEEE/ACM Transactions on Audio, Speech and Language Processing
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[600]  arXiv:2512.20129 (cross-list from cs.HC) [pdf, ps, other]
Title: Dreamcrafter: Immersive Editing of 3D Radiance Fields Through Flexible, Generative Inputs and Outputs
Comments: CHI 2025, Project page: this https URL
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[601]  arXiv:2512.20056 (cross-list from cs.AI) [pdf, ps, other]
Title: Towards Generative Location Awareness for Disaster Response: A Probabilistic Cross-view Geolocalization Approach
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[602]  arXiv:2512.19731 (cross-list from cs.LG) [pdf, ps, other]
Title: Exploring Deep-to-Shallow Transformable Neural Networks for Intelligent Embedded Systems
Comments: Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[603]  arXiv:2512.18099 (cross-list from eess.AS) [pdf, ps, other]
Title: SAM Audio: Segment Anything in Audio
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV)