
Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 694

[ total of 759 entries: 1-100 | ... | 395-494 | 495-594 | 595-694 | 695-759 ]

Wed, 3 Dec 2025 (continued, showing last 65 of 141 entries)

[695]  arXiv:2512.02497 [pdf, ps, other]
Title: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation
Comments: 45 pages, 18 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696]  arXiv:2512.02496 [pdf, ps, other]
Title: Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration
Comments: 16 pages, 9 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[697]  arXiv:2512.02492 [pdf, ps, other]
Title: YingVideo-MV: Music-Driven Multi-Stage Video Generation
Comments: 18 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698]  arXiv:2512.02487 [pdf, ps, other]
Title: Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[699]  arXiv:2512.02485 [pdf, ps, other]
Title: UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[700]  arXiv:2512.02482 [pdf, ps, other]
Title: G-SHARP: Gaussian Surgical Hardware Accelerated Real-time Pipeline
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[701]  arXiv:2512.02473 [pdf, ps, other]
Title: WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[702]  arXiv:2512.02469 [pdf, ps, other]
Title: TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution
Comments: Accepted in AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703]  arXiv:2512.02458 [pdf, ps, other]
Title: Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[704]  arXiv:2512.02457 [pdf, ps, other]
Title: Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation
Comments: Project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[705]  arXiv:2512.02456 [pdf, ps, other]
Title: See, Think, Learn: A Self-Taught Multimodal Reasoner
Comments: Winter Conference on Applications of Computer Vision 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[706]  arXiv:2512.02453 [pdf, ps, other]
Title: ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[707]  arXiv:2512.02450 [pdf, ps, other]
Title: HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild
Comments: NeurIPS 2025 (Datasets and Benchmarks Track) Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[708]  arXiv:2512.02448 [pdf, ps, other]
Title: nuScenes Revisited: Progress and Challenges in Autonomous Driving
Comments: 18 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[709]  arXiv:2512.02447 [pdf, ps, other]
Title: Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[710]  arXiv:2512.02441 [pdf, ps, other]
Title: Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[711]  arXiv:2512.02438 [pdf, ps, other]
Title: Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[712]  arXiv:2512.02437 [pdf, ps, other]
Title: LightHCG: a Lightweight yet powerful HSIC Disentanglement based Causal Glaucoma Detection Model framework
Authors: Daeyoung Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[713]  arXiv:2512.02425 [pdf, ps, other]
Title: WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[714]  arXiv:2512.02423 [pdf, ps, other]
Title: GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
Comments: 26 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[715]  arXiv:2512.02421 [pdf, ps, other]
Title: Generalizing Vision-Language Models with Dedicated Prompt Guidance
Comments: Accepted to AAAI26
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[716]  arXiv:2512.02413 [pdf, ps, other]
Title: MitUNet: Enhancing Floor Plan Recognition using a Hybrid Mix-Transformer and U-Net Architecture
Comments: 9 pages, 4 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[717]  arXiv:2512.02405 [pdf, ps, other]
Title: WISE: Weighted Iterative Society-of-Experts for Robust Multimodal Multi-Agent Debate
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[718]  arXiv:2512.02400 [pdf, ps, other]
Title: Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[719]  arXiv:2512.02395 [pdf, ps, other]
Title: Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch
Comments: 21 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[720]  arXiv:2512.02394 [pdf, ps, other]
Title: Reproducing and Extending RaDelft 4D Radar with Camera-Assisted Labels
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[721]  arXiv:2512.02392 [pdf, ps, other]
Title: From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[722]  arXiv:2512.02375 [pdf, ps, other]
Title: On-the-fly Feedback SfM: Online Explore-and-Exploit UAV Photogrammetry with Incremental Mesh Quality-Aware Indicator and Predictive Path Planning
Comments: This work has been submitted to IEEE GRSM Journal for consideration. Copyright will be transferred once it is accepted
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[723]  arXiv:2512.02369 [pdf, ps, other]
Title: SAGE: Style-Adaptive Generalization for Privacy-Constrained Semantic Segmentation Across Domains
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[724]  arXiv:2512.02368 [pdf, ps, other]
Title: Multi-Domain Enhanced Map-Free Trajectory Prediction with Selective Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[725]  arXiv:2512.02364 [pdf, ps, other]
Title: Tackling Tuberculosis: A Comparative Dive into Machine Learning for Tuberculosis Detection
Journal-ref: Vol. 6, No. 1 (2024), Minnesota Undergraduate Research & Academic Journal (MURAJ)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[726]  arXiv:2512.02361 [pdf, ps, other]
Title: VACoT: Rethinking Visual Data Augmentation with VLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[727]  arXiv:2512.02359 [pdf, ps, other]
Title: WSCF-MVCC: Weakly-supervised Calibration-free Multi-view Crowd Counting
Comments: PRCV 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[728]  arXiv:2512.02351 [pdf, ps, other]
Title: Understanding and Harnessing Sparsity in Unified Multimodal Models
Comments: 13 pages, 13 figures, 8 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[729]  arXiv:2512.02344 [pdf, ps, other]
Title: A multi-weight self-matching visual explanation for CNNs on SAR images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[730]  arXiv:2512.02341 [pdf, ps, other]
Title: TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[731]  arXiv:2512.02339 [pdf, ps, other]
Title: Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
Comments: Accepted at NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[732]  arXiv:2512.02290 [pdf, ps, other]
Title: Enhancing Cross Domain SAR Oil Spill Segmentation via Morphological Region Perturbation and Synthetic Label-to-SAR Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[733]  arXiv:2512.02273 [pdf, ps, other]
Title: Progressive Image Restoration via Text-Conditioned Video Generation
Comments: First two authors contributed equally to this work. IEEE ICNC Accepted
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[734]  arXiv:2512.02268 [pdf, ps, other]
Title: Spatiotemporal Pyramid Flow Matching for Climate Emulation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[735]  arXiv:2512.02258 [pdf, ps, other]
Title: Exploring the Potentials of Spiking Neural Networks for Image Deraining
Comments: Accepted By AAAI2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[736]  arXiv:2512.02231 [pdf, ps, other]
Title: See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models
Comments: preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[737]  arXiv:2512.02224 [pdf, ps, other]
Title: Towards Unified Video Quality Assessment
Comments: 8 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[738]  arXiv:2512.02198 [pdf, ps, other]
Title: Multifractal Recalibration of Neural Networks for Medical Imaging Segmentation
Comments: 30 pages, 9 figures, journal paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[739]  arXiv:2512.02188 [pdf, ps, other]
Title: RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation
Comments: Submitted to Medical Image Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[740]  arXiv:2512.02172 [pdf, ps, other]
Title: SplatSuRe: Selective Super-Resolution for Multi-view Consistent 3D Gaussian Splatting
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[741]  arXiv:2512.02162 [pdf, ps, other]
Title: Mapping of Lesion Images to Somatic Mutations
Authors: Rahul Mehta
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[742]  arXiv:2512.02161 [pdf, ps, other]
Title: FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges
Comments: Accepted to NeurIPS 2025 Datasets and Benchmarks Track
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[743]  arXiv:2512.02152 [pdf, ps, other]
Title: Context-Enriched Contrastive Loss: Enhancing Presentation of Inherent Sample Connections in Contrastive Learning Framework
Comments: 13 pages, 7 figures. Published in IEEE Transactions on Multimedia. Code available at: this https URL
Journal-ref: IEEE Transactions on Multimedia, Vol. 27, pp. 429-441, December 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[744]  arXiv:2512.02055 [pdf, ps, other]
Title: Leveraging AI multimodal geospatial foundation models for improved near-real-time flood mapping at a global scale
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[745]  arXiv:2512.03028 (cross-list from cs.GR) [pdf, ps, other]
Title: SMP: Reusable Score-Matching Motion Priors for Physics-Based Character Control
Comments: 14 pages, 9 figures
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[746]  arXiv:2512.02920 (cross-list from cs.LG) [pdf, ps, other]
Title: Learning Multimodal Embeddings for Traffic Accident Prediction and Causal Estimation
Comments: 17 pages. To appear in KDD'26 Datasets
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
[747]  arXiv:2512.02787 (cross-list from cs.RO) [pdf, ps, other]
Title: Diagnose, Correct, and Learn from Manipulation Failures via Visual Symbols
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[748]  arXiv:2512.02719 (cross-list from cs.CL) [pdf, ps, other]
Title: Emergent Bayesian Behaviour and Optimal Cue Combination in LLMs
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[749]  arXiv:2512.02651 (cross-list from cs.HC) [pdf, ps, other]
Title: Real-Time Multimodal Data Collection Using Smartwatches and Its Visualization in Education
Comments: Accepted in Technological Ecosystems for Enhancing Multiculturality (TEEM) 2025
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[750]  arXiv:2512.02636 (cross-list from cs.LG) [pdf, ps, other]
Title: Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[751]  arXiv:2512.02609 (cross-list from cs.RO) [pdf, ps, other]
Title: SAM2Grasp: Resolve Multi-modal Grasping via Prompt-conditioned Temporal Action Prediction
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[752]  arXiv:2512.02340 (cross-list from cs.AI) [pdf, ps, other]
Title: Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective
Comments: 23 pages, 37 figures
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[753]  arXiv:2512.02306 (cross-list from cs.AI) [pdf, ps, other]
Title: OmniGuard: Unified Omni-Modal Guardrails with Deliberate Reasoning
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[754]  arXiv:2512.02293 (cross-list from cs.RO) [pdf, ps, other]
Title: VIGS-SLAM: Visual Inertial Gaussian Splatting SLAM
Comments: Project page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[755]  arXiv:2512.02280 (cross-list from cs.AI) [pdf, ps, other]
Title: Bridging the Gap: Toward Cognitive Autonomy in Artificial Intelligence
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[756]  arXiv:2512.02243 (cross-list from cs.CR) [pdf, ps, other]
Title: PhishSnap: Image-Based Phishing Detection Using Perceptual Hashing
Comments: IEEE Standard Formatting, 3 pages, 3 figures
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[757]  arXiv:2512.02143 (cross-list from cs.GR) [pdf, ps, other]
Title: CoatFusion: Controllable Material Coating in Images
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[758]  arXiv:2512.02088 (cross-list from eess.IV) [pdf, ps, other]
Title: Comparing Baseline and Day-1 Diffusion MRI Using Multimodal Deep Embeddings for Stroke Outcome Prediction
Comments: 5 pages, 5 figures, 2 tables
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[759]  arXiv:2512.02062 (cross-list from cs.CR) [pdf, ps, other]
Title: Superpixel Attack: Enhancing Black-box Adversarial Attack with Image-driven Division Areas
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
