Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 118

[ total of 737 entries ]

Thu, 11 Dec 2025

[119]  arXiv:2512.09925 [pdf, ps, other]
Title: GAINS: Gaussian-based Inverse Rendering from Sparse Multi-View Captures
Comments: 23 pages, 18 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120]  arXiv:2512.09924 [pdf, ps, other]
Title: ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121]  arXiv:2512.09923 [pdf, ps, other]
Title: Splatent: Splatting Diffusion Latents for Novel View Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122]  arXiv:2512.09913 [pdf, ps, other]
Title: NordFKB: a fine-grained benchmark dataset for geospatial AI in Norway
Comments: 8 pages, 2 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[123]  arXiv:2512.09907 [pdf, ps, other]
Title: VisualActBench: Can VLMs See and Act like a Human?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124]  arXiv:2512.09874 [pdf, ps, other]
Title: Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[125]  arXiv:2512.09871 [pdf, ps, other]
Title: Diffusion Posterior Sampler for Hyperspectral Unmixing with Spectral Variability Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126]  arXiv:2512.09867 [pdf, ps, other]
Title: MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI
Comments: Dataset and Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[127]  arXiv:2512.09864 [pdf, ps, other]
Title: UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2512.09847 [pdf, ps, other]
Title: From Detection to Anticipation: Online Understanding of Struggles across Various Tasks and Activities
Comments: Accepted by WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2512.09824 [pdf, ps, other]
Title: Composing Concepts from Images and Videos via Concept-prompt Binding
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[130]  arXiv:2512.09814 [pdf, ps, other]
Title: DynaIP: Dynamic Image Prompt Adapter for Scalable Zero-shot Personalized Text-to-Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131]  arXiv:2512.09806 [pdf, ps, other]
Title: CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[132]  arXiv:2512.09801 [pdf, ps, other]
Title: Modality-Specific Enhancement and Complementary Fusion for Semi-Supervised Multi-Modal Brain Tumor Segmentation
Comments: 9 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2512.09792 [pdf, ps, other]
Title: FastPose-ViT: A Vision Transformer for Real-Time Spacecraft Pose Estimation
Comments: Accepted to WACV 2026. Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134]  arXiv:2512.09773 [pdf, other]
Title: Stylized Meta-Album: Group-bias injection with style transfer to study robustness against distribution shifts
Authors: Romain Mussard (UNIROUEN), Aurélien Gauffre (UGA), Ihsan Ullah, Thanh Gia Hieu Khuong (TAU, LISN), Massih-Reza Amini (UGA), Isabelle Guyon (TAU, LISN), Lisheng Sun-Hosoya (TAU, LISN)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2512.09700 [pdf, ps, other]
Title: LiM-YOLO: Less is More with Pyramid Level Shift and Normalized Auxiliary Branch for Ship Detection in Optical Remote Sensing Imagery
Comments: 16 pages, 8 figures, 9 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[136]  arXiv:2512.09687 [pdf, ps, other]
Title: Unconsciously Forget: Mitigating Memorization; Without Knowing What is being Memorized
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2512.09670 [pdf, ps, other]
Title: An Automated Tip-and-Cue Framework for Optimized Satellite Tasking and Visual Intelligence
Comments: Under review at IEEE Transactions on Geoscience and Remote Sensing (TGRS). 13 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Systems and Control (eess.SY)
[138]  arXiv:2512.09665 [pdf, ps, other]
Title: OxEnsemble: Fair Ensembles for Low-Data Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Machine Learning (cs.LG)
[139]  arXiv:2512.09663 [pdf, ps, other]
Title: IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2512.09646 [pdf, ps, other]
Title: VHOI: Controllable Video Generation of Human-Object Interactions from Sparse Trajectories via Motion Densification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2512.09644 [pdf, ps, other]
Title: Kaapana: A Comprehensive Open-Source Platform for Integrating AI in Medical Imaging Research Environments
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2512.09633 [pdf, ps, other]
Title: Benchmarking SAM2-based Trackers on FMOX
Journal-ref: 33rd International Conference on Artificial Intelligence and Cognitive Science (AICS 2025), December, 2025, Dublin, Ireland
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143]  arXiv:2512.09626 [pdf, ps, other]
Title: Beyond Sequences: A Benchmark for Atomic Hand-Object Interaction Using a Static RNN Encoder
Comments: Code available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144]  arXiv:2512.09617 [pdf, ps, other]
Title: FROMAT: Multiview Material Appearance Transfer via Few-Shot Self-Attention Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145]  arXiv:2512.09616 [pdf, ps, other]
Title: Rethinking Chain-of-Thought Reasoning for Videos
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[146]  arXiv:2512.09592 [pdf, ps, other]
Title: CS3D: An Efficient Facial Expression Recognition via Event Vision
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[147]  arXiv:2512.09583 [pdf, ps, other]
Title: UnReflectAnything: RGB-Only Highlight Removal by Rendering Synthetic Specular Supervision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2512.09580 [pdf, ps, other]
Title: Content-Adaptive Image Retouching Guided by Attribute-Based Text Representation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2512.09579 [pdf, ps, other]
Title: Hands-on Evaluation of Visual Transformers for Object Recognition and Detection
Journal-ref: 37th International Conference on Tools with Artificial Intelligence (ICTAI 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150]  arXiv:2512.09576 [pdf, ps, other]
Title: Seeing Soil from Space: Towards Robust and Scalable Remote Soil Nutrient Analysis
Authors: David Seu (1), Nicolas Longepe (2), Gabriel Cioltea (1), Erik Maidik (1), Calin Andrei (1) ((1) CO2 Angels, Cluj-Napoca, Romania, (2) European Space Agency Phi-Lab, Frascati, Italy)
Comments: 23 pages, 13 figures, 13 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Geophysics (physics.geo-ph)
[151]  arXiv:2512.09573 [pdf, ps, other]
Title: Investigate the Low-level Visual Perception in Vision-Language based Image Quality Assessment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:2512.09565 [pdf, ps, other]
Title: From Graphs to Gates: DNS-HyXNet, A Lightweight and Deployable Sequential Model for Real-Time DNS Tunnel Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153]  arXiv:2512.09555 [pdf, ps, other]
Title: Building Reasonable Inference for Vision-Language Models in Blind Image Quality Assessment
Comments: Accepted to the ICONIP (International Conference on Neural Information Processing), 2025
Journal-ref: Building Reasonable Inference for Vision-Language Models in Blind Image Quality Assessment. In: Taniguchi, T., et al. Neural Information Processing. ICONIP 2025. Lecture Notes in Computer Science, vol 16310. Springer, Singapore
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154]  arXiv:2512.09546 [pdf, ps, other]
Title: A Dual-Domain Convolutional Network for Hyperspectral Single-Image Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155]  arXiv:2512.09525 [pdf, ps, other]
Title: Masked Registration and Autoencoding of CT Images for Predictive Tibia Reconstruction
Comments: DGM4MICCAI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156]  arXiv:2512.09497 [pdf, ps, other]
Title: Gradient-Guided Learning Network for Infrared Small Target Detection
Comments: Accepted by GRSL 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[157]  arXiv:2512.09492 [pdf, ps, other]
Title: StateSpace-SSL: Linear-Time Self-supervised Learning for Plant Disease Detection
Comments: Accepted to AAAI workshop (AgriAI 2026)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2512.09489 [pdf, ps, other]
Title: MODA: The First Challenging Benchmark for Multispectral Object Detection in Aerial Images
Comments: 8 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2512.09477 [pdf, ps, other]
Title: Color encoding in Latent Space of Stable Diffusion Models
Comments: 6 pages, 8 figures, Color Imaging Conference 33
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160]  arXiv:2512.09471 [pdf, ps, other]
Title: Temporal-Spatial Tubelet Embedding for Cloud-Robust MSI Reconstruction using MSI-SAR Fusion: A Multi-Head Self-Attention Video Vision Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[161]  arXiv:2512.09463 [pdf, ps, other]
Title: Privacy-Preserving Computer Vision for Industry: Three Case Studies in Human-Centric Manufacturing
Comments: Accepted to the AAAI26 HCM workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[162]  arXiv:2512.09461 [pdf, ps, other]
Title: Cytoplasmic Strings Analysis in Human Embryo Time-Lapse Videos using Deep Learning Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[163]  arXiv:2512.09446 [pdf, ps, other]
Title: Defect-aware Hybrid Prompt Optimization via Progressive Tuning for Zero-Shot Multi-type Anomaly Detection and Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2512.09441 [pdf, ps, other]
Title: Representation Calibration and Uncertainty Guidance for Class-Incremental Learning based on Vision Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[165]  arXiv:2512.09435 [pdf, ps, other]
Title: UniPart: Part-Level 3D Generation with Unified 3D Geom-Seg Latents
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2512.09423 [pdf, ps, other]
Title: FunPhase: A Periodic Functional Autoencoder for Motion Generation via Phase Manifolds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167]  arXiv:2512.09422 [pdf, ps, other]
Title: InfoMotion: A Graph-Based Approach to Video Dataset Distillation for Echocardiography
Comments: Accepted at MICAD 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168]  arXiv:2512.09418 [pdf, ps, other]
Title: Label-free Motion-Conditioned Diffusion Model for Cardiac Ultrasound Synthesis
Comments: Accepted at MICAD 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2512.09417 [pdf, ps, other]
Title: DirectSwap: Mask-Free Cross-Identity Training and Benchmarking for Expression-Consistent Video Head Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170]  arXiv:2512.09407 [pdf, ps, other]
Title: Generative Point Cloud Registration
Comments: 14 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171]  arXiv:2512.09402 [pdf, ps, other]
Title: Wasserstein-Aligned Hyperbolic Multi-View Clustering
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2512.09393 [pdf, ps, other]
Title: Detection and Localization of Subdural Hematoma Using Deep Learning on Computed Tomography
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[173]  arXiv:2512.09383 [pdf, ps, other]
Title: Perception-Inspired Color Space Design for Photo White Balance Editing
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2512.09375 [pdf, ps, other]
Title: Log NeRF: Comparing Spaces for Learning Radiance Fields
Authors: Sihe Chen (Northeastern University), Luv Verma (Northeastern University), Bruce A. Maxwell (Northeastern University)
Comments: The 36th British Machine Vision Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[175]  arXiv:2512.09373 [pdf, ps, other]
Title: FUSER: Feed-Forward MUltiview 3D Registration Transformer and SE(3)$^N$ Diffusion Refinement
Comments: 13 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2512.09364 [pdf, ps, other]
Title: ASSIST-3D: Adapted Scene Synthesis for Class-Agnostic 3D Instance Segmentation
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2512.09363 [pdf, ps, other]
Title: StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2512.09354 [pdf, ps, other]
Title: Video-QTR: Query-Driven Temporal Reasoning Framework for Lightweight Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179]  arXiv:2512.09350 [pdf, ps, other]
Title: TextGuider: Training-Free Guidance for Text Rendering via Attention Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180]  arXiv:2512.09335 [pdf, ps, other]
Title: Relightable and Dynamic Gaussian Avatar Reconstruction from Monocular Video
Comments: 8 pages, 9 figures, published in ACM MM 2025
Journal-ref: In Proceedings of the 33rd ACM International Conference on Multimedia. 2025. p. 7405-7414
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[181]  arXiv:2512.09327 [pdf, ps, other]
Title: UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[182]  arXiv:2512.09315 [pdf, ps, other]
Title: Benchmarking Real-World Medical Image Classification with Noisy Labels: Challenges, Practice, and Outlook
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2512.09311 [pdf, ps, other]
Title: Transformer-Driven Multimodal Fusion for Explainable Suspiciousness Estimation in Visual Surveillance
Comments: 12 pages, 10 figures, IEEE Transaction on Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[184]  arXiv:2512.09307 [pdf, ps, other]
Title: From SAM to DINOv2: Towards Distilling Foundation Models to Lightweight Baselines for Generalized Polyp Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2512.09299 [pdf, ps, other]
Title: VABench: A Comprehensive Benchmark for Audio-Video Generation
Comments: 24 pages, 25 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[186]  arXiv:2512.09296 [pdf, ps, other]
Title: Traffic Scene Small Target Detection Method Based on YOLOv8n-SPTS Model for Autonomous Driving
Authors: Songhan Wu
Comments: 6 pages, 7 figures, 1 table. Accepted to The 2025 IEEE 3rd International Conference on Electrical, Automation and Computer Engineering (ICEACE), 2025. Code available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2512.09289 [pdf, ps, other]
Title: MelanomaNet: Explainable Deep Learning for Skin Lesion Classification
Comments: 7 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188]  arXiv:2512.09282 [pdf, ps, other]
Title: FoundIR-v2: Optimizing Pre-Training Data Mixtures for Image Restoration Foundation Model
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[189]  arXiv:2512.09278 [pdf, ps, other]
Title: LoGoColor: Local-Global 3D Colorization for 360° Scenes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190]  arXiv:2512.09276 [pdf, ps, other]
Title: Dynamic Facial Expressions Analysis Based Parkinson's Disease Auxiliary Diagnosis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2512.09271 [pdf, ps, other]
Title: LongT2IBench: A Benchmark for Evaluating Long Text-to-Image Generation with Graph-structured Annotations
Comments: The paper has been accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2512.09270 [pdf, ps, other]
Title: MoRel: Long-Range Flicker-Free 4D Motion Modeling via Anchor Relay-based Bidirectional Blending with Hierarchical Densification
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2512.09258 [pdf, ps, other]
Title: ROI-Packing: Efficient Region-Based Compression for Machine Vision
Journal-ref: International Conference on Multimedia Information Processing and Retrieval (MIPR), San Jose, CA, USA, 2025, pp. 233-238
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194]  arXiv:2512.09251 [pdf, ps, other]
Title: GLACIA: Instance-Aware Positional Reasoning for Glacial Lake Segmentation via Multimodal Large Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[195]  arXiv:2512.09247 [pdf, ps, other]
Title: OmniPSD: Layered PSD Generation with Diffusion Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[196]  arXiv:2512.09244 [pdf, ps, other]
Title: A Clinically Interpretable Deep CNN Framework for Early Chronic Kidney Disease Prediction Using Grad-CAM-Based Explainable AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197]  arXiv:2512.09235 [pdf, ps, other]
Title: Efficient Feature Compression for Machines with Global Statistics Preservation
Journal-ref: 2025 IEEE International Symposium on Circuits and Systems (ISCAS), London, United Kingdom, 2025, pp. 1-5
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198]  arXiv:2512.09232 [pdf, ps, other]
Title: Enabling Next-Generation Consumer Experience with Feature Coding for Machines
Journal-ref: 2025 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 2025, pp. 1-4
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199]  arXiv:2512.09215 [pdf, ps, other]
Title: View-on-Graph: Zero-shot 3D Visual Grounding via Vision-Language Reasoning on Scene Graphs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[200]  arXiv:2512.09185 [pdf, ps, other]
Title: Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201]  arXiv:2512.09172 [pdf, ps, other]
Title: Prompt-Based Continual Compositional Zero-Shot Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[202]  arXiv:2512.09164 [pdf, ps, other]
Title: WonderZoom: Multi-Scale 3D World Generation
Comments: Project website: this https URL The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[203]  arXiv:2512.09162 [pdf, ps, other]
Title: GTAvatar: Bridging Gaussian Splatting and Texture Mapping for Relightable and Editable Gaussian Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[204]  arXiv:2512.09134 [pdf, ps, other]
Title: Integrated Pipeline for Coronary Angiography With Automated Lesion Profiling, Virtual Stenting, and 100-Vessel FFR Validation
Comments: 22 pages, 10 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[205]  arXiv:2512.09115 [pdf, ps, other]
Title: SuperF: Neural Implicit Fields for Multi-Image Super-Resolution
Comments: 23 pages, 13 figures, 8 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2512.09112 [pdf, ps, other]
Title: GimbalDiffusion: Gravity-Aware Camera Control for Video Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[207]  arXiv:2512.09095 [pdf, ps, other]
Title: Food Image Generation on Multi-Noun Categories
Comments: Accepted by WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2512.09092 [pdf, ps, other]
Title: Explaining the Unseen: Multimodal Vision-Language Reasoning for Situational Awareness in Underground Mining Disasters
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209]  arXiv:2512.09081 [pdf, ps, other]
Title: AgentComp: From Agentic Reasoning to Compositional Mastery in Text-to-Image Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[210]  arXiv:2512.09071 [pdf, ps, other]
Title: Adaptive Thresholding for Visual Place Recognition using Negative Gaussian Mixture Statistics
Comments: Accepted and presented at IEEE RoboticCC 2025. 4 pages short paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[211]  arXiv:2512.09069 [pdf, ps, other]
Title: KD-OCT: Efficient Knowledge Distillation for Clinical-Grade Retinal OCT Classification
Comments: 7 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[212]  arXiv:2512.09062 [pdf, ps, other]
Title: SIP: Site in Pieces- A Dataset of Disaggregated Construction-Phase 3D Scans for Semantic Segmentation and Scene Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213]  arXiv:2512.09056 [pdf, ps, other]
Title: ConceptPose: Training-Free Zero-Shot Object Pose Estimation using Concept Vectors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214]  arXiv:2512.09016 [pdf, ps, other]
Title: Learning to Remove Lens Flare in Event Camera
Comments: Preprint; 29 pages, 14 figures, 4 tables; Project Page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2512.09011 [pdf, ps, other]
Title: An Approach for Detection of Entities in Dynamic Media Contents
Comments: 12 pages, 8 figures
Journal-ref: Journal of Computer Science and Technology Studies, Vol. 5, No. 3, pp. 13-24, 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216]  arXiv:2512.09010 [pdf, ps, other]
Title: Towards Lossless Ultimate Vision Token Compression for VLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[217]  arXiv:2512.09005 [pdf, ps, other]
Title: A Survey of Body and Face Motion: Datasets, Performance Evaluation Metrics and Generative Techniques
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[218]  arXiv:2512.09001 [pdf, ps, other]
Title: A Physics-Constrained, Design-Driven Methodology for Defect Dataset Generation in Optical Lithography
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[219]  arXiv:2512.08999 [pdf, ps, other]
Title: Diffusion Model Regularized Implicit Neural Representation for CT Metal Artifact Reduction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2512.08996 [pdf, ps, other]
Title: Demo: Generative AI helps Radiotherapy Planning with User Preference
Comments: Best paper in GenAI4Health at NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[221]  arXiv:2512.08991 [pdf, ps, other]
Title: Deterministic World Models for Verification of Closed-loop Vision-based Systems
Comments: 22 pages, 10 figures. Submitted to FM 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[222]  arXiv:2512.08989 [pdf, ps, other]
Title: Enhancing Knowledge Transfer in Hyperspectral Image Classification via Cross-scene Knowledge Integration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223]  arXiv:2512.08987 [pdf, ps, other]
Title: 3DID: Direct 3D Inverse Design for Aerodynamics with Physics-Aware Optimization
Comments: Accepted at NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[224]  arXiv:2512.08986 [pdf, ps, other]
Title: Explainable Fundus Image Curation and Lesion Detection in Diabetic Retinopathy
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[225]  arXiv:2512.08985 [pdf, ps, other]
Title: An Efficient Test-Time Scaling Approach for Image Generation
Comments: 11 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226]  arXiv:2512.08984 [pdf, ps, other]
Title: RAG-HAR: Retrieval Augmented Generation-based Human Activity Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[227]  arXiv:2512.08983 [pdf, ps, other]
Title: HSCP: A Two-Stage Spectral Clustering Framework for Resource-Constrained UAV Identification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228]  arXiv:2512.08982 [pdf, ps, other]
Title: Consist-Retinex: One-Step Noise-Emphasized Consistency Training Accelerates High-Quality Retinex Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229]  arXiv:2512.08981 [pdf, ps, other]
Title: Mitigating Bias with Words: Inducing Demographic Ambiguity in Face Recognition Templates by Text Encoding
Comments: Accepted at BMVC workshop (SRBS) 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[230]  arXiv:2512.08980 [pdf, ps, other]
Title: Training Multi-Image Vision Agents via End2End Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[231]  arXiv:2512.08979 [pdf, ps, other]
Title: What Happens When: Learning Temporal Orders of Events in Videos
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[232]  arXiv:2512.09920 (cross-list from cs.RO) [pdf, ps, other]
Title: LISN: Language-Instructed Social Navigation with VLM-based Controller Modulating
Comments: 8 pages
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[233]  arXiv:2512.09903 (cross-list from cs.RO) [pdf, ps, other]
Title: YOPO-Nav: Visual Navigation using 3DGS Graphs from One-Pass Videos
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[234]  arXiv:2512.09898 (cross-list from cs.RO) [pdf, ps, other]
Title: Visual Heading Prediction for Autonomous Aerial Vehicles
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA); Systems and Control (eess.SY)
[235]  arXiv:2512.09851 (cross-list from cs.RO) [pdf, ps, other]
Title: Simultaneous Tactile-Visual Perception for Learning Multimodal Robot Manipulation
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2512.09841 (cross-list from cs.CL) [pdf, ps, other]
Title: ChronusOmni: Improving Time Awareness of Omni Large Language Models
Comments: Code available at this https URL
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[237]  arXiv:2512.09779 (cross-list from eess.IV) [pdf, ps, other]
Title: PathCo-LatticE: Pathology-Constrained Lattice-Of Experts Framework for Fully-supervised Few-Shot Cardiac MRI Segmentation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[238]  arXiv:2512.09664 (cross-list from cs.DC) [pdf, ps, other]
Title: SynthPix: A lightspeed PIV images generator
Comments: Code: this https URL
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[239]  arXiv:2512.09610 (cross-list from cs.HC) [pdf, ps, other]
Title: ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation
Comments: 24 pages, 10 figures
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[240]  arXiv:2512.09607 (cross-list from cs.RO) [pdf, ps, other]
Title: UrbanNav: Learning Language-Guided Urban Navigation from Web-Scale Human Trajectories
Comments: 9 pages, 5 figures, accepted to AAAI 2026
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[241]  arXiv:2512.09510 (cross-list from cs.RO) [pdf, ps, other]
Title: ViTA-Seg: Vision Transformer for Amodal Segmentation in Robotics
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[242]  arXiv:2512.09469 (cross-list from quant-ph) [pdf, ps, other]
Title: LiePrune: Lie Group and Quantum Geometric Dual Representation for One-Shot Structured Pruning of Quantum Neural Networks
Comments: 7 pages, 2 figures
Subjects: Quantum Physics (quant-ph); Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2512.09447 (cross-list from cs.RO) [pdf, ps, other]
Title: Sequential Testing for Descriptor-Agnostic LiDAR Loop Closure in Repetitive Environments
Comments: 8 pages, 4 figures
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[244]  arXiv:2512.09406 (cross-list from cs.RO) [pdf, ps, other]
Title: H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos
Comments: 13 pages, 6 figures
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2512.09376 (cross-list from cs.LG) [pdf, ps, other]
Title: Rates and architectures for learning geometrically non-trivial operators
Comments: 26 pages, 5 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Differential Geometry (math.DG)
[246]  arXiv:2512.09343 (cross-list from cs.RO) [pdf, ps, other]
Title: Development and Testing for Perception Based Autonomous Landing of a Long-Range QuadPlane
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[247]  arXiv:2512.09340 (cross-list from cs.AI) [pdf, ps, other]
Title: Visual Categorization Across Minds and Models: Cognitive Analysis of Human Labeling and Neuro-Symbolic Integration
Comments: 12 pages, 3 figures. Research manuscript based on the final project for CS6795 (Introduction to Cognitive Science), Georgia Tech
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[248]  arXiv:2512.09309 (cross-list from cs.DC) [pdf, ps, other]
Title: A Distributed Framework for Privacy-Enhanced Vision Transformers on the Edge
Comments: 16 pages, 7 figures. Published in the Proceedings of the Tenth ACM/IEEE Symposium on Edge Computing (SEC '25), Dec 3-6, 2025, Washington, D.C., USA
Journal-ref: Proceedings of the Tenth ACM/IEEE Symposium on Edge Computing (SEC '25), 2025, Article 8, pp. 1-16
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2512.09201 (cross-list from cs.GR) [pdf, ps, other]
Title: Residual Primitive Fitting of 3D Shapes with SuperFrusta
Comments: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[250]  arXiv:2512.09094 (cross-list from eess.IV) [pdf, ps, other]
Title: Causal Attribution of Model Performance Gaps in Medical Imaging Under Distribution Shifts
Comments: Medical Imaging meets EurIPS Workshop: MedEurIPS 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Methodology (stat.ME)
[251]  arXiv:2512.08998 (cross-list from eess.IV) [pdf, ps, other]
Title: DermETAS-SNA LLM: A Dermatology Focused Evolutionary Transformer Architecture Search with StackNet Augmented LLM Assistant
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[252]  arXiv:2512.08992 (cross-list from eess.IV) [pdf, ps, other]
Title: Enhanced Chest Disease Classification Using an Improved CheXNet Framework with EfficientNetV2-M and Optimization-Driven Learning
Comments: 23 pages, 6 figures, 7 tables
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[253]  arXiv:2512.08990 (cross-list from eess.IV) [pdf, ps, other]
Title: Agreement Disagreement Guided Knowledge Transfer for Cross-Scene Hyperspectral Imaging
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Wed, 10 Dec 2025 (showing first 115 of 131 entries)

[254]  arXiv:2512.08931 [pdf, ps, other]
Title: Astra: General Interactive World Model with Autoregressive Denoising
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[255]  arXiv:2512.08930 [pdf, ps, other]
Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[256]  arXiv:2512.08924 [pdf, ps, other]
Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2512.08922 [pdf, ps, other]
Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2512.08912 [pdf, ps, other]
Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception
Comments: Preprint. 12 pages, 9 figures. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[259]  arXiv:2512.08905 [pdf, ps, other]
Title: Self-Evolving 3D Scene Generation from a Single Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2512.08897 [pdf, ps, other]
Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261]  arXiv:2512.08889 [pdf, ps, other]
Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[262]  arXiv:2512.08888 [pdf, ps, other]
Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[263]  arXiv:2512.08881 [pdf, ps, other]
Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2512.08873 [pdf, ps, other]
Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning
Comments: 6 pages
Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[265]  arXiv:2512.08860 [pdf, ps, other]
Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference
Authors: Amit Bendkhale
Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[266]  arXiv:2512.08854 [pdf, ps, other]
Title: Generation is Required for Data-Efficient Perception
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[267]  arXiv:2512.08829 [pdf, ps, other]
Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[268]  arXiv:2512.08820 [pdf, ps, other]
Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
Comments: Accepted in IEEE Transactions on Multimedia (TMM)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269]  arXiv:2512.08789 [pdf, ps, other]
Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
Comments: 10 pages, 7 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[270]  arXiv:2512.08785 [pdf, ps, other]
Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2512.08774 [pdf, ps, other]
Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps
Comments: 10 pages, 9 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[272]  arXiv:2512.08765 [pdf, ps, other]
Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Comments: NeurIPS 2025. Code and data available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2512.08751 [pdf, ps, other]
Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[274]  arXiv:2512.08747 [pdf, ps, other]
Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation
Comments: 20 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[275]  arXiv:2512.08738 [pdf, ps, other]
Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture
Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[276]  arXiv:2512.08733 [pdf, ps, other]
Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[277]  arXiv:2512.08730 [pdf, ps, other]
Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2512.08700 [pdf, ps, other]
Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth
Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2512.08697 [pdf, ps, other]
Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2512.08673 [pdf, ps, other]
Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
Comments: 16 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[281]  arXiv:2512.08648 [pdf, ps, other]
Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
Comments: 19 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2512.08647 [pdf, ps, other]
Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition
Authors: Keito Inoshita
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2512.08645 [pdf, ps, other]
Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation
Comments: 19 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.08639 [pdf, ps, other]
Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
Comments: Under Review, 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[285]  arXiv:2512.08627 [pdf, ps, other]
Title: Trajectory Densification and Depth from Perspective-based Blur
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.08625 [pdf, ps, other]
Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics
Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[287]  arXiv:2512.08606 [pdf, ps, other]
Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI2026, poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[288]  arXiv:2512.08589 [pdf, ps, other]
Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images
Comments: 8 pages, 10 figures, 4 tables, 20 references. Date of Conference: 13-14 June 2025. Date Added to IEEE Xplore: 10 July 2025. Electronic ISBN: 979-8-3315-0969-9. Print on Demand (PoD) ISBN: 979-8-3315-0970-5. DOI: 10.1109/AICCONF64766.2025.11064260. Conference Location: Prague, Czech Republic. Online Access: this https URL
Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2512.08577 [pdf, ps, other]
Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[290]  arXiv:2512.08572 [pdf, ps, other]
Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2512.08569 [pdf, ps, other]
Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2512.08564 [pdf, ps, other]
Title: Modular Neural Image Signal Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[293]  arXiv:2512.08560 [pdf, ps, other]
Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2512.08557 [pdf, ps, other]
Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds
Comments: 22 Pages, 26 Figures, This work has been submitted to the IEEE Sensors Journal for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.08547 [pdf, ps, other]
Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.08542 [pdf, ps, other]
Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
[297]  arXiv:2512.08537 [pdf, ps, other]
Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[298]  arXiv:2512.08535 [pdf, ps, other]
Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299]  arXiv:2512.08534 [pdf, ps, other]
Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2512.08529 [pdf, ps, other]
Title: MVP: Multiple View Prediction Improves GUI Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2512.08524 [pdf, ps, other]
Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization
Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[302]  arXiv:2512.08511 [pdf, ps, other]
Title: Thinking with Images via Self-Calling Agent
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2512.08506 [pdf, ps, other]
Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.08505 [pdf, ps, other]
Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2512.08503 [pdf, ps, other]
Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[306]  arXiv:2512.08498 [pdf, ps, other]
Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.08486 [pdf, ps, other]
Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.08478 [pdf, ps, other]
Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[309]  arXiv:2512.08477 [pdf, ps, other]
Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[310]  arXiv:2512.08467 [pdf, ps, other]
Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2512.08445 [pdf, ps, other]
Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[312]  arXiv:2512.08441 [pdf, ps, other]
Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2512.08439 [pdf, ps, other]
Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2512.08430 [pdf, ps, other]
Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking
Comments: Accepted to WACV 2026. Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[315]  arXiv:2512.08410 [pdf, ps, other]
Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2512.08406 [pdf, ps, other]
Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[317]  arXiv:2512.08400 [pdf, ps, other]
Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2512.08397 [pdf, ps, other]
Title: Detection of Digital Facial Retouching utilizing Face Beauty Information
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[319]  arXiv:2512.08378 [pdf, ps, other]
Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions
Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[320]  arXiv:2512.08374 [pdf, ps, other]
Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321]  arXiv:2512.08362 [pdf, ps, other]
Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation
Comments: Accepted for main track at MobiSec 2024 (not published in the proceedings)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2512.08358 [pdf, ps, other]
Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Comments: Accepted by NeurIPS 2025. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2512.08337 [pdf, ps, other]
Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2512.08334 [pdf, ps, other]
Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2512.08331 [pdf, ps, other]
Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2512.08330 [pdf, ps, other]
Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models
Comments: Accepted by IJCNN 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[327]  arXiv:2512.08329 [pdf, ps, other]
Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[328]  arXiv:2512.08327 [pdf, ps, other]
Title: Low Rank Support Quaternion Matrix Machine
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[329]  arXiv:2512.08325 [pdf, ps, other]
Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[330]  arXiv:2512.08323 [pdf, ps, other]
Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[331]  arXiv:2512.08317 [pdf, ps, other]
Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[332]  arXiv:2512.08309 [pdf, ps, other]
Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation
Authors: Alexander Goslin
Comments: Project website: this https URL. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
[333]  arXiv:2512.08294 [pdf, ps, other]
Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2512.08282 [pdf, ps, other]
Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[335]  arXiv:2512.08269 [pdf, ps, other]
Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Comments: 21 pages, project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336]  arXiv:2512.08262 [pdf, ps, other]
Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[337]  arXiv:2512.08254 [pdf, ps, other]
Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors
Comments: 10 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338]  arXiv:2512.08253 [pdf, ps, other]
Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2512.08247 [pdf, ps, other]
Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
Comments: AAAI-26
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[340]  arXiv:2512.08243 [pdf, ps, other]
Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
Comments: 26 Pages, 10 Figures, 4 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[341]  arXiv:2512.08240 [pdf, ps, other]
Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[342]  arXiv:2512.08237 [pdf, ps, other]
Title: FastBEV++: Fast by Algorithm, Deployable by Design
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[343]  arXiv:2512.08229 [pdf, ps, other]
Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[344]  arXiv:2512.08228 [pdf, ps, other]
Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[345]  arXiv:2512.08227 [pdf, ps, other]
Title: New VVC profiles targeting Feature Coding for Machines
Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2512.08223 [pdf, ps, other]
Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[347]  arXiv:2512.08221 [pdf, ps, other]
Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
Comments: 16 pages, 12 figures, 7 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[348]  arXiv:2512.08215 [pdf, ps, other]
Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349]  arXiv:2512.08198 [pdf, ps, other]
Title: Animal Re-Identification on Microcontrollers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[350]  arXiv:2512.08180 [pdf, ps, other]
Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351]  arXiv:2512.08163 [pdf, ps, other]
Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Comments: 22 pages, 12 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[352]  arXiv:2512.08161 [pdf, ps, other]
Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[353]  arXiv:2512.08135 [pdf, ps, other]
Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[354]  arXiv:2512.08075 [pdf, ps, other]
Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[355]  arXiv:2512.08048 [pdf, ps, other]
Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
Comments: ongoing work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[356]  arXiv:2512.08042 [pdf, ps, other]
Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[357]  arXiv:2512.08040 [pdf, ps, other]
Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[358]  arXiv:2512.08038 [pdf, ps, other]
Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
Comments: 20 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[359]  arXiv:2512.08016 [pdf, ps, other]
Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[360]  arXiv:2512.07984 [pdf, ps, other]
Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
Comments: 13 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[361]  arXiv:2512.07951 [pdf, ps, other]
Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[362]  arXiv:2512.07925 [pdf, ps, other]
Title: Near-real time fires detection using satellite imagery in Sudan conflict
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[363]  arXiv:2512.07838 [pdf, ps, other]
Title: Detection of Cyberbullying in GIF using AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[364]  arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]
Title: Multi-domain performance analysis with scores tailored to user preferences
Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[365]  arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]
Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[366]  arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]
Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
Comments: 22 pages, 2 tables, 9 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[367]  arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]
Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[368]  arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]
Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
Authors: Ali Sakour
Comments: 13 pages, 5 figures. Code available at: this https URL
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)