Computer Vision and Pattern Recognition

Authors and titles for recent submissions

[ total of 749 entries: 1-749 ]

Wed, 10 Dec 2025

[1]  arXiv:2512.08931 [pdf, ps, other]
Title: Astra: General Interactive World Model with Autoregressive Denoising
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[2]  arXiv:2512.08930 [pdf, ps, other]
Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[3]  arXiv:2512.08924 [pdf, ps, other]
Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4]  arXiv:2512.08922 [pdf, ps, other]
Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5]  arXiv:2512.08912 [pdf, ps, other]
Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception
Comments: Preprint. 12 pages, 9 figures. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[6]  arXiv:2512.08905 [pdf, ps, other]
Title: Self-Evolving 3D Scene Generation from a Single Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7]  arXiv:2512.08897 [pdf, ps, other]
Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2512.08889 [pdf, ps, other]
Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[9]  arXiv:2512.08888 [pdf, ps, other]
Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[10]  arXiv:2512.08881 [pdf, ps, other]
Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11]  arXiv:2512.08873 [pdf, ps, other]
Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning
Comments: 6 pages
Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[12]  arXiv:2512.08860 [pdf, ps, other]
Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference
Authors: Amit Bendkhale
Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13]  arXiv:2512.08854 [pdf, ps, other]
Title: Generation is Required for Data-Efficient Perception
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[14]  arXiv:2512.08829 [pdf, ps, other]
Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15]  arXiv:2512.08820 [pdf, ps, other]
Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
Comments: Accepted in IEEE Transactions on Multimedia (TMM)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16]  arXiv:2512.08789 [pdf, ps, other]
Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
Comments: 10 pages, 7 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[17]  arXiv:2512.08785 [pdf, ps, other]
Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18]  arXiv:2512.08774 [pdf, ps, other]
Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps
Comments: 10 pages, 9 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[19]  arXiv:2512.08765 [pdf, ps, other]
Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Comments: NeurIPS 2025. Code and data available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20]  arXiv:2512.08751 [pdf, ps, other]
Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[21]  arXiv:2512.08747 [pdf, ps, other]
Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation
Comments: 20 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22]  arXiv:2512.08738 [pdf, ps, other]
Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture
Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[23]  arXiv:2512.08733 [pdf, ps, other]
Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[24]  arXiv:2512.08730 [pdf, ps, other]
Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25]  arXiv:2512.08700 [pdf, ps, other]
Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth
Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26]  arXiv:2512.08697 [pdf, ps, other]
Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27]  arXiv:2512.08673 [pdf, ps, other]
Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
Comments: 16 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28]  arXiv:2512.08648 [pdf, ps, other]
Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
Comments: 19 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29]  arXiv:2512.08647 [pdf, ps, other]
Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition
Authors: Keito Inoshita
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30]  arXiv:2512.08645 [pdf, ps, other]
Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation
Comments: 19 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31]  arXiv:2512.08639 [pdf, ps, other]
Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
Comments: Under Review, 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[32]  arXiv:2512.08627 [pdf, ps, other]
Title: Trajectory Densification and Depth from Perspective-based Blur
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33]  arXiv:2512.08625 [pdf, ps, other]
Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics
Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34]  arXiv:2512.08606 [pdf, ps, other]
Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI2026, poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[35]  arXiv:2512.08589 [pdf, ps, other]
Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images
Comments: 8 pages, 10 figures, 4 tables, 20 references. Date of conference: 13-14 June 2025. Date added to IEEE Xplore: 10 July 2025. Electronic ISBN: 979-8-3315-0969-9. Print on Demand (PoD) ISBN: 979-8-3315-0970-5. DOI: 10.1109/AICCONF64766.2025.11064260. Conference location: Prague, Czech Republic. Online access: this https URL
Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36]  arXiv:2512.08577 [pdf, ps, other]
Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[37]  arXiv:2512.08572 [pdf, ps, other]
Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38]  arXiv:2512.08569 [pdf, ps, other]
Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39]  arXiv:2512.08564 [pdf, ps, other]
Title: Modular Neural Image Signal Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40]  arXiv:2512.08560 [pdf, ps, other]
Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41]  arXiv:2512.08557 [pdf, ps, other]
Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds
Comments: 22 pages, 26 figures. This work has been submitted to the IEEE Sensors Journal for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42]  arXiv:2512.08547 [pdf, ps, other]
Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43]  arXiv:2512.08542 [pdf, ps, other]
Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
[44]  arXiv:2512.08537 [pdf, ps, other]
Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45]  arXiv:2512.08535 [pdf, ps, other]
Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46]  arXiv:2512.08534 [pdf, ps, other]
Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47]  arXiv:2512.08529 [pdf, ps, other]
Title: MVP: Multiple View Prediction Improves GUI Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48]  arXiv:2512.08524 [pdf, ps, other]
Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization
Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[49]  arXiv:2512.08511 [pdf, ps, other]
Title: Thinking with Images via Self-Calling Agent
Comments: Code will be released at this https URL soon
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50]  arXiv:2512.08506 [pdf, ps, other]
Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51]  arXiv:2512.08505 [pdf, ps, other]
Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52]  arXiv:2512.08503 [pdf, ps, other]
Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[53]  arXiv:2512.08498 [pdf, ps, other]
Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54]  arXiv:2512.08486 [pdf, ps, other]
Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55]  arXiv:2512.08478 [pdf, ps, other]
Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[56]  arXiv:2512.08477 [pdf, ps, other]
Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[57]  arXiv:2512.08467 [pdf, ps, other]
Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58]  arXiv:2512.08445 [pdf, ps, other]
Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[59]  arXiv:2512.08441 [pdf, ps, other]
Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60]  arXiv:2512.08439 [pdf, ps, other]
Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61]  arXiv:2512.08430 [pdf, ps, other]
Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking
Comments: Accepted to WACV 2026. Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[62]  arXiv:2512.08410 [pdf, ps, other]
Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63]  arXiv:2512.08406 [pdf, ps, other]
Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64]  arXiv:2512.08400 [pdf, ps, other]
Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65]  arXiv:2512.08397 [pdf, ps, other]
Title: Detection of Digital Facial Retouching utilizing Face Beauty Information
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66]  arXiv:2512.08378 [pdf, ps, other]
Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions
Comments: Accepted and published in IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67]  arXiv:2512.08374 [pdf, ps, other]
Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68]  arXiv:2512.08362 [pdf, ps, other]
Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation
Comments: Accepted for main track at MobiSec 2024 (not published in the proceedings)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69]  arXiv:2512.08358 [pdf, ps, other]
Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Comments: Accepted by NeurIPS 2025. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70]  arXiv:2512.08337 [pdf, ps, other]
Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71]  arXiv:2512.08334 [pdf, ps, other]
Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72]  arXiv:2512.08331 [pdf, ps, other]
Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73]  arXiv:2512.08330 [pdf, ps, other]
Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models
Comments: Accepted by IJCNN 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74]  arXiv:2512.08329 [pdf, ps, other]
Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[75]  arXiv:2512.08327 [pdf, ps, other]
Title: Low Rank Support Quaternion Matrix Machine
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[76]  arXiv:2512.08325 [pdf, ps, other]
Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77]  arXiv:2512.08323 [pdf, ps, other]
Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78]  arXiv:2512.08317 [pdf, ps, other]
Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[79]  arXiv:2512.08309 [pdf, ps, other]
Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation
Authors: Alexander Goslin
Comments: Project website: this https URL. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
[80]  arXiv:2512.08294 [pdf, ps, other]
Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81]  arXiv:2512.08282 [pdf, ps, other]
Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[82]  arXiv:2512.08269 [pdf, ps, other]
Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Comments: 21 pages. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83]  arXiv:2512.08262 [pdf, ps, other]
Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[84]  arXiv:2512.08254 [pdf, ps, other]
Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors
Comments: 10 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85]  arXiv:2512.08253 [pdf, ps, other]
Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86]  arXiv:2512.08247 [pdf, ps, other]
Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
Comments: AAAI-26
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[87]  arXiv:2512.08243 [pdf, ps, other]
Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
Comments: 26 Pages, 10 Figures, 4 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[88]  arXiv:2512.08240 [pdf, ps, other]
Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[89]  arXiv:2512.08237 [pdf, ps, other]
Title: FastBEV++: Fast by Algorithm, Deployable by Design
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90]  arXiv:2512.08229 [pdf, ps, other]
Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[91]  arXiv:2512.08228 [pdf, ps, other]
Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[92]  arXiv:2512.08227 [pdf, ps, other]
Title: New VVC profiles targeting Feature Coding for Machines
Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93]  arXiv:2512.08223 [pdf, ps, other]
Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2512.08221 [pdf, ps, other]
Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
Comments: 16 pages, 12 figures, 7 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95]  arXiv:2512.08215 [pdf, ps, other]
Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96]  arXiv:2512.08198 [pdf, ps, other]
Title: Animal Re-Identification on Microcontrollers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97]  arXiv:2512.08180 [pdf, ps, other]
Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98]  arXiv:2512.08163 [pdf, ps, other]
Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Comments: 22 pages, 12 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99]  arXiv:2512.08161 [pdf, ps, other]
Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100]  arXiv:2512.08135 [pdf, ps, other]
Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2512.08075 [pdf, ps, other]
Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102]  arXiv:2512.08048 [pdf, ps, other]
Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
Comments: ongoing work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103]  arXiv:2512.08042 [pdf, ps, other]
Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104]  arXiv:2512.08040 [pdf, ps, other]
Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105]  arXiv:2512.08038 [pdf, ps, other]
Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
Comments: 20 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106]  arXiv:2512.08016 [pdf, ps, other]
Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107]  arXiv:2512.07984 [pdf, ps, other]
Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
Comments: 13 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108]  arXiv:2512.07951 [pdf, ps, other]
Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109]  arXiv:2512.07925 [pdf, ps, other]
Title: Near-real time fires detection using satellite imagery in Sudan conflict
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110]  arXiv:2512.07838 [pdf, ps, other]
Title: Detection of Cyberbullying in GIF using AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[111]  arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]
Title: Multi-domain performance analysis with scores tailored to user preferences
Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[112]  arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]
Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[113]  arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]
Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
Comments: 22 pages, 2 tables, 9 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[114]  arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]
Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[115]  arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]
Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
Authors: Ali Sakour
Comments: 13 pages, 5 figures. Code available at: this https URL
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[116]  arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]
Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion
Comments: Submitted to GEOPHYSICS
Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
[117]  arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]
Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation
Comments: Published and Presented at 3rd Workshop on Human-Centric Multilateral Teleoperation in ICRA 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[118]  arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]
Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[119]  arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]
Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model
Comments: Website at this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[120]  arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]
Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[121]  arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]
Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
Authors: Zheng Ding, Weirui Ye
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[122]  arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]
Title: FlowSteer: Conditioning Flow Field for Consistent Image Restoration
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[123]  arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]
Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[124]  arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]
Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125]  arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]
Title: DIJIT: A Robotic Head for an Active Observer
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[126]  arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]
Title: CIP-Net: Continual Interpretable Prototype-based Network
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[127]  arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]
Title: VLD: Visual Language Goal Distance for Reinforcement Learning Navigation
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]
Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization
Comments: 8 pages, submitted for review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]
Title: GSPN-2: Efficient Parallel Sequence Modeling
Comments: NeurIPS 2025
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[130]  arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]
Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[131]  arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]
Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin Algorithm
Comments: Submitted to Magnetic Resonance in Medicine
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)

Tue, 9 Dec 2025

[132]  arXiv:2512.07834 [pdf, ps, other]
Title: Voxify3D: Pixel Art Meets Volumetric Rendering
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2512.07833 [pdf, ps, other]
Title: Relational Visual Similarity
Comments: Project page, data, and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[134]  arXiv:2512.07831 [pdf, ps, other]
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Comments: Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2512.07829 [pdf, ps, other]
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136]  arXiv:2512.07826 [pdf, ps, other]
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
Comments: 38 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2512.07821 [pdf, ps, other]
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138]  arXiv:2512.07807 [pdf, ps, other]
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes
Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[139]  arXiv:2512.07806 [pdf, ps, other]
Title: Multi-view Pyramid Transformer: Look Coarser to See Broader
Comments: Project page: see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2512.07802 [pdf, ps, other]
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2512.07778 [pdf, ps, other]
Title: Distribution Matching Variational AutoEncoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2512.07776 [pdf, ps, other]
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143]  arXiv:2512.07760 [pdf, ps, other]
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144]  arXiv:2512.07756 [pdf, ps, other]
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145]  arXiv:2512.07747 [pdf, ps, other]
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146]  arXiv:2512.07745 [pdf, ps, other]
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2512.07738 [pdf, ps, other]
Title: HLTCOE Evaluation Team at TREC 2025: VQA Track
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2512.07733 [pdf, ps, other]
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2512.07730 [pdf, ps, other]
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150]  arXiv:2512.07729 [pdf, ps, other]
Title: Improving action classification with brain-inspired deep networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151]  arXiv:2512.07720 [pdf, ps, other]
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:2512.07712 [pdf, ps, other]
Title: UnCageNet: Tracking and Pose Estimation of Caged Animal
Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153]  arXiv:2512.07703 [pdf, ps, other]
Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[154]  arXiv:2512.07702 [pdf, ps, other]
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155]  arXiv:2512.07698 [pdf, ps, other]
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156]  arXiv:2512.07674 [pdf, ps, other]
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157]  arXiv:2512.07668 [pdf, ps, other]
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2512.07661 [pdf, ps, other]
Title: Optimization-Guided Diffusion for Interactive Scene Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2512.07652 [pdf, ps, other]
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160]  arXiv:2512.07651 [pdf, ps, other]
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161]  arXiv:2512.07628 [pdf, ps, other]
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162]  arXiv:2512.07606 [pdf, ps, other]
Title: Decomposition Sampling for Efficient Region Annotations in Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163]  arXiv:2512.07599 [pdf, ps, other]
Title: Online Segment Any 3D Thing as Instance Tracking
Comments: NeurIPS 2025, Code is at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2512.07596 [pdf, ps, other]
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[165]  arXiv:2512.07590 [pdf, ps, other]
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2512.07584 [pdf, ps, other]
Title: LongCat-Image Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167]  arXiv:2512.07580 [pdf, ps, other]
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168]  arXiv:2512.07568 [pdf, ps, other]
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[169]  arXiv:2512.07564 [pdf, ps, other]
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[170]  arXiv:2512.07527 [pdf, ps, other]
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[171]  arXiv:2512.07514 [pdf, ps, other]
Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2512.07504 [pdf, ps, other]
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173]  arXiv:2512.07503 [pdf, ps, other]
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2512.07500 [pdf, ps, other]
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2512.07498 [pdf, ps, other]
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior
Comments: 16 pages (including appendix)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2512.07480 [pdf, ps, other]
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2512.07469 [pdf, ps, other]
Title: Unified Video Editing with Temporal Reasoner
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2512.07426 [pdf, ps, other]
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing
Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[179]  arXiv:2512.07415 [pdf, ps, other]
Title: Data-driven Exploration of Mobility Interaction Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180]  arXiv:2512.07410 [pdf, ps, other]
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:2512.07394 [pdf, ps, other]
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video
Comments: webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182]  arXiv:2512.07391 [pdf, ps, other]
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2512.07385 [pdf, ps, other]
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2512.07383 [pdf, ps, other]
Title: LogicCBMs: Logic-Enhanced Concept-Based Learning
Comments: 18 pages, 19 figures, WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2512.07381 [pdf, ps, other]
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186]  arXiv:2512.07379 [pdf, ps, other]
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency
Comments: 22 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2512.07360 [pdf, ps, other]
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[188]  arXiv:2512.07351 [pdf, ps, other]
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[189]  arXiv:2512.07348 [pdf, ps, other]
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190]  arXiv:2512.07345 [pdf, ps, other]
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting
Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2512.07338 [pdf, ps, other]
Title: Generalized Referring Expression Segmentation on Aerial Photos
Comments: Submitted to IEEE J-STARS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2512.07331 [pdf, ps, other]
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers
Authors: Kanishk Awadhiya
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2512.07328 [pdf, ps, other]
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[194]  arXiv:2512.07305 [pdf, ps, other]
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:2512.07302 [pdf, ps, other]
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[196]  arXiv:2512.07276 [pdf, ps, other]
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2512.07275 [pdf, ps, other]
Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation
Comments: The paper has been accepted by BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198]  arXiv:2512.07273 [pdf, ps, other]
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199]  arXiv:2512.07269 [pdf, ps, other]
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[200]  arXiv:2512.07253 [pdf, ps, other]
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Comments: 18 pages, 8 figures, and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201]  arXiv:2512.07251 [pdf, ps, other]
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202]  arXiv:2512.07247 [pdf, ps, other]
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing
Comments: 40 pages, 34 figures, 18 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[203]  arXiv:2512.07245 [pdf, ps, other]
Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features
Comments: 11+6 pages, 8 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204]  arXiv:2512.07241 [pdf, ps, other]
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205]  arXiv:2512.07237 [pdf, ps, other]
Title: Unified Camera Positional Encoding for Controlled Video Generation
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2512.07234 [pdf, ps, other]
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[207]  arXiv:2512.07230 [pdf, ps, other]
Title: STRinGS: Selective Text Refinement in Gaussian Splatting
Comments: Accepted to WACV 2026. Project Page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2512.07229 [pdf, ps, other]
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery
Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209]  arXiv:2512.07228 [pdf, ps, other]
Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[210]  arXiv:2512.07215 [pdf, ps, other]
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[211]  arXiv:2512.07211 [pdf, ps, other]
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds
Comments: 8 pages, 8 figures, 5 tables, ICCR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212]  arXiv:2512.07206 [pdf, ps, other]
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213]  arXiv:2512.07203 [pdf, ps, other]
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214]  arXiv:2512.07201 [pdf, ps, other]
Title: Understanding Diffusion Models via Code Execution
Authors: Cheng Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[215]  arXiv:2512.07198 [pdf, ps, other]
Title: Generating Storytelling Images with Rich Chains-of-Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[216]  arXiv:2512.07197 [pdf, ps, other]
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2512.07192 [pdf, ps, other]
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218]  arXiv:2512.07191 [pdf, ps, other]
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219]  arXiv:2512.07190 [pdf, ps, other]
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2512.07186 [pdf, ps, other]
Title: START: Spatial and Textual Learning for Chart Understanding
Comments: WACV 2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[221]  arXiv:2512.07171 [pdf, ps, other]
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration
Comments: 21 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222]  arXiv:2512.07170 [pdf, ps, other]
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223]  arXiv:2512.07166 [pdf, ps, other]
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224]  arXiv:2512.07165 [pdf, ps, other]
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225]  arXiv:2512.07155 [pdf, ps, other]
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226]  arXiv:2512.07141 [pdf, ps, other]
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[227]  arXiv:2512.07136 [pdf, ps, other]
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228]  arXiv:2512.07135 [pdf, ps, other]
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229]  arXiv:2512.07128 [pdf, ps, other]
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230]  arXiv:2512.07126 [pdf, ps, other]
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2512.07110 [pdf, ps, other]
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232]  arXiv:2512.07107 [pdf, ps, other]
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233]  arXiv:2512.07078 [pdf, ps, other]
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234]  arXiv:2512.07076 [pdf, ps, other]
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235]  arXiv:2512.07065 [pdf, ps, other]
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2512.07062 [pdf, ps, other]
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[237]  arXiv:2512.07052 [pdf, ps, other]
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238]  arXiv:2512.07051 [pdf, ps, other]
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[239]  arXiv:2512.07037 [pdf, ps, other]
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240]  arXiv:2512.07034 [pdf, ps, other]
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241]  arXiv:2512.06981 [pdf, ps, other]
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[242]  arXiv:2512.06949 [pdf, ps, other]
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2512.06921 [pdf, ps, other]
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[244]  arXiv:2512.06905 [pdf, ps, other]
Title: Scaling Zero-Shot Reference-to-Video Generation
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2512.06888 [pdf, ps, other]
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246]  arXiv:2512.06886 [pdf, ps, other]
Title: Balanced Learning for Domain Adaptive Semantic Segmentation
Comments: Accepted by International Conference on Machine Learning (ICML 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247]  arXiv:2512.06885 [pdf, ps, other]
Title: JoPano: Unified Panorama Generation via Joint Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[248]  arXiv:2512.06882 [pdf, ps, other]
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion
Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2512.06877 [pdf, ps, other]
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification
Comments: Accepted and presented in ICSPIS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250]  arXiv:2512.06870 [pdf, ps, other]
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251]  arXiv:2512.06866 [pdf, ps, other]
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Comments: Accepted by NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[252]  arXiv:2512.06865 [pdf, ps, other]
Title: Spatial Retrieval Augmented Autonomous Driving
Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2512.06864 [pdf, ps, other]
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:2512.06862 [pdf, ps, other]
Title: Omni-Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255]  arXiv:2512.06849 [pdf, ps, other]
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Comments: In submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256]  arXiv:2512.06845 [pdf, ps, other]
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2512.06840 [pdf, ps, other]
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2512.06838 [pdf, ps, other]
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2512.06818 [pdf, ps, other]
Title: MeshSplatting: Differentiable Rendering with Opaque Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2512.06811 [pdf, ps, other]
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models
Comments: Accepted by AAAI 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[261]  arXiv:2512.06810 [pdf, ps, other]
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[262]  arXiv:2512.06802 [pdf, ps, other]
Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263]  arXiv:2512.06793 [pdf, ps, other]
Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2512.06783 [pdf, ps, other]
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos
Comments: 16 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265]  arXiv:2512.06774 [pdf, ps, other]
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[266]  arXiv:2512.06769 [pdf, ps, other]
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[267]  arXiv:2512.06763 [pdf, ps, other]
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2512.06759 [pdf, ps, other]
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors
Comments: 12 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269]  arXiv:2512.06750 [pdf, ps, other]
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:2512.06746 [pdf, ps, other]
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[271]  arXiv:2512.06738 [pdf, ps, other]
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation
Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2512.06736 [pdf, ps, other]
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2512.06726 [pdf, ps, other]
Title: The Role of Entropy in Visual Grounding: Analysis and Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[274]  arXiv:2512.06689 [pdf, ps, other]
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation
Comments: Accepted to ASRU 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[275]  arXiv:2512.06684 [pdf, ps, other]
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276]  arXiv:2512.06674 [pdf, ps, other]
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2512.06673 [pdf, ps, other]
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2512.06663 [pdf, ps, other]
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2512.06662 [pdf, ps, other]
Title: Personalized Image Descriptions from Attention Sequences
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2512.06657 [pdf, ps, other]
Title: TextMamba: Scene Text Detector with Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[281]  arXiv:2512.06642 [pdf, ps, other]
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution
Comments: 21 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[282]  arXiv:2512.06613 [pdf, ps, other]
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
Authors: Yueying Ke
Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2512.06612 [pdf, ps, other]
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.06598 [pdf, ps, other]
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain
Comments: 23 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2512.06581 [pdf, ps, other]
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.06575 [pdf, ps, other]
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
Authors: Fariza Dahes
Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287]  arXiv:2512.06565 [pdf, ps, other]
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation
Authors: Xiujin Liu
Comments: 1 figure, 2 tables, 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2512.06562 [pdf, ps, other]
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[289]  arXiv:2512.06560 [pdf, ps, other]
Title: Bridging spatial awareness and global context in medical image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2512.06531 [pdf, ps, other]
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[291]  arXiv:2512.06530 [pdf, ps, other]
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292]  arXiv:2512.06521 [pdf, ps, other]
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images
Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)
Comments: 31 pages + appendix
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293]  arXiv:2512.06504 [pdf, ps, other]
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[294]  arXiv:2512.06485 [pdf, ps, other]
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.06447 [pdf, ps, other]
Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.06438 [pdf, ps, other]
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2512.06434 [pdf, ps, other]
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening
Comments: 8 pages, 2 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298]  arXiv:2512.06426 [pdf, ps, other]
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition
Comments: 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299]  arXiv:2512.06424 [pdf, ps, other]
Title: DragMesh: Interactive 3D Generation Made Easy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2512.06422 [pdf, ps, other]
Title: A Perception CNN for Facial Expression Recognition
Comments: in IEEE Transactions on Image Processing (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2512.06421 [pdf, ps, other]
Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[302]  arXiv:2512.06400 [pdf, ps, other]
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement
Comments: Accepted and published in IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2512.06379 [pdf, ps, other]
Title: OCFER-Net: Recognizing Facial Expression in Online Learning System
Authors: Yi Huo, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.06377 [pdf, ps, other]
Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System
Authors: Yi Huo, Yun Ge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2512.06376 [pdf, ps, other]
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2512.06373 [pdf, ps, other]
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.06368 [pdf, ps, other]
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.06363 [pdf, ps, other]
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309]  arXiv:2512.06358 [pdf, ps, other]
Title: Rectifying Latent Space for Generative Single-Image Reflection Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2512.06353 [pdf, ps, other]
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search
Comments: Code and supplementary material can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2512.06345 [pdf, ps, other]
Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes
Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2512.06344 [pdf, ps, other]
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2512.06332 [pdf, ps, other]
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2512.06330 [pdf, ps, other]
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315]  arXiv:2512.06328 [pdf, ps, other]
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models
Comments: Accepted as an Oral presentation at AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2512.06306 [pdf, ps, other]
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317]  arXiv:2512.06290 [pdf, ps, other]
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Comments: 17 pages, 5 figures
Journal-ref: ICDAR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2512.06282 [pdf, ps, other]
Title: A Sleep Monitoring System Based on Audio, Video and Depth Information
Comments: Accepted at Computer Vision, Graphics and Image Processing (CVGIP 2013)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[319]  arXiv:2512.06281 [pdf, ps, other]
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320]  arXiv:2512.06276 [pdf, ps, other]
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[321]  arXiv:2512.06275 [pdf, ps, other]
Title: FacePhys: State of the Heart Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2512.06269 [pdf, ps, other]
Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting
Authors: Quan Tran, Tuan Dang
Comments: 10 pages
Journal-ref: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2512.06258 [pdf, ps, other]
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2512.06255 [pdf, ps, other]
Title: Language-driven Fine-grained Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2512.06251 [pdf, ps, other]
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2512.06232 [pdf, ps, other]
Title: Opinion: Learning Intuitive Physics May Require More than Visual Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[327]  arXiv:2512.06230 [pdf, ps, other]
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328]  arXiv:2512.06221 [pdf, ps, other]
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study
Authors: Alena Makarova
Comments: 15 pages, 13 figures. Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329]  arXiv:2512.06206 [pdf, ps, other]
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning
Comments: Published in the Journal of Machine Learning for Biomedical Imaging (MELBA): this https URL
Journal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330]  arXiv:2512.06190 [pdf, ps, other]
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[331]  arXiv:2512.06185 [pdf, ps, other]
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)
Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332]  arXiv:2512.06179 [pdf, ps, other]
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333]  arXiv:2512.06174 [pdf, ps, other]
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2512.06171 [pdf, ps, other]
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335]  arXiv:2512.06158 [pdf, ps, other]
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
Comments: 15 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336]  arXiv:2512.06105 [pdf, ps, other]
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation
Comments: AAAI-26-AIA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[337]  arXiv:2512.06103 [pdf, ps, other]
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection
Comments: Accepted in IEEE T-BIOM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338]  arXiv:2512.06096 [pdf, ps, other]
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2512.06080 [pdf, ps, other]
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light
Comments: SIGGRAPH Asia 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2512.06065 [pdf, ps, other]
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[341]  arXiv:2512.06058 [pdf, ps, other]
Title: Representation Learning for Point Cloud Understanding
Authors: Siming Yan
Comments: 181 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[342]  arXiv:2512.06032 [pdf, ps, other]
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[343]  arXiv:2512.06024 [pdf, ps, other]
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[344]  arXiv:2512.06020 [pdf, ps, other]
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation
Comments: Project Page: https://prefgen.github.io/
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[345]  arXiv:2512.06014 [pdf, ps, other]
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2512.06013 [pdf, ps, other]
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[347]  arXiv:2512.06012 [pdf, ps, other]
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[348]  arXiv:2512.06010 [pdf, other]
Title: Fast and Flexible Robustness Certificates for Semantic Segmentation
Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349]  arXiv:2512.06006 [pdf, ps, other]
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[350]  arXiv:2512.06003 [pdf, ps, other]
Title: PrunedCaps: A Case For Primary Capsules Discrimination
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351]  arXiv:2512.05996 [pdf, ps, other]
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting
Comments: 18 pages, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[352]  arXiv:2512.05993 [pdf, ps, other]
[353]  arXiv:2512.05991 [pdf, ps, other]
Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[354]  arXiv:2512.05988 [pdf, ps, other]
Title: VG3T: Visual Geometry Grounded Gaussian Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[355]  arXiv:2512.05987 [pdf, ps, other]
Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning
Authors: Chenyue Yu, Jianyu Yu
Comments: Accepted by ICCPR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[356]  arXiv:2512.05969 [pdf, ps, other]
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices
Authors: Hokin Deng
Comments: See results (this https URL) and code (this https URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[357]  arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[358]  arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[359]  arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[360]  arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[361]  arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
Authors: Nikita Gabdullin
Comments: 9 pages, 5 figures, 1 table, 4 equations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[362]  arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
Title: Human Geometry Distribution for 3D Animation Generation
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[363]  arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models
Comments: 23 pages, 8 figures, 3 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[364]  arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[365]  arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood
Comments: Accepted to WACV 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[366]  arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
Title: A Geometric Unification of Concept Learning with Concept Cones
Comments: 22 pages
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[367]  arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising
Comments: Asilomar Conference on Signals, Systems, and Computers 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[368]  arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[369]  arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[370]  arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search
Comments: This work will be submitted to the IEEE for possible publication
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[371]  arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning
Comments: Code: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[372]  arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[373]  arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[374]  arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[375]  arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Comments: Project page: this https URL
Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[376]  arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
Comments: 2025 NeurIPS Behavior Challenge 1st place solution
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[377]  arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
Title: Dynamic Visual SLAM using a General 3D Prior
Comments: 8 pages
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[378]  arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices
Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[379]  arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
Comments: FAME 2026 Technical Report
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[380]  arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[381]  arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[382]  arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[383]  arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning
Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[384]  arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
Authors: Xiao Li
Comments: in Chinese language
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[385]  arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[386]  arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
Title: Vector Quantization using Gaussian Variational Autoencoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[388]  arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[389]  arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
Title: Semantic Temporal Single-photon LiDAR
Comments: 14 pages, 5 figures. Comments are welcome
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[390]  arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
Comments: NeurIPS Black in AI workshop - 2022
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025

[391]  arXiv:2512.05965 [pdf, ps, other]
Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392]  arXiv:2512.05960 [pdf, ps, other]
Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393]  arXiv:2512.05941 [pdf, ps, other]
Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394]  arXiv:2512.05937 [pdf, ps, other]
Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception
Comments: 8 pages, 2 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395]  arXiv:2512.05936 [pdf, ps, other]
Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition
Comments: 8 pages, 8 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396]  arXiv:2512.05928 [pdf, ps, other]
Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition
Comments: 18 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397]  arXiv:2512.05927 [pdf, ps, other]
Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398]  arXiv:2512.05922 [pdf, ps, other]
Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation
Comments: Note: Khang Le and Anh Mai Vu contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399]  arXiv:2512.05920 [pdf, ps, other]
Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400]  arXiv:2512.05905 [pdf, ps, other]
Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401]  arXiv:2512.05866 [pdf, ps, other]
Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator
Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402]  arXiv:2512.05859 [pdf, ps, other]
Title: Edit-aware RAW Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403]  arXiv:2512.05853 [pdf, ps, other]
Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404]  arXiv:2512.05830 [pdf, ps, other]
Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning
Comments: 22 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405]  arXiv:2512.05814 [pdf, ps, other]
Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection
Comments: The code is already available on GitHub: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406]  arXiv:2512.05809 [pdf, ps, other]
Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling
Comments: Extended abstract at World Modeling Workshop 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407]  arXiv:2512.05802 [pdf, ps, other]
Title: Bring Your Dreams to Life: Continual Text-to-Video Customization
Comments: Accepted to AAAI2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408]  arXiv:2512.05783 [pdf, ps, other]
Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409]  arXiv:2512.05774 [pdf, ps, other]
Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410]  arXiv:2512.05762 [pdf, ps, other]
Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators
Comments: Accepted for WACV
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411]  arXiv:2512.05759 [pdf, ps, other]
Title: Label-Efficient Point Cloud Segmentation with Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412]  arXiv:2512.05754 [pdf, ps, other]
Title: USV: Unified Sparsification for Accelerating Video Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413]  arXiv:2512.05746 [pdf, ps, other]
Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414]  arXiv:2512.05740 [pdf, ps, other]
Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415]  arXiv:2512.05710 [pdf, ps, other]
Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416]  arXiv:2512.05698 [pdf, ps, other]
Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
Comments: The 40th Annual AAAI Conference on Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417]  arXiv:2512.05683 [pdf, ps, other]
Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418]  arXiv:2512.05674 [pdf, ps, other]
Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419]  arXiv:2512.05672 [pdf, ps, other]
Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420]  arXiv:2512.05669 [pdf, ps, other]
Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421]  arXiv:2512.05663 [pdf, ps, other]
Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422]  arXiv:2512.05651 [pdf, ps, other]
Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423]  arXiv:2512.05635 [pdf, ps, other]
Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424]  arXiv:2512.05613 [pdf, ps, other]
Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425]  arXiv:2512.05610 [pdf, ps, other]
Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections
Comments: 19 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426]  arXiv:2512.05597 [pdf, ps, other]
Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction
Comments: 10 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427]  arXiv:2512.05593 [pdf, ps, other]
Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer
Comments: Accepted to 3DV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428]  arXiv:2512.05571 [pdf, ps, other]
Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429]  arXiv:2512.05564 [pdf, ps, other]
Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430]  arXiv:2512.05557 [pdf, ps, other]
Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431]  arXiv:2512.05546 [pdf, ps, other]
Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models
Comments: 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432]  arXiv:2512.05539 [pdf, ps, other]
Title: Ideal Observer for Segmentation of Dead Leaves Images
Comments: 41 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433]  arXiv:2512.05529 [pdf, ps, other]
Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors
Comments: The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434]  arXiv:2512.05524 [pdf, ps, other]
Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435]  arXiv:2512.05515 [pdf, ps, other]
Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436]  arXiv:2512.05513 [pdf, ps, other]
Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437]  arXiv:2512.05511 [pdf, ps, other]
Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438]  arXiv:2512.05494 [pdf, ps, other]
Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439]  arXiv:2512.05492 [pdf, ps, other]
Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440]  arXiv:2512.05482 [pdf, ps, other]
Title: Concept-based Explainable Data Mining with VLM for 3D Detection
Authors: Mai Tsujimoto
Comments: 28 pages including appendix. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441]  arXiv:2512.05481 [pdf, ps, other]
Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442]  arXiv:2512.05478 [pdf, ps, other]
Title: EmoStyle: Emotion-Driven Image Stylization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443]  arXiv:2512.05468 [pdf, ps, other]
Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444]  arXiv:2512.05446 [pdf, ps, other]
Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445]  arXiv:2512.05422 [pdf, ps, other]
Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446]  arXiv:2512.05418 [pdf, ps, other]
Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447]  arXiv:2512.05415 [pdf, ps, other]
Title: Moving object detection from multi-depth images with an attention-enhanced CNN
Comments: 14 pages, 22 figures, submitted to PASJ
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448]  arXiv:2512.05412 [pdf, ps, other]
Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449]  arXiv:2512.05410 [pdf, ps, other]
Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450]  arXiv:2512.05398 [pdf, ps, other]
Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451]  arXiv:2512.05394 [pdf, ps, other]
Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452]  arXiv:2512.05391 [pdf, ps, other]
Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453]  arXiv:2512.05385 [pdf, ps, other]
Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454]  arXiv:2512.05362 [pdf, ps, other]
Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation
Comments: All code related to this paper can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455]  arXiv:2512.05359 [pdf, ps, other]
Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking
Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2512.05354 [pdf, ps, other]
Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457]  arXiv:2512.05343 [pdf, ps, other]
Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458]  arXiv:2512.05277 [pdf, ps, other]
Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459]  arXiv:2512.05272 [pdf, ps, other]
Title: Inferring Compositional 4D Scenes without Ever Seeing One
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460]  arXiv:2512.05268 [pdf, ps, other]
Title: CARD: Correlation Aware Restoration with Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461]  arXiv:2512.05259 [pdf, ps, other]
Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2512.05240 [pdf, ps, other]
Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463]  arXiv:2512.05209 [pdf, ps, other]
Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464]  arXiv:2512.05198 [pdf, ps, other]
Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465]  arXiv:2512.05172 [pdf, ps, other]
Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466]  arXiv:2512.05152 [pdf, ps, other]
Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models
Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467]  arXiv:2512.05150 [pdf, ps, other]
Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Comments: arxiv v0
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468]  arXiv:2512.05145 [pdf, ps, other]
Title: Self-Improving VLM Judges Without Human Annotations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469]  arXiv:2512.05140 [pdf, other]
Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation
Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)
Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470]  arXiv:2512.05139 [pdf, ps, other]
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471]  arXiv:2512.05137 [pdf, ps, other]
Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472]  arXiv:2512.05136 [pdf, ps, other]
Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473]  arXiv:2512.05134 [pdf, ps, other]
Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models
Authors: Zihao Wu
Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474]  arXiv:2512.05132 [pdf, ps, other]
Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475]  arXiv:2512.05131 [pdf, ps, other]
Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476]  arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]
Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Comments: Preprint
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477]  arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]
Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478]  arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]
Title: Physically-Based Simulation of Automotive LiDAR
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479]  arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]
Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma
Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)
Comments: 4 pages, 2 figures
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480]  arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]
Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation
Comments: This work has been submitted to the IEEE for possible publication
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481]  arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]
Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling
Comments: 11 pages, 6 figures. Code available at this https URL
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482]  arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]
Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality
Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483]  arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]
Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety
Comments: 8 pages, 3 figures, 1 table
Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484]  arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]
Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025

[485]  arXiv:2512.05115 [pdf, ps, other]
Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486]  arXiv:2512.05113 [pdf, ps, other]
Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting
Comments: WACV 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487]  arXiv:2512.05112 [pdf, ps, other]
Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488]  arXiv:2512.05111 [pdf, ps, other]
Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489]  arXiv:2512.05110 [pdf, ps, other]
Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490]  arXiv:2512.05106 [pdf, ps, other]
Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491]  arXiv:2512.05104 [pdf, ps, other]
Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492]  arXiv:2512.05098 [pdf, ps, other]
Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards
Authors: Yuan Gao, Jin Song
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493]  arXiv:2512.05091 [pdf, ps, other]
Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark
Comments: Technical Report; Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494]  arXiv:2512.05081 [pdf, ps, other]
Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495]  arXiv:2512.05079 [pdf, ps, other]
Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496]  arXiv:2512.05076 [pdf, ps, other]
Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497]  arXiv:2512.05060 [pdf, ps, other]
Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer
Comments: Code: this https URL, Webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498]  arXiv:2512.05044 [pdf, ps, other]
Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Comments: 18 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499]  arXiv:2512.05039 [pdf, ps, other]
Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding
Comments: Submitted for review CVPR-2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500]  arXiv:2512.05025 [pdf, ps, other]
Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501]  arXiv:2512.05021 [pdf, ps, other]
Title: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502]  arXiv:2512.05016 [pdf, ps, other]
Title: Generative Neural Video Compression via Video Diffusion Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503]  arXiv:2512.05006 [pdf, ps, other]
Title: Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects
Comments: conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504]  arXiv:2512.05000 [pdf, ps, other]
Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505]  arXiv:2512.04996 [pdf, ps, other]
Title: A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506]  arXiv:2512.04981 [pdf, ps, other]
Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507]  arXiv:2512.04970 [pdf, ps, other]
Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks
Comments: UniReps Workshop 2025, 12 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508]  arXiv:2512.04969 [pdf, ps, other]
Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[509]  arXiv:2512.04967 [pdf, ps, other]
Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[510]  arXiv:2512.04963 [pdf, ps, other]
Title: GeoPE: A Unified Geometric Positional Embedding for Structured Tensors
Authors: Yupu Yao, Bowen Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[511]  arXiv:2512.04952 [pdf, ps, other]
Title: FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[512]  arXiv:2512.04943 [pdf, ps, other]
Title: Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513]  arXiv:2512.04939 [pdf, ps, other]
Title: LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514]  arXiv:2512.04927 [pdf, ps, other]
Title: Virtually Unrolling the Herculaneum Papyri by Diffeomorphic Spiral Fitting
Authors: Paul Henderson
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515]  arXiv:2512.04926 [pdf, ps, other]
Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516]  arXiv:2512.04904 [pdf, ps, other]
Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517]  arXiv:2512.04890 [pdf, ps, other]
Title: Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518]  arXiv:2512.04888 [pdf, ps, other]
Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework
Comments: This manuscript was first submitted to Engineering (Elsevier journal). The preprint was posted to arXiv afterwards to facilitate open access and community feedback
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519]  arXiv:2512.04883 [pdf, ps, other]
Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520]  arXiv:2512.04875 [pdf, ps, other]
Title: SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521]  arXiv:2512.04862 [pdf, ps, other]
Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing
Comments: * Equal contribution. Minor figure corrections compared to the ICCV 2025 version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522]  arXiv:2512.04857 [pdf, ps, other]
Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523]  arXiv:2512.04837 [pdf, ps, other]
Title: A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524]  arXiv:2512.04832 [pdf, ps, other]
Title: Tokenizing Buildings: A Transformer for Layout Synthesis
Comments: 8 pages, 1 page References, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[525]  arXiv:2512.04830 [pdf, ps, other]
Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis
Comments: Novel View Synthesis, Driving Scene, Free Trajectory, Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526]  arXiv:2512.04821 [pdf, ps, other]
Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527]  arXiv:2512.04815 [pdf, ps, other]
Title: RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS
Comments: arXiv admin note: substantial text overlap with arXiv:2506.02751
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528]  arXiv:2512.04810 [pdf, ps, other]
Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529]  arXiv:2512.04786 [pdf, ps, other]
Title: LaFiTe: A Generative Latent Field for 3D Native Texturing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530]  arXiv:2512.04784 [pdf, ps, other]
Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531]  arXiv:2512.04761 [pdf, ps, other]
Title: Order Matters: 3D Shape Generation from Sequential VR Sketches
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532]  arXiv:2512.04734 [pdf, ps, other]
Title: MT-Depth: Multi-task Instance feature analysis for the Depth Completion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533]  arXiv:2512.04733 [pdf, ps, other]
Title: E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[534]  arXiv:2512.04728 [pdf, ps, other]
Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[535]  arXiv:2512.04699 [pdf, ps, other]
Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution
Comments: Accepted by TCSVT, 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536]  arXiv:2512.04686 [pdf, ps, other]
Title: Towards Cross-View Point Correspondence in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537]  arXiv:2512.04678 [pdf, ps, other]
Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538]  arXiv:2512.04677 [pdf, ps, other]
Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539]  arXiv:2512.04660 [pdf, ps, other]
Title: I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540]  arXiv:2512.04643 [pdf, ps, other]
Title: SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541]  arXiv:2512.04619 [pdf, ps, other]
Title: Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542]  arXiv:2512.04599 [pdf, ps, other]
Title: Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543]  arXiv:2512.04597 [pdf, ps, other]
Title: When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[544]  arXiv:2512.04585 [pdf, ps, other]
Title: SAM3-I: Segment Anything with Instructions
Comments: Preliminary results; work in progress
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545]  arXiv:2512.04581 [pdf, ps, other]
Title: Infrared UAV Target Tracking with Dynamic Feature Refinement and Global Contextual Attention Knowledge Distillation
Comments: Accepted by IEEE TMM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546]  arXiv:2512.04576 [pdf, ps, other]
Title: TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547]  arXiv:2512.04568 [pdf, ps, other]
Title: Prompt2Craft: Generating Functional Craft Assemblies with LLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548]  arXiv:2512.04564 [pdf, ps, other]
Title: Dataset creation for supervised deep learning-based analysis of microscopic images -- review of important considerations and recommendations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549]  arXiv:2512.04563 [pdf, ps, other]
Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550]  arXiv:2512.04554 [pdf, ps, other]
Title: Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551]  arXiv:2512.04542 [pdf, ps, other]
Title: Gaussian Entropy Fields: Driving Adaptive Sparsity in 3D Gaussian Optimization
Comments: 28 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[552]  arXiv:2512.04540 [pdf, ps, other]
Title: VideoMem: Enhancing Ultra-Long Video Understanding via Adaptive Memory Management
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553]  arXiv:2512.04537 [pdf, ps, other]
Title: X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554]  arXiv:2512.04536 [pdf, ps, other]
Title: Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[555]  arXiv:2512.04534 [pdf, ps, other]
Title: Refaçade: Editing Object with Given Reference Texture
Authors: Youze Huang (1), Penghui Ruan (2), Bojia Zi (3), Xianbiao Qi (4), Jianan Wang (5), Rong Xiao (4) ((1) University of Electronic Science and Technology of China, (2) The Hong Kong Polytechnic University, (3) The Chinese University of Hong Kong, (4) IntelliFusion Inc., (5) Astribot Inc.)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556]  arXiv:2512.04532 [pdf, ps, other]
Title: PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[557]  arXiv:2512.04528 [pdf, ps, other]
Title: Auto3R: Automated 3D Reconstruction and Scanning via Data-driven Uncertainty Quantification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[558]  arXiv:2512.04522 [pdf, ps, other]
Title: Identity Clue Refinement and Enhancement for Visible-Infrared Person Re-Identification
Comments: 14 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[559]  arXiv:2512.04521 [pdf, ps, other]
Title: WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism
Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[560]  arXiv:2512.04520 [pdf, ps, other]
Title: Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561]  arXiv:2512.04519 [pdf, ps, other]
Title: VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562]  arXiv:2512.04515 [pdf, ps, other]
Title: EgoLCD: Egocentric Video Generation with Long Context Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563]  arXiv:2512.04511 [pdf, ps, other]
Title: DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[564]  arXiv:2512.04504 [pdf, ps, other]
Title: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565]  arXiv:2512.04499 [pdf, ps, other]
Title: Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[566]  arXiv:2512.04496 [pdf, ps, other]
Title: Shift-Window Meets Dual Attention: A Multi-Model Architecture for Specular Highlight Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567]  arXiv:2512.04487 [pdf, ps, other]
Title: Controllable Long-term Motion Generation with Extended Joint Targets
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568]  arXiv:2512.04485 [pdf, ps, other]
Title: Not All Birds Look The Same: Identity-Preserving Generation For Birds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569]  arXiv:2512.04483 [pdf, ps, other]
Title: DeRA: Decoupled Representation Alignment for Video Tokenization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570]  arXiv:2512.04461 [pdf, ps, other]
Title: UniTS: Unified Time Series Generative Model for Remote Sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571]  arXiv:2512.04459 [pdf, ps, other]
Title: dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572]  arXiv:2512.04456 [pdf, ps, other]
Title: GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis
Comments: AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[573]  arXiv:2512.04451 [pdf, ps, other]
Title: StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574]  arXiv:2512.04441 [pdf, ps, other]
Title: MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575]  arXiv:2512.04426 [pdf, ps, other]
Title: Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576]  arXiv:2512.04425 [pdf, ps, other]
Title: Explainable Parkinson's Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[577]  arXiv:2512.04421 [pdf, ps, other]
Title: UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3D Scenes
Comments: 13 pages, 10 figures, submitted to CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[578]  arXiv:2512.04413 [pdf, ps, other]
Title: Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection
Comments: 12 pages, 8 figures, 11 tables
Journal-ref: IEEE Transactions on Geoscience and Remote Sensing 63 (2025) 1-11
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579]  arXiv:2512.04397 [pdf, ps, other]
Title: Performance Evaluation of Transfer Learning Based Medical Image Classification Techniques for Disease Detection
Journal-ref: 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Copenhagen, Denmark, 2025, pp. 1-5
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580]  arXiv:2512.04395 [pdf, ps, other]
Title: Fourier-Attentive Representation Learning: A Fourier-Guided Framework for Few-Shot Generalization in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581]  arXiv:2512.04390 [pdf, ps, other]
Title: FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring
Comments: 20 pages, 15 figures. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582]  arXiv:2512.04358 [pdf, ps, other]
Title: MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583]  arXiv:2512.04356 [pdf, ps, other]
Title: Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[584]  arXiv:2512.04331 [pdf, ps, other]
Title: Open Set Face Forgery Detection via Dual-Level Evidence Collection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585]  arXiv:2512.04329 [pdf, ps, other]
Title: A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[586]  arXiv:2512.04323 [pdf, ps, other]
Title: Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks
Comments: 17 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
[587]  arXiv:2512.04315 [pdf, ps, other]
Title: SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588]  arXiv:2512.04314 [pdf, ps, other]
Title: DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589]  arXiv:2512.04313 [pdf, ps, other]
Title: Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding
Comments: 16 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590]  arXiv:2512.04311 [pdf, ps, other]
Title: Real-time Cricket Sorting By Sex
Comments: 13 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[591]  arXiv:2512.04309 [pdf, ps, other]
Title: Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction
Comments: Submitted to CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[592]  arXiv:2512.04305 [pdf, ps, other]
Title: How (Mis)calibrated is Your Federated CLIP and What To Do About It?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[593]  arXiv:2512.04303 [pdf, ps, other]
Title: Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications
Comments: Accepted in 3DV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[594]  arXiv:2512.04284 [pdf, ps, other]
Title: Learning Single-Image Super-Resolution in the JPEG Compressed Domain
Comments: 7 pages, 4 figures, 2 tables, SEEDS Workshop, ICIP 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[595]  arXiv:2512.04283 [pdf, ps, other]
Title: Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[596]  arXiv:2512.04282 [pdf, ps, other]
Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[597]  arXiv:2512.04267 [pdf, ps, other]
Title: UniLight: A Unified Representation for Lighting
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598]  arXiv:2512.04248 [pdf, ps, other]
Title: MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[599]  arXiv:2512.04238 [pdf, ps, other]
Title: 6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600]  arXiv:2512.04222 [pdf, ps, other]
Title: ReasonX: MLLM-Guided Intrinsic Image Decomposition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[601]  arXiv:2512.04221 [pdf, ps, other]
Title: MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602]  arXiv:2512.04219 [pdf, ps, other]
Title: Generalized Event Partonomy Inference with Structured Hierarchical Predictive Learning
Comments: 16 pages, 7 figures, 3 tables. Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[603]  arXiv:2512.04187 [pdf, ps, other]
Title: OnSight Pathology: A real-time platform-agnostic computational pathology companion for histopathology
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[604]  arXiv:2512.04175 [pdf, ps, other]
Title: Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[605]  arXiv:2512.05117 (cross-list from cs.LG) [pdf, ps, other]
Title: The Universal Weight Subspace Hypothesis
Comments: 37 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606]  arXiv:2512.05116 (cross-list from cs.LG) [pdf, ps, other]
Title: Value Gradient Guidance for Flow Matching Alignment
Comments: Accepted at NeurIPS 2025; 26 pages, 20 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[607]  arXiv:2512.05114 (cross-list from cs.LG) [pdf, ps, other]
Title: Deep infant brain segmentation from multi-contrast MRI
Comments: 8 pages, 8 figures, 1 table, website at this https URL, presented at the 2025 IEEE Asilomar Conference on Signals, Systems, and Computers
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[608]  arXiv:2512.05103 (cross-list from cs.LG) [pdf, ps, other]
Title: TV2TV: A Unified Framework for Interleaved Language and Video Generation
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[609]  arXiv:2512.05094 (cross-list from cs.RO) [pdf, ps, other]
Title: From Generated Human Videos to Physically Plausible Robot Trajectories
Comments: For project website, see this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610]  arXiv:2512.04814 (cross-list from cs.SD) [pdf, ps, other]
Title: Shared Multi-modal Embedding Space for Face-Voice Association
Comments: Ranked 1st in Fame 2026 Challenge, ICASSP
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[611]  arXiv:2512.04763 (cross-list from cs.LG) [pdf, ps, other]
Title: MemLoRA: Distilling Expert Adapters for On-Device Memory Systems
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612]  arXiv:2512.04705 (cross-list from cs.CC) [pdf, ps, other]
Title: Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators
Comments: Submitted to IEEE Transactions on Emerging Topics in Computing
Subjects: Computational Complexity (cs.CC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
[613]  arXiv:2512.04625 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective
Comments: Accepted to IEEE TNNLS
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[614]  arXiv:2512.04556 (cross-list from cs.GR) [pdf, ps, other]
Title: Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex
Comments: 10 pages, 7 figures
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[615]  arXiv:2512.04464 (cross-list from cs.LG) [pdf, ps, other]
Title: Feature Engineering vs. Deep Learning for Automated Coin Grading: A Comparative Study on Saint-Gaudens Double Eagles
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[616]  arXiv:2512.04385 (cross-list from cs.LG) [pdf, ps, other]
Title: STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Grained Pollution Forecasting
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617]  arXiv:2512.04264 (cross-list from cs.LG) [pdf, ps, other]
Title: Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[618]  arXiv:2512.04092 (cross-list from physics.soc-ph) [pdf, ps, other]
Title: The changing surface of the world's roads
Subjects: Physics and Society (physics.soc-ph); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[619]  arXiv:2512.04087 (cross-list from q-bio.NC) [pdf, ps, other]
Title: Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o
Authors: Sui He, Shenbin Qian
Subjects: Neurons and Cognition (q-bio.NC); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Thu, 4 Dec 2025

[620]  arXiv:2512.04085 [pdf, ps, other]
Title: Unique Lives, Shared World: Learning from Single-Life Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621]  arXiv:2512.04084 [pdf, ps, other]
Title: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622]  arXiv:2512.04082 [pdf, ps, other]
Title: PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[623]  arXiv:2512.04069 [pdf, ps, other]
Title: SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[624]  arXiv:2512.04048 [pdf, ps, other]
Title: Stable Signer: Hierarchical Sign Language Generative Model
Comments: 12 pages, 7 figures. More demos at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY)
[625]  arXiv:2512.04040 [pdf, ps, other]
Title: RELIC: Interactive Video World Model with Long-Horizon Memory
Comments: 22 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[626]  arXiv:2512.04039 [pdf, ps, other]
Title: Fast & Efficient Normalizing Flows and Applications of Image Generative Models
Authors: Sandeep Nagar
Comments: PhD Thesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[627]  arXiv:2512.04025 [pdf, ps, other]
Title: PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation
Comments: Tech report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[628]  arXiv:2512.04021 [pdf, ps, other]
Title: C3G: Learning Compact 3D Representations with 2K Gaussians
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629]  arXiv:2512.04019 [pdf, ps, other]
Title: Ultra-lightweight Neural Video Representation Compression
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[630]  arXiv:2512.04015 [pdf, ps, other]
Title: Learning Group Actions In Disentangled Latent Image Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[631]  arXiv:2512.04012 [pdf, ps, other]
Title: Emergent Outlier View Rejection in Visual Geometry Grounded Transformers
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[632]  arXiv:2512.04007 [pdf, ps, other]
Title: On the Temporality for Sketch Representation Learning
Comments: Preprint submitted to Pattern Recognition Letters
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[633]  arXiv:2512.04000 [pdf, ps, other]
Title: Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[634]  arXiv:2512.03996 [pdf, ps, other]
Title: Highly Efficient Test-Time Scaling for T2I Diffusion Models with Text Embedding Perturbation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[635]  arXiv:2512.03992 [pdf, ps, other]
Title: DIQ-H: Evaluating Hallucination Persistence in VLMs Under Temporal Visual Degradation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[636]  arXiv:2512.03981 [pdf, ps, other]
Title: DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[637]  arXiv:2512.03979 [pdf, ps, other]
Title: BlurDM: A Blur Diffusion Model for Image Deblurring
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[638]  arXiv:2512.03964 [pdf, ps, other]
Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization
Comments: 17 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[639]  arXiv:2512.03963 [pdf, ps, other]
Title: TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[640]  arXiv:2512.03939 [pdf, ps, other]
Title: MUT3R: Motion-aware Updating Transformer for Dynamic 3D Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[641]  arXiv:2512.03932 [pdf, ps, other]
Title: Beyond the Ground Truth: Enhanced Supervision for Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[642]  arXiv:2512.03918 [pdf, ps, other]
Title: UniMo: Unifying 2D Video and 3D Human Motion with an Autoregressive Framework
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[643]  arXiv:2512.03905 [pdf, ps, other]
Title: Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence
Comments: Code: this https URL, Project: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[644]  arXiv:2512.03883 [pdf, ps, other]
Title: Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy
Comments: 6 pages, 5 figures, 1 table, submitted to ISBI conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[645]  arXiv:2512.03869 [pdf, ps, other]
Title: An Automated Framework for Large-Scale Graph-Based Cerebrovascular Analysis
Comments: Submitted to ISBI 2026. 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[646]  arXiv:2512.03862 [pdf, ps, other]
Title: Diminishing Returns in Self-Supervised Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[647]  arXiv:2512.03854 [pdf, ps, other]
Title: Prostate biopsy whole slide image dataset from an underrepresented Middle Eastern population
Comments: 13 pages, 2 figures and 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[648]  arXiv:2512.03852 [pdf, ps, other]
Title: Traffic Image Restoration under Adverse Weather via Frequency-Aware Mamba
Comments: 12 pages, 13 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[649]  arXiv:2512.03848 [pdf, ps, other]
Title: PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[650]  arXiv:2512.03844 [pdf, ps, other]
Title: CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation
Comments: 34 pages, 24 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[651]  arXiv:2512.03837 [pdf, ps, other]
Title: Heatmap Pooling Network for Action Recognition from RGB Videos
Comments: Final Version of IEEE Transactions on Pattern Analysis and Machine Intelligence
Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[652]  arXiv:2512.03834 [pdf, ps, other]
Title: Lean Unet: A Compact Model for Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[653]  arXiv:2512.03827 [pdf, ps, other]
Title: A Robust Camera-based Method for Breath Rate Measurement
Comments: 9 pages, 4 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[654]  arXiv:2512.03817 [pdf, ps, other]
Title: HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[655]  arXiv:2512.03796 [pdf, ps, other]
Title: LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[656]  arXiv:2512.03794 [pdf, ps, other]
Title: AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition
Comments: 15 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[657]  arXiv:2512.03751 [pdf, ps, other]
Title: Research on Brain Tumor Classification Method Based on Improved ResNet34 Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[658]  arXiv:2512.03749 [pdf, ps, other]
Title: Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[659]  arXiv:2512.03746 [pdf, ps, other]
Title: Thinking with Programming Vision: Towards a Unified View for Thinking with Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[660]  arXiv:2512.03745 [pdf, ps, other]
Title: Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661]  arXiv:2512.03730 [pdf, ps, other]
Title: Out-of-the-box: Black-box Causal Attacks on Object Detectors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[662]  arXiv:2512.03724 [pdf, ps, other]
Title: PosA-VLA: Enhancing Action Generation via Pose-Conditioned Anchor Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[663]  arXiv:2512.03715 [pdf, ps, other]
Title: DINO-RotateMatch: A Rotation-Aware Deep Framework for Robust Image Matching in Large-Scale 3D Reconstruction
Comments: 9 pages, 5 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664]  arXiv:2512.03701 [pdf, ps, other]
Title: Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665]  arXiv:2512.03687 [pdf, ps, other]
Title: Active Visual Perception: Opportunities and Challenges
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[666]  arXiv:2512.03683 [pdf, ps, other]
Title: GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[667]  arXiv:2512.03673 [pdf, ps, other]
Title: ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[668]  arXiv:2512.03667 [pdf, ps, other]
Title: Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669]  arXiv:2512.03666 [pdf, ps, other]
Title: ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos
Comments: 26 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[670]  arXiv:2512.03663 [pdf, ps, other]
Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification
Authors: Salim Khazem
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[671]  arXiv:2512.03643 [pdf, ps, other]
Title: Optical Context Compression Is Just (Bad) Autoencoding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[672]  arXiv:2512.03640 [pdf, ps, other]
Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms
Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[673]  arXiv:2512.03625 [pdf, ps, other]
Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674]  arXiv:2512.03621 [pdf, ps, other]
Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675]  arXiv:2512.03619 [pdf, ps, other]
Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676]  arXiv:2512.03601 [pdf, ps, other]
Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
Comments: Accepted to NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677]  arXiv:2512.03598 [pdf, ps, other]
Title: Memory-Guided Point Cloud Completion for Dental Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678]  arXiv:2512.03597 [pdf, ps, other]
Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation
Comments: 6 pages, 4 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[679]  arXiv:2512.03593 [pdf, ps, other]
Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[680]  arXiv:2512.03592 [pdf, ps, other]
Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding
Authors: Guang Yang, Lei Fan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681]  arXiv:2512.03590 [pdf, ps, other]
Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682]  arXiv:2512.03580 [pdf, ps, other]
Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[683]  arXiv:2512.03577 [pdf, ps, other]
Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning
Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684]  arXiv:2512.03575 [pdf, ps, other]
Title: UniComp: Rethinking Video Compression Through Informational Uniqueness
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[685]  arXiv:2512.03574 [pdf, ps, other]
Title: Global-Local Aware Scene Text Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686]  arXiv:2512.03566 [pdf, ps, other]
Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models
Comments: Accepted by ACM MM Asia 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[687]  arXiv:2512.03558 [pdf, ps, other]
Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding
Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[688]  arXiv:2512.03553 [pdf, ps, other]
Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching
Comments: Accepted at KDD 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[689]  arXiv:2512.03542 [pdf, ps, other]
Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[690]  arXiv:2512.03540 [pdf, ps, other]
Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation
Comments: Accepted by ACM Multimedia 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[691]  arXiv:2512.03534 [pdf, ps, other]
Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Comments: Visualizations are available at the website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[692]  arXiv:2512.03532 [pdf, ps, other]
Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693]  arXiv:2512.03520 [pdf, ps, other]
Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
Comments: 15 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694]  arXiv:2512.03510 [pdf, ps, other]
Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[695]  arXiv:2512.03509 [pdf, ps, other]
Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696]  arXiv:2512.03508 [pdf, ps, other]
Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation
Comments: ICCV 2025 (poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[697]  arXiv:2512.03500 [pdf, ps, other]
Title: EEA: Exploration-Exploitation Agent for Long Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698]  arXiv:2512.03499 [pdf, ps, other]
Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[699]  arXiv:2512.03479 [pdf, ps, other]
Title: Towards Object-centric Understanding for Instructional Videos
Authors: Wenliang Guo, Yu Kong
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[700]  arXiv:2512.03477 [pdf, ps, other]
Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis
Comments: 10 pages, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[701]  arXiv:2512.03474 [pdf, ps, other]
Title: Procedural Mistake Detection via Action Effect Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[702]  arXiv:2512.03470 [pdf, ps, other]
Title: Difference Decomposition Networks for Infrared Small Target Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703]  arXiv:2512.03463 [pdf, ps, other]
Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[704]  arXiv:2512.03454 [pdf, ps, other]
Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[705]  arXiv:2512.03453 [pdf, ps, other]
Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[706]  arXiv:2512.03451 [pdf, ps, other]
Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[707]  arXiv:2512.03450 [pdf, ps, other]
Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[708]  arXiv:2512.03449 [src]
Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis
Authors: Tongxu Zhang
Comments: The manuscript represents only a preliminary and substantially incomplete exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore, this version is withdrawn and should not be cited.
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[709]  arXiv:2512.03445 [pdf, ps, other]
Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation
Comments: 10 pages. Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[710]  arXiv:2512.03430 [pdf, ps, other]
Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features
Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[711]  arXiv:2512.03427 [pdf, ps, other]
Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[712]  arXiv:2512.03424 [pdf, ps, other]
Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[713]  arXiv:2512.03418 [pdf, ps, other]
Title: YOLOA: Real-Time Affordance Detection via LLM Adapter
Comments: 13 pages, 9 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[714]  arXiv:2512.03405 [pdf, ps, other]
Title: ViDiC: Video Difference Captioning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[715]  arXiv:2512.03404 [pdf, ps, other]
Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[716]  arXiv:2512.03370 [pdf, ps, other]
Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[717]  arXiv:2512.03369 [pdf, ps, other]
Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[718]  arXiv:2512.03359 [pdf, ps, other]
Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[719]  arXiv:2512.03350 [pdf, ps, other]
Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[720]  arXiv:2512.03346 [pdf, ps, other]
Title: Hierarchical Attention for Sparse Volumetric Anomaly Detection in Subclinical Keratoconus
Comments: 16 pages, 7 figures, 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[721]  arXiv:2512.03345 [pdf, ps, other]
Title: HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[722]  arXiv:2512.03339 [pdf, ps, other]
Title: ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography
Comments: 11 pages, Accepted in IMIMIC Workshop at MICCAI 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[723]  arXiv:2512.03335 [pdf, ps, other]
Title: Step-by-step Layered Design Generation
Journal-ref: AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[724]  arXiv:2512.03317 [pdf, ps, other]
Title: NavMapFusion: Diffusion-based Fusion of Navigation Maps for Online Vectorized HD Map Construction
Comments: Accepted to 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[725]  arXiv:2512.03284 [pdf, ps, other]
Title: SpatialReasoner: Active Perception for Large-Scale 3D Scene Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[726]  arXiv:2512.03257 [pdf, ps, other]
Title: PyroFocus: A Deep Learning Approach to Real-Time Wildfire Detection in Multispectral Remote Sensing Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[727]  arXiv:2512.03247 [pdf, ps, other]
Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
Comments: Published in the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[728]  arXiv:2512.03245 [pdf, ps, other]
Title: 2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[729]  arXiv:2512.03237 [pdf, ps, other]
Title: LLM-Guided Material Inference for 3D Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[730]  arXiv:2512.03233 [pdf, ps, other]
Title: Object Counting with GPT-4o and GPT-5: A Comparative Study
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[731]  arXiv:2512.03210 [pdf, ps, other]
Title: Flux4D: Flow-based Unsupervised 4D Reconstruction
Comments: NeurIPS 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[732]  arXiv:2512.03199 [pdf, ps, other]
Title: Does Head Pose Correction Improve Biometric Facial Recognition?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[733]  arXiv:2512.03182 [pdf, ps, other]
Title: Drainage: A Unifying Framework for Addressing Class Uncertainty
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[734]  arXiv:2512.03126 [pdf, ps, other]
Title: Hierarchical Process Reward Models are Symbolic Vision Learners
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[735]  arXiv:2512.04076 (cross-list from cs.GR) [pdf, ps, other]
Title: Radiance Meshes for Volumetric Reconstruction
Comments: Website: half-potato.gitlab.io/rm
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[736]  arXiv:2512.04032 (cross-list from cs.CL) [pdf, ps, other]
Title: Jina-VLM: Small Multilingual Vision Language Model
Comments: 18 pages, 1-7 main content, 13-18 appendix for tables and dataset
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[737]  arXiv:2512.03995 (cross-list from cs.RO) [pdf, ps, other]
Title: Artificial Microsaccade Compensation: Stable Vision for an Ornithopter
Comments: 29 pages, 5 figures, 2 tables, under review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[738]  arXiv:2512.03962 (cross-list from eess.IV) [pdf, ps, other]
Title: Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction
Comments: 6 pages, 8 figures, 2025 Asilomar Conference on Signals, Systems, and Computers. Code is available at github.com/evanbell02/Tada-DIP/
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[739]  arXiv:2512.03656 (cross-list from cs.LG) [pdf, ps, other]
Title: Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[740]  arXiv:2512.03556 (cross-list from cs.RO) [pdf, ps, other]
Title: RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[741]  arXiv:2512.03522 (cross-list from cs.RO) [pdf, ps, other]
Title: MSG-Loc: Multi-Label Likelihood-based Semantic Graph Matching for Object-Level Global Localization
Comments: Accepted in IEEE Robotics and Automation Letters (2025)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[742]  arXiv:2512.03514 (cross-list from cs.IR) [pdf, ps, other]
Title: M3DR: Towards Universal Multilingual Multimodal Document Retrieval
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[743]  arXiv:2512.03422 (cross-list from cs.RO) [pdf, ps, other]
Title: What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[744]  arXiv:2512.03216 (cross-list from physics.ins-det) [pdf, ps, other]
Title: Kaleidoscopic Scintillation Event Imaging
Subjects: Instrumentation and Detectors (physics.ins-det); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[745]  arXiv:2512.03173 (cross-list from cs.CY) [pdf, ps, other]
Title: Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping
Journal-ref: AAAI 2026 Social Impact Track
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[746]  arXiv:2512.03166 (cross-list from cs.RO) [pdf, ps, other]
Title: Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[747]  arXiv:2512.03111 (cross-list from q-bio.GN) [pdf, ps, other]
Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer
Comments: Accepted by AAAI 2026
Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[748]  arXiv:2512.03054 (cross-list from cs.LG) [pdf, ps, other]
Title: Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research
Comments: 22 pages, 13 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)
[749]  arXiv:2512.03052 (cross-list from cs.GR) [pdf, ps, other]
Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale
Comments: Technical Report
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
