Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 378

Wed, 10 Dec 2025 (continued, showing last 110 of 131 entries)

[379]  arXiv:2512.08738 [pdf, ps, other]
Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture
Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[380]  arXiv:2512.08733 [pdf, ps, other]
Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[381]  arXiv:2512.08730 [pdf, ps, other]
Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[382]  arXiv:2512.08700 [pdf, ps, other]
Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth
Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[383]  arXiv:2512.08697 [pdf, ps, other]
Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[384]  arXiv:2512.08673 [pdf, ps, other]
Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
Comments: 16 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[385]  arXiv:2512.08648 [pdf, ps, other]
Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
Comments: 19 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[386]  arXiv:2512.08647 [pdf, ps, other]
Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition
Authors: Keito Inoshita
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2512.08645 [pdf, ps, other]
Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation
Comments: 19 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[388]  arXiv:2512.08639 [pdf, ps, other]
Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
Comments: Under Review, 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[389]  arXiv:2512.08627 [pdf, ps, other]
Title: Trajectory Densification and Depth from Perspective-based Blur
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[390]  arXiv:2512.08625 [pdf, ps, other]
Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics
Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[391]  arXiv:2512.08606 [pdf, ps, other]
Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI 2026, poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[392]  arXiv:2512.08589 [pdf, ps, other]
Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images
Comments: 8 pages, 10 figures, 4 tables, 20 references. Date of Conference: 13-14 June 2025. Date Added to IEEE Xplore: 10 July 2025. Electronic ISBN: 979-8-3315-0969-9. Print on Demand (PoD) ISBN: 979-8-3315-0970-5. DOI: 10.1109/AICCONF64766.2025.11064260. Conference Location: Prague, Czech Republic. Online Access: this https URL
Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[393]  arXiv:2512.08577 [pdf, ps, other]
Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[394]  arXiv:2512.08572 [pdf, ps, other]
Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395]  arXiv:2512.08569 [pdf, ps, other]
Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[396]  arXiv:2512.08564 [pdf, ps, other]
Title: Modular Neural Image Signal Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397]  arXiv:2512.08560 [pdf, ps, other]
Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[398]  arXiv:2512.08557 [pdf, ps, other]
Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds
Comments: 22 Pages, 26 Figures, This work has been submitted to the IEEE Sensors Journal for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399]  arXiv:2512.08547 [pdf, ps, other]
Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[400]  arXiv:2512.08542 [pdf, ps, other]
Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
[401]  arXiv:2512.08537 [pdf, ps, other]
Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402]  arXiv:2512.08535 [pdf, ps, other]
Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403]  arXiv:2512.08534 [pdf, ps, other]
Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404]  arXiv:2512.08529 [pdf, ps, other]
Title: MVP: Multiple View Prediction Improves GUI Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[405]  arXiv:2512.08524 [pdf, ps, other]
Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization
Comments: Accepted at the Winter Conference on Applications of Computer Vision (WACV) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[406]  arXiv:2512.08511 [pdf, ps, other]
Title: Thinking with Images via Self-Calling Agent
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[407]  arXiv:2512.08506 [pdf, ps, other]
Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408]  arXiv:2512.08505 [pdf, ps, other]
Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[409]  arXiv:2512.08503 [pdf, ps, other]
Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[410]  arXiv:2512.08498 [pdf, ps, other]
Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[411]  arXiv:2512.08486 [pdf, ps, other]
Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[412]  arXiv:2512.08478 [pdf, ps, other]
Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[413]  arXiv:2512.08477 [pdf, ps, other]
Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[414]  arXiv:2512.08467 [pdf, ps, other]
Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415]  arXiv:2512.08445 [pdf, ps, other]
Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[416]  arXiv:2512.08441 [pdf, ps, other]
Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417]  arXiv:2512.08439 [pdf, ps, other]
Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[418]  arXiv:2512.08430 [pdf, ps, other]
Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking
Comments: Accepted to WACV 2026. Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[419]  arXiv:2512.08410 [pdf, ps, other]
Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[420]  arXiv:2512.08406 [pdf, ps, other]
Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421]  arXiv:2512.08400 [pdf, ps, other]
Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422]  arXiv:2512.08397 [pdf, ps, other]
Title: Detection of Digital Facial Retouching utilizing Face Beauty Information
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423]  arXiv:2512.08378 [pdf, ps, other]
Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions
Comments: The paper has been accepted and published in IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424]  arXiv:2512.08374 [pdf, ps, other]
Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425]  arXiv:2512.08362 [pdf, ps, other]
Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation
Comments: Accepted for the main track at MobiSec 2024 (not published in the proceedings)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426]  arXiv:2512.08358 [pdf, ps, other]
Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Comments: Accepted by NeurIPS 2025. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427]  arXiv:2512.08337 [pdf, ps, other]
Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428]  arXiv:2512.08334 [pdf, ps, other]
Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429]  arXiv:2512.08331 [pdf, ps, other]
Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430]  arXiv:2512.08330 [pdf, ps, other]
Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models
Comments: Accepted by IJCNN 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[431]  arXiv:2512.08329 [pdf, ps, other]
Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[432]  arXiv:2512.08327 [pdf, ps, other]
Title: Low Rank Support Quaternion Matrix Machine
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[433]  arXiv:2512.08325 [pdf, ps, other]
Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[434]  arXiv:2512.08323 [pdf, ps, other]
Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435]  arXiv:2512.08317 [pdf, ps, other]
Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[436]  arXiv:2512.08309 [pdf, ps, other]
Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation
Authors: Alexander Goslin
Comments: Project website: this https URL. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
[437]  arXiv:2512.08294 [pdf, ps, other]
Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438]  arXiv:2512.08282 [pdf, ps, other]
Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[439]  arXiv:2512.08269 [pdf, ps, other]
Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Comments: 21 pages, project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440]  arXiv:2512.08262 [pdf, ps, other]
Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[441]  arXiv:2512.08254 [pdf, ps, other]
Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors
Comments: 10 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[442]  arXiv:2512.08253 [pdf, ps, other]
Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443]  arXiv:2512.08247 [pdf, ps, other]
Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
Comments: AAAI-26
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444]  arXiv:2512.08243 [pdf, ps, other]
Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
Comments: 26 Pages, 10 Figures, 4 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[445]  arXiv:2512.08240 [pdf, ps, other]
Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[446]  arXiv:2512.08237 [pdf, ps, other]
Title: FastBEV++: Fast by Algorithm, Deployable by Design
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447]  arXiv:2512.08229 [pdf, ps, other]
Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[448]  arXiv:2512.08228 [pdf, ps, other]
Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[449]  arXiv:2512.08227 [pdf, ps, other]
Title: New VVC profiles targeting Feature Coding for Machines
Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450]  arXiv:2512.08223 [pdf, ps, other]
Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451]  arXiv:2512.08221 [pdf, ps, other]
Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
Comments: 16 pages, 12 figures, 7 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452]  arXiv:2512.08215 [pdf, ps, other]
Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453]  arXiv:2512.08198 [pdf, ps, other]
Title: Animal Re-Identification on Microcontrollers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454]  arXiv:2512.08180 [pdf, ps, other]
Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[455]  arXiv:2512.08163 [pdf, ps, other]
Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Comments: 22 pages, 12 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2512.08161 [pdf, ps, other]
Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[457]  arXiv:2512.08135 [pdf, ps, other]
Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[458]  arXiv:2512.08075 [pdf, ps, other]
Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[459]  arXiv:2512.08048 [pdf, ps, other]
Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
Comments: ongoing work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460]  arXiv:2512.08042 [pdf, ps, other]
Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461]  arXiv:2512.08040 [pdf, ps, other]
Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2512.08038 [pdf, ps, other]
Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
Comments: 20 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463]  arXiv:2512.08016 [pdf, ps, other]
Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[464]  arXiv:2512.07984 [pdf, ps, other]
Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
Comments: 13 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[465]  arXiv:2512.07951 [pdf, ps, other]
Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[466]  arXiv:2512.07925 [pdf, ps, other]
Title: Near-real time fires detection using satellite imagery in Sudan conflict
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[467]  arXiv:2512.07838 [pdf, ps, other]
Title: Detection of Cyberbullying in GIF using AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[468]  arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]
Title: Multi-domain performance analysis with scores tailored to user preferences
Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[469]  arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]
Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[470]  arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]
Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
Comments: 22 pages, 2 tables, 9 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[471]  arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]
Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[472]  arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]
Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
Authors: Ali Sakour
Comments: 13 pages, 5 figures. Code available at: this https URL
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[473]  arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]
Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion
Comments: Submitted to GEOPHYSICS
Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
[474]  arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]
Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation
Comments: Published and presented at the 3rd Workshop on Human-Centric Multilateral Teleoperation at ICRA 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[475]  arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]
Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[476]  arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]
Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model
Comments: Website at this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477]  arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]
Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478]  arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]
Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
Authors: Zheng Ding, Weirui Ye
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[479]  arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]
Title: FlowSteer: Conditioning Flow Field for Consistent Image Restoration
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[480]  arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]
Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[481]  arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]
Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[482]  arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]
Title: DIJIT: A Robotic Head for an Active Observer
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[483]  arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]
Title: CIP-Net: Continual Interpretable Prototype-based Network
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[484]  arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]
Title: VLD: Visual Language Goal Distance for Reinforcement Learning Navigation
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[485]  arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]
Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization
Comments: 8 pages, submitted for review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[486]  arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]
Title: GSPN-2: Efficient Parallel Sequence Modeling
Comments: NeurIPS 2025
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[487]  arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]
Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[488]  arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]
Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin Algorithm
Comments: Submitted to Magnetic Resonance in Medicine
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)

Tue, 9 Dec 2025

[489]  arXiv:2512.07834 [pdf, ps, other]
Title: Voxify3D: Pixel Art Meets Volumetric Rendering
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[490]  arXiv:2512.07833 [pdf, ps, other]
Title: Relational Visual Similarity
Comments: Project page, data, and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[491]  arXiv:2512.07831 [pdf, ps, other]
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Comments: Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492]  arXiv:2512.07829 [pdf, ps, other]
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493]  arXiv:2512.07826 [pdf, ps, other]
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
Comments: 38 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494]  arXiv:2512.07821 [pdf, ps, other]
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[495]  arXiv:2512.07807 [pdf, ps, other]
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes
Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[496]  arXiv:2512.07806 [pdf, ps, other]
Title: Multi-view Pyramid Transformer: Look Coarser to See Broader
Comments: Project page: see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497]  arXiv:2512.07802 [pdf, ps, other]
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498]  arXiv:2512.07778 [pdf, ps, other]
Title: Distribution Matching Variational AutoEncoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499]  arXiv:2512.07776 [pdf, ps, other]
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500]  arXiv:2512.07760 [pdf, ps, other]
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501]  arXiv:2512.07756 [pdf, ps, other]
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[502]  arXiv:2512.07747 [pdf, ps, other]
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503]  arXiv:2512.07745 [pdf, ps, other]
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504]  arXiv:2512.07738 [pdf, ps, other]
Title: HLTCOE Evaluation Team at TREC 2025: VQA Track
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[505]  arXiv:2512.07733 [pdf, ps, other]
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506]  arXiv:2512.07730 [pdf, ps, other]
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[507]  arXiv:2512.07729 [pdf, ps, other]
Title: Improving action classification with brain-inspired deep networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[508]  arXiv:2512.07720 [pdf, ps, other]
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[509]  arXiv:2512.07712 [pdf, ps, other]
Title: UnCageNet: Tracking and Pose Estimation of Caged Animal
Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[510]  arXiv:2512.07703 [pdf, ps, other]
Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[511]  arXiv:2512.07702 [pdf, ps, other]
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[512]  arXiv:2512.07698 [pdf, ps, other]
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[513]  arXiv:2512.07674 [pdf, ps, other]
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[514]  arXiv:2512.07668 [pdf, ps, other]
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515]  arXiv:2512.07661 [pdf, ps, other]
Title: Optimization-Guided Diffusion for Interactive Scene Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516]  arXiv:2512.07652 [pdf, ps, other]
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517]  arXiv:2512.07651 [pdf, ps, other]
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518]  arXiv:2512.07628 [pdf, ps, other]
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519]  arXiv:2512.07606 [pdf, ps, other]
Title: Decomposition Sampling for Efficient Region Annotations in Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520]  arXiv:2512.07599 [pdf, ps, other]
Title: Online Segment Any 3D Thing as Instance Tracking
Comments: NeurIPS 2025, Code is at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521]  arXiv:2512.07596 [pdf, ps, other]
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[522]  arXiv:2512.07590 [pdf, ps, other]
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523]  arXiv:2512.07584 [pdf, ps, other]
Title: LongCat-Image Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524]  arXiv:2512.07580 [pdf, ps, other]
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[525]  arXiv:2512.07568 [pdf, ps, other]
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[526]  arXiv:2512.07564 [pdf, ps, other]
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL
Journal-ref: The 4th National and International Academic Conference Celebrating the 20th Anniversary of Rajapruk University (2026)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[527]  arXiv:2512.07527 [pdf, ps, other]
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[528]  arXiv:2512.07514 [pdf, ps, other]
Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529]  arXiv:2512.07504 [pdf, ps, other]
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530]  arXiv:2512.07503 [pdf, ps, other]
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531]  arXiv:2512.07500 [pdf, ps, other]
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532]  arXiv:2512.07498 [pdf, ps, other]
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior
Comments: 16 pages (including appendix)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533]  arXiv:2512.07480 [pdf, ps, other]
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[534]  arXiv:2512.07469 [pdf, ps, other]
Title: Unified Video Editing with Temporal Reasoner
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[535]  arXiv:2512.07426 [pdf, ps, other]
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing
Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[536]  arXiv:2512.07415 [pdf, ps, other]
Title: Data-driven Exploration of Mobility Interaction Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[537]  arXiv:2512.07410 [pdf, ps, other]
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538]  arXiv:2512.07394 [pdf, ps, other]
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video
Comments: webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539]  arXiv:2512.07391 [pdf, ps, other]
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540]  arXiv:2512.07385 [pdf, ps, other]
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[541]  arXiv:2512.07383 [pdf, ps, other]
Title: LogicCBMs: Logic-Enhanced Concept-Based Learning
Comments: 18 pages, 19 figures, WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542]  arXiv:2512.07381 [pdf, ps, other]
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543]  arXiv:2512.07379 [pdf, ps, other]
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency
Comments: 22 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[544]  arXiv:2512.07360 [pdf, ps, other]
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[545]  arXiv:2512.07351 [pdf, ps, other]
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[546]  arXiv:2512.07348 [pdf, ps, other]
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547]  arXiv:2512.07345 [pdf, ps, other]
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting
Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548]  arXiv:2512.07338 [pdf, ps, other]
Title: Generalized Referring Expression Segmentation on Aerial Photos
Comments: Submitted to IEEE J-STARS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549]  arXiv:2512.07331 [pdf, ps, other]
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers
Authors: Kanishk Awadhiya
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550]  arXiv:2512.07328 [pdf, ps, other]
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[551]  arXiv:2512.07305 [pdf, ps, other]
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[552]  arXiv:2512.07302 [pdf, ps, other]
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[553]  arXiv:2512.07276 [pdf, ps, other]
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554]  arXiv:2512.07275 [pdf, ps, other]
Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation
Comments: The paper has been accepted by BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[555]  arXiv:2512.07273 [pdf, ps, other]
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556]  arXiv:2512.07269 [pdf, ps, other]
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[557]  arXiv:2512.07253 [pdf, ps, other]
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Comments: 18 pages, 8 figures, and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[558]  arXiv:2512.07251 [pdf, ps, other]
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[559]  arXiv:2512.07247 [pdf, ps, other]
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing
Comments: 40 pages, 34 figures, 18 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[560]  arXiv:2512.07245 [pdf, ps, other]
Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features
Comments: 11+6 pages, 8 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561]  arXiv:2512.07241 [pdf, ps, other]
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562]  arXiv:2512.07237 [pdf, ps, other]
Title: Unified Camera Positional Encoding for Controlled Video Generation
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563]  arXiv:2512.07234 [pdf, ps, other]
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[564]  arXiv:2512.07230 [pdf, ps, other]
Title: STRinGS: Selective Text Refinement in Gaussian Splatting
Comments: Accepted to WACV 2026. Project Page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565]  arXiv:2512.07229 [pdf, ps, other]
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery
Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566]  arXiv:2512.07228 [pdf, ps, other]
Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[567]  arXiv:2512.07215 [pdf, ps, other]
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[568]  arXiv:2512.07211 [pdf, ps, other]
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds
Comments: 8 pages, 8 figures, 5 tables, ICCR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569]  arXiv:2512.07206 [pdf, ps, other]
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[570]  arXiv:2512.07203 [pdf, ps, other]
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571]  arXiv:2512.07201 [pdf, ps, other]
Title: Understanding Diffusion Models via Code Execution
Authors: Cheng Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[572]  arXiv:2512.07198 [pdf, ps, other]
Title: Generating Storytelling Images with Rich Chains-of-Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[573]  arXiv:2512.07197 [pdf, ps, other]
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574]  arXiv:2512.07192 [pdf, ps, other]
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575]  arXiv:2512.07191 [pdf, ps, other]
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576]  arXiv:2512.07190 [pdf, ps, other]
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[577]  arXiv:2512.07186 [pdf, ps, other]
Title: START: Spatial and Textual Learning for Chart Understanding
Comments: WACV 2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[578]  arXiv:2512.07171 [pdf, ps, other]
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration
Comments: 21 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[579]  arXiv:2512.07170 [pdf, ps, other]
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[580]  arXiv:2512.07166 [pdf, ps, other]
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581]  arXiv:2512.07165 [pdf, ps, other]
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[582]  arXiv:2512.07155 [pdf, ps, other]
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583]  arXiv:2512.07141 [pdf, ps, other]
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[584]  arXiv:2512.07136 [pdf, ps, other]
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[585]  arXiv:2512.07135 [pdf, ps, other]
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[586]  arXiv:2512.07128 [pdf, ps, other]
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[587]  arXiv:2512.07126 [pdf, ps, other]
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588]  arXiv:2512.07110 [pdf, ps, other]
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589]  arXiv:2512.07107 [pdf, ps, other]
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590]  arXiv:2512.07078 [pdf, ps, other]
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[591]  arXiv:2512.07076 [pdf, ps, other]
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[592]  arXiv:2512.07065 [pdf, ps, other]
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[593]  arXiv:2512.07062 [pdf, ps, other]
Title: $\mathrm{D}^\mathrm{3}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[594]  arXiv:2512.07052 [pdf, ps, other]
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[595]  arXiv:2512.07051 [pdf, ps, other]
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[596]  arXiv:2512.07037 [pdf, ps, other]
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[597]  arXiv:2512.07034 [pdf, ps, other]
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[598]  arXiv:2512.06981 [pdf, ps, other]
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[599]  arXiv:2512.06949 [pdf, ps, other]
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600]  arXiv:2512.06921 [pdf, ps, other]
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[601]  arXiv:2512.06905 [pdf, ps, other]
Title: Scaling Zero-Shot Reference-to-Video Generation
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602]  arXiv:2512.06888 [pdf, ps, other]
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[603]  arXiv:2512.06886 [pdf, ps, other]
Title: Balanced Learning for Domain Adaptive Semantic Segmentation
Comments: Accepted by International Conference on Machine Learning (ICML 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[604]  arXiv:2512.06885 [pdf, ps, other]
Title: JoPano: Unified Panorama Generation via Joint Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[605]  arXiv:2512.06882 [pdf, ps, other]
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion
Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[606]  arXiv:2512.06877 [pdf, ps, other]
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification
Comments: Accepted and presented in ICSPIS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[607]  arXiv:2512.06870 [pdf, ps, other]
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[608]  arXiv:2512.06866 [pdf, ps, other]
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Comments: Accepted by NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[609]  arXiv:2512.06865 [pdf, ps, other]
Title: Spatial Retrieval Augmented Autonomous Driving
Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[610]  arXiv:2512.06864 [pdf, ps, other]
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[611]  arXiv:2512.06862 [pdf, ps, other]
Title: Omni-Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[612]  arXiv:2512.06849 [pdf, ps, other]
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Comments: In submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[613]  arXiv:2512.06845 [pdf, ps, other]
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[614]  arXiv:2512.06840 [pdf, ps, other]
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[615]  arXiv:2512.06838 [pdf, ps, other]
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[616]  arXiv:2512.06818 [pdf, ps, other]
Title: MeshSplatting: Differentiable Rendering with Opaque Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[617]  arXiv:2512.06811 [pdf, ps, other]
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models
Comments: Accepted by AAAI 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[618]  arXiv:2512.06810 [pdf, ps, other]
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[619]  arXiv:2512.06802 [pdf, ps, other]
Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[620]  arXiv:2512.06793 [pdf, ps, other]
Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621]  arXiv:2512.06783 [pdf, ps, other]
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos
Comments: 16 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622]  arXiv:2512.06774 [pdf, ps, other]
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[623]  arXiv:2512.06769 [pdf, ps, other]
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[624]  arXiv:2512.06763 [pdf, ps, other]
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[625]  arXiv:2512.06759 [pdf, ps, other]
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors
Comments: 12 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[626]  arXiv:2512.06750 [pdf, ps, other]
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[627]  arXiv:2512.06746 [pdf, ps, other]
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[628]  arXiv:2512.06738 [pdf, ps, other]
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation
Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629]  arXiv:2512.06736 [pdf, ps, other]
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[630]  arXiv:2512.06726 [pdf, ps, other]
Title: The Role of Entropy in Visual Grounding: Analysis and Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[631]  arXiv:2512.06689 [pdf, ps, other]
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation
Comments: Accepted to ASRU 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[632]  arXiv:2512.06684 [pdf, ps, other]
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[633]  arXiv:2512.06674 [pdf, ps, other]
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[634]  arXiv:2512.06673 [pdf, ps, other]
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[635]  arXiv:2512.06663 [pdf, ps, other]
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[636]  arXiv:2512.06662 [pdf, ps, other]
Title: Personalized Image Descriptions from Attention Sequences
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[637]  arXiv:2512.06657 [pdf, ps, other]
Title: TextMamba: Scene Text Detector with Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[638]  arXiv:2512.06642 [pdf, ps, other]
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution
Comments: 21 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[639]  arXiv:2512.06613 [pdf, ps, other]
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
Authors: Yueying Ke
Comments: Version 2: Corrected reference details, improved architectural diagram, and enhanced writing for clarity and precision. Added a table illustrating the masking mechanism. No changes to experimental results or conclusions. 11 pages, 6 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[640]  arXiv:2512.06612 [pdf, ps, other]
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[641]  arXiv:2512.06598 [pdf, ps, other]
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain
Comments: 23 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[642]  arXiv:2512.06581 [pdf, ps, other]
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[643]  arXiv:2512.06575 [pdf, ps, other]
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
Authors: Fariza Dahes
Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[644]  arXiv:2512.06565 [pdf, ps, other]
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation
Authors: Xiujin Liu
Comments: 1 figure, 2 tables, 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[645]  arXiv:2512.06562 [pdf, ps, other]
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[646]  arXiv:2512.06560 [pdf, ps, other]
Title: Bridging spatial awareness and global context in medical image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[647]  arXiv:2512.06531 [pdf, ps, other]
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[648]  arXiv:2512.06530 [pdf, ps, other]
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[649]  arXiv:2512.06521 [pdf, ps, other]
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images
Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)
Comments: 31 pages + appendix
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[650]  arXiv:2512.06504 [pdf, ps, other]
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[651]  arXiv:2512.06485 [pdf, ps, other]
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[652]  arXiv:2512.06447 [pdf, ps, other]
Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[653]  arXiv:2512.06438 [pdf, ps, other]
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[654]  arXiv:2512.06434 [pdf, ps, other]
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening
Comments: 8 pages, 2 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[655]  arXiv:2512.06426 [pdf, ps, other]
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition
Comments: 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[656]  arXiv:2512.06424 [pdf, ps, other]
Title: DragMesh: Interactive 3D Generation Made Easy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[657]  arXiv:2512.06422 [pdf, ps, other]
Title: A Perception CNN for Facial Expression Recognition
Comments: in IEEE Transactions on Image Processing (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[658]  arXiv:2512.06421 [pdf, ps, other]
Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[659]  arXiv:2512.06400 [pdf, ps, other]
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement
Comments: Accepted and published in IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[660]  arXiv:2512.06379 [pdf, ps, other]
Title: OCFER-Net: Recognizing Facial Expression in Online Learning System
Authors: Yi Huo, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661]  arXiv:2512.06377 [pdf, ps, other]
Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System
Authors: Yi Huo, Yun Ge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[662]  arXiv:2512.06376 [pdf, ps, other]
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[663]  arXiv:2512.06373 [pdf, ps, other]
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664]  arXiv:2512.06368 [pdf, ps, other]
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665]  arXiv:2512.06363 [pdf, ps, other]
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[666]  arXiv:2512.06358 [pdf, ps, other]
Title: Rectifying Latent Space for Generative Single-Image Reflection Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[667]  arXiv:2512.06353 [pdf, ps, other]
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search
Comments: Code and supplementary material can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[668]  arXiv:2512.06345 [pdf, ps, other]
Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes
Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669]  arXiv:2512.06344 [pdf, ps, other]
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[670]  arXiv:2512.06332 [pdf, ps, other]
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[671]  arXiv:2512.06330 [pdf, ps, other]
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[672]  arXiv:2512.06328 [pdf, ps, other]
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models
Comments: Accepted as an Oral presentation at AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[673]  arXiv:2512.06306 [pdf, ps, other]
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[674]  arXiv:2512.06290 [pdf, ps, other]
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Comments: 17 pages, 5 figures
Journal-ref: ICDAR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675]  arXiv:2512.06282 [pdf, ps, other]
Title: A Sleep Monitoring System Based on Audio, Video and Depth Information
Comments: Accepted at Computer Vision, Graphics and Image Processing (CVGIP 2013)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[676]  arXiv:2512.06281 [pdf, ps, other]
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[677]  arXiv:2512.06276 [pdf, ps, other]
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[678]  arXiv:2512.06275 [pdf, ps, other]
Title: FacePhys: State of the Heart Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[679]  arXiv:2512.06269 [pdf, ps, other]
Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting
Authors: Quan Tran, Tuan Dang
Comments: 10 pages
Journal-ref: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[680]  arXiv:2512.06258 [pdf, ps, other]
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681]  arXiv:2512.06255 [pdf, ps, other]
Title: Language-driven Fine-grained Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682]  arXiv:2512.06251 [pdf, ps, other]
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[683]  arXiv:2512.06232 [pdf, ps, other]
Title: Opinion: Learning Intuitive Physics May Require More than Visual Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[684]  arXiv:2512.06230 [pdf, ps, other]
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[685]  arXiv:2512.06221 [pdf, ps, other]
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study
Authors: Alena Makarova
Comments: 15 pages, 13 figures. Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686]  arXiv:2512.06206 [pdf, ps, other]
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning
Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL
Journal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[687]  arXiv:2512.06190 [pdf, ps, other]
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[688]  arXiv:2512.06185 [pdf, ps, other]
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)
Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[689]  arXiv:2512.06179 [pdf, ps, other]
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[690]  arXiv:2512.06174 [pdf, ps, other]
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[691]  arXiv:2512.06171 [pdf, ps, other]
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[692]  arXiv:2512.06158 [pdf, ps, other]
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
Comments: 15 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693]  arXiv:2512.06105 [pdf, ps, other]
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation
Comments: AAAI-26-AIA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[694]  arXiv:2512.06103 [pdf, ps, other]
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection
Comments: Accepted in IEEE T-BIOM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[695]  arXiv:2512.06096 [pdf, ps, other]
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696]  arXiv:2512.06080 [pdf, ps, other]
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light
Comments: SIGGRAPH Asia 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[697]  arXiv:2512.06065 [pdf, ps, other]
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[698]  arXiv:2512.06058 [pdf, ps, other]
Title: Representation Learning for Point Cloud Understanding
Authors: Siming Yan
Comments: 181 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[699]  arXiv:2512.06032 [pdf, ps, other]
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[700]  arXiv:2512.06024 [pdf, ps, other]
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[701]  arXiv:2512.06020 [pdf, ps, other]
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation
Comments: Project page: https://prefgen.github.io/
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[702]  arXiv:2512.06014 [pdf, ps, other]
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703]  arXiv:2512.06013 [pdf, ps, other]
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[704]  arXiv:2512.06012 [pdf, ps, other]
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[705]  arXiv:2512.06010 [pdf, other]
Title: Fast and Flexible Robustness Certificates for Semantic Segmentation
Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[706]  arXiv:2512.06006 [pdf, ps, other]
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[707]  arXiv:2512.06003 [pdf, ps, other]
Title: PrunedCaps: A Case For Primary Capsules Discrimination
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[708]  arXiv:2512.05996 [pdf, ps, other]
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting
Comments: 18 pages, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[709]  arXiv:2512.05993 [pdf, ps, other]
[710]  arXiv:2512.05991 [pdf, ps, other]
Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[711]  arXiv:2512.05988 [pdf, ps, other]
Title: VG3T: Visual Geometry Grounded Gaussian Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[712]  arXiv:2512.05987 [pdf, ps, other]
Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning
Authors: Chenyue Yu, Jianyu Yu
Comments: Accepted by ICCPR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[713]  arXiv:2512.05969 [pdf, ps, other]
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices
Authors: Hokin Deng
Comments: See results (this https URL) and code (this https URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[714]  arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[715]  arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[716]  arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[717]  arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[718]  arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
Authors: Nikita Gabdullin
Comments: 9 pages, 5 figures, 1 table, 4 equations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[719]  arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
Title: Human Geometry Distribution for 3D Animation Generation
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[720]  arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models
Comments: 23 pages, 8 figures, 3 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[721]  arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[722]  arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood
Comments: Accepted to WACV 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[723]  arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
Title: A Geometric Unification of Concept Learning with Concept Cones
Comments: 22 pages
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[724]  arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising
Comments: Asilomar Conference on Signals, Systems, and Computers 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[725]  arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[726]  arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[727]  arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search
Comments: This work is planned for submission to the IEEE for possible publication
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[728]  arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning
Comments: Code: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[729]  arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[730]  arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[731]  arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[732]  arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Comments: Project page: this https URL
Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[733]  arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
Comments: 2025 NeurIPS Behavior Challenge 1st place solution
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[734]  arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
Title: Dynamic Visual SLAM using a General 3D Prior
Comments: 8 pages
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[735]  arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices
Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[736]  arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
Comments: FAME 2026 Technical Report
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[737]  arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[738]  arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[739]  arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[740]  arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning
Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[741]  arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
Authors: Xiao Li
Comments: in Chinese language
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[742]  arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[743]  arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
Title: Vector Quantization using Gaussian Variational Autoencoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[744]  arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[745]  arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[746]  arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
Title: Semantic Temporal Single-photon LiDAR
Comments: 14 pages, 5 figures. Comments are welcome
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[747]  arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
Comments: NeurIPS Black in AI workshop - 2022
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)