Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 250

Tue, 9 Dec 2025 (continued, showing last 140 of 259 entries)

[251]  arXiv:2512.06866 [pdf, ps, other]
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Comments: Accepted by NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[252]  arXiv:2512.06865 [pdf, ps, other]
Title: Spatial Retrieval Augmented Autonomous Driving
Comments: Demo page: this https URL with open-sourced code, dataset, and checkpoints
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2512.06864 [pdf, ps, other]
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:2512.06862 [pdf, ps, other]
Title: Omni-Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255]  arXiv:2512.06849 [pdf, ps, other]
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Comments: In submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256]  arXiv:2512.06845 [pdf, ps, other]
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2512.06840 [pdf, ps, other]
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2512.06838 [pdf, ps, other]
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2512.06818 [pdf, ps, other]
Title: MeshSplatting: Differentiable Rendering with Opaque Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2512.06811 [pdf, ps, other]
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models
Comments: Accepted by AAAI 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[261]  arXiv:2512.06810 [pdf, ps, other]
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[262]  arXiv:2512.06802 [pdf, ps, other]
Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263]  arXiv:2512.06793 [pdf, ps, other]
Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2512.06783 [pdf, ps, other]
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos
Comments: 16 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265]  arXiv:2512.06774 [pdf, ps, other]
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[266]  arXiv:2512.06769 [pdf, ps, other]
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[267]  arXiv:2512.06763 [pdf, ps, other]
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2512.06759 [pdf, ps, other]
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors
Comments: 12 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269]  arXiv:2512.06750 [pdf, ps, other]
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:2512.06746 [pdf, ps, other]
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[271]  arXiv:2512.06738 [pdf, ps, other]
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation
Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2512.06736 [pdf, ps, other]
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2512.06726 [pdf, ps, other]
Title: The Role of Entropy in Visual Grounding: Analysis and Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[274]  arXiv:2512.06689 [pdf, ps, other]
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation
Comments: Accepted to ASRU 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[275]  arXiv:2512.06684 [pdf, ps, other]
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276]  arXiv:2512.06674 [pdf, ps, other]
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2512.06673 [pdf, ps, other]
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2512.06663 [pdf, ps, other]
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2512.06662 [pdf, ps, other]
Title: Personalized Image Descriptions from Attention Sequences
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2512.06657 [pdf, ps, other]
Title: TextMamba: Scene Text Detector with Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[281]  arXiv:2512.06642 [pdf, ps, other]
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution
Comments: 21 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[282]  arXiv:2512.06613 [pdf, ps, other]
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
Authors: Yueying Ke
Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2512.06612 [pdf, ps, other]
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.06598 [pdf, ps, other]
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain
Comments: 23 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2512.06581 [pdf, ps, other]
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.06575 [pdf, ps, other]
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
Authors: Fariza Dahes
Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287]  arXiv:2512.06565 [pdf, ps, other]
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation
Authors: Xiujin Liu
Comments: 14 pages, 1 figure, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2512.06562 [pdf, ps, other]
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[289]  arXiv:2512.06560 [pdf, ps, other]
Title: Bridging spatial awareness and global context in medical image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2512.06531 [pdf, ps, other]
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[291]  arXiv:2512.06530 [pdf, ps, other]
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292]  arXiv:2512.06521 [pdf, ps, other]
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images
Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)
Comments: 31 pages + appendix
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293]  arXiv:2512.06504 [pdf, ps, other]
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[294]  arXiv:2512.06485 [pdf, ps, other]
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.06447 [pdf, ps, other]
Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.06438 [pdf, ps, other]
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2512.06434 [pdf, ps, other]
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening
Comments: 8 pages, 2 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298]  arXiv:2512.06426 [pdf, ps, other]
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition
Comments: 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299]  arXiv:2512.06424 [pdf, ps, other]
Title: DragMesh: Interactive 3D Generation Made Easy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2512.06422 [pdf, ps, other]
Title: A Perception CNN for Facial Expression Recognition
Comments: in IEEE Transactions on Image Processing (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2512.06421 [pdf, ps, other]
Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[302]  arXiv:2512.06400 [pdf, ps, other]
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement
Comments: Accepted and published in IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2512.06379 [pdf, ps, other]
Title: OCFER-Net: Recognizing Facial Expression in Online Learning System
Authors: Yi Huo, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.06377 [pdf, ps, other]
Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System
Authors: Yi Huo, Yun Ge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2512.06376 [pdf, ps, other]
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2512.06373 [pdf, ps, other]
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.06368 [pdf, ps, other]
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.06363 [pdf, ps, other]
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309]  arXiv:2512.06358 [pdf, ps, other]
Title: Rectifying Latent Space for Generative Single-Image Reflection Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2512.06353 [pdf, ps, other]
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search
Comments: Code and supplementary material can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2512.06345 [pdf, ps, other]
Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes
Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2512.06344 [pdf, ps, other]
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2512.06332 [pdf, ps, other]
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2512.06330 [pdf, ps, other]
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315]  arXiv:2512.06328 [pdf, ps, other]
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models
Comments: Accepted as an Oral presentation at AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2512.06306 [pdf, ps, other]
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317]  arXiv:2512.06290 [pdf, ps, other]
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Comments: 17 pages, 5 figures
Journal-ref: ICDAR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2512.06282 [pdf, ps, other]
Title: A Sleep Monitoring System Based on Audio, Video and Depth Information
Comments: Accepted at Computer Vision, Graphics and Image Processing (CVGIP 2013)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[319]  arXiv:2512.06281 [pdf, ps, other]
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320]  arXiv:2512.06276 [pdf, ps, other]
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[321]  arXiv:2512.06275 [pdf, ps, other]
Title: FacePhys: State of the Heart Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2512.06269 [pdf, ps, other]
Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting
Authors: Quan Tran, Tuan Dang
Comments: 10 pages
Journal-ref: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2512.06258 [pdf, ps, other]
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2512.06255 [pdf, ps, other]
Title: Language-driven Fine-grained Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2512.06251 [pdf, ps, other]
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2512.06232 [pdf, ps, other]
Title: Opinion: Learning Intuitive Physics May Require More than Visual Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[327]  arXiv:2512.06230 [pdf, ps, other]
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328]  arXiv:2512.06221 [pdf, ps, other]
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study
Authors: Alena Makarova
Comments: 15 pages, 13 figures. Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329]  arXiv:2512.06206 [pdf, ps, other]
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning
Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL
Journal-ref: Machine Learning for Biomedical Imaging 3 (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330]  arXiv:2512.06190 [pdf, ps, other]
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[331]  arXiv:2512.06185 [pdf, ps, other]
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)
Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332]  arXiv:2512.06179 [pdf, ps, other]
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333]  arXiv:2512.06174 [pdf, ps, other]
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2512.06171 [pdf, ps, other]
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335]  arXiv:2512.06158 [pdf, ps, other]
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
Comments: 15 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336]  arXiv:2512.06105 [pdf, ps, other]
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation
Comments: AAAI-26-AIA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[337]  arXiv:2512.06103 [pdf, ps, other]
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection
Comments: Accepted in IEEE T-BIOM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338]  arXiv:2512.06096 [pdf, ps, other]
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2512.06080 [pdf, ps, other]
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light
Comments: SIGGRAPH Asia 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2512.06065 [pdf, ps, other]
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[341]  arXiv:2512.06058 [pdf, ps, other]
Title: Representation Learning for Point Cloud Understanding
Authors: Siming Yan
Comments: 181 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[342]  arXiv:2512.06032 [pdf, ps, other]
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[343]  arXiv:2512.06024 [pdf, ps, other]
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[344]  arXiv:2512.06020 [pdf, ps, other]
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation
Comments: Project page: https://prefgen.github.io/
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[345]  arXiv:2512.06014 [pdf, ps, other]
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346]  arXiv:2512.06013 [pdf, ps, other]
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[347]  arXiv:2512.06012 [pdf, ps, other]
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[348]  arXiv:2512.06010 [pdf, other]
Title: Fast and Flexible Robustness Certificates for Semantic Segmentation
Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349]  arXiv:2512.06006 [pdf, ps, other]
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[350]  arXiv:2512.06003 [pdf, ps, other]
Title: PrunedCaps: A Case For Primary Capsules Discrimination
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351]  arXiv:2512.05996 [pdf, ps, other]
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting
Comments: 18 pages, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[352]  arXiv:2512.05993 [pdf, ps, other]
[353]  arXiv:2512.05991 [pdf, ps, other]
Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[354]  arXiv:2512.05988 [pdf, ps, other]
Title: VG3T: Visual Geometry Grounded Gaussian Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[355]  arXiv:2512.05987 [pdf, ps, other]
Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning
Authors: Chenyue Yu, Jianyu Yu
Comments: Accepted by ICCPR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[356]  arXiv:2512.05969 [pdf, ps, other]
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' Matrices
Authors: Hokin Deng
Comments: See results (this https URL) and code (this https URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[357]  arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[358]  arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[359]  arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[360]  arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[361]  arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
Authors: Nikita Gabdullin
Comments: 9 pages, 5 figures, 1 table, 4 equations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[362]  arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
Title: Human Geometry Distribution for 3D Animation Generation
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[363]  arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models
Comments: 23 pages, 8 figures, 3 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[364]  arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[365]  arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood
Comments: Accepted to WACV 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[366]  arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
Title: A Geometric Unification of Concept Learning with Concept Cones
Comments: 22 pages
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[367]  arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising
Comments: Asilomar Conference on Signals, Systems, and Computers 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[368]  arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[369]  arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[370]  arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search
Comments: To be submitted to the IEEE for possible publication
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[371]  arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning
Comments: Code: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[372]  arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[373]  arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[374]  arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[375]  arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Comments: Project page: this https URL
Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[376]  arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
Comments: 2025 NeurIPS Behavior Challenge 1st place solution
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[377]  arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
Title: Dynamic Visual SLAM using a General 3D Prior
Comments: 8 pages
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[378]  arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices
Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[379]  arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
Comments: FAME 2026 Technical Report
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[380]  arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[381]  arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[382]  arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[383]  arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning
Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[384]  arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
Authors: Xiao Li
Comments: in Chinese language
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[385]  arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[386]  arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
Title: Vector Quantization using Gaussian Variational Autoencoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[388]  arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[389]  arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
Title: Semantic Temporal Single-photon LiDAR
Comments: 14 pages, 5 figures. Comments are welcome
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[390]  arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
Comments: NeurIPS Black in AI workshop - 2022
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025

[391]  arXiv:2512.05965 [pdf, ps, other]
Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392]  arXiv:2512.05960 [pdf, ps, other]
Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393]  arXiv:2512.05941 [pdf, ps, other]
Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394]  arXiv:2512.05937 [pdf, ps, other]
Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception
Comments: 8 pages, 2 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395]  arXiv:2512.05936 [pdf, ps, other]
Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition
Comments: 8 pages, 8 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396]  arXiv:2512.05928 [pdf, ps, other]
Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition
Comments: 18 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397]  arXiv:2512.05927 [pdf, ps, other]
Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398]  arXiv:2512.05922 [pdf, ps, other]
Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation
Comments: Note: Khang Le and Anh Mai Vu contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399]  arXiv:2512.05920 [pdf, ps, other]
Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400]  arXiv:2512.05905 [pdf, ps, other]
Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401]  arXiv:2512.05866 [pdf, ps, other]
Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator
Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402]  arXiv:2512.05859 [pdf, ps, other]
Title: Edit-aware RAW Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403]  arXiv:2512.05853 [pdf, ps, other]
Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404]  arXiv:2512.05830 [pdf, ps, other]
Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning
Comments: 22 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405]  arXiv:2512.05814 [pdf, ps, other]
Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection
Comments: The code is already available on GitHub: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406]  arXiv:2512.05809 [pdf, ps, other]
Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling
Comments: Extended abstract at World Modeling Workshop 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407]  arXiv:2512.05802 [pdf, ps, other]
Title: Bring Your Dreams to Life: Continual Text-to-Video Customization
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408]  arXiv:2512.05783 [pdf, ps, other]
Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409]  arXiv:2512.05774 [pdf, ps, other]
Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410]  arXiv:2512.05762 [pdf, ps, other]
Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators
Comments: Accepted for WACV
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411]  arXiv:2512.05759 [pdf, ps, other]
Title: Label-Efficient Point Cloud Segmentation with Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412]  arXiv:2512.05754 [pdf, ps, other]
Title: USV: Unified Sparsification for Accelerating Video Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413]  arXiv:2512.05746 [pdf, ps, other]
Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414]  arXiv:2512.05740 [pdf, ps, other]
Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415]  arXiv:2512.05710 [pdf, ps, other]
Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416]  arXiv:2512.05698 [pdf, ps, other]
Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
Comments: The 40th Annual AAAI Conference on Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417]  arXiv:2512.05683 [pdf, ps, other]
Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418]  arXiv:2512.05674 [pdf, ps, other]
Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419]  arXiv:2512.05672 [pdf, ps, other]
Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420]  arXiv:2512.05669 [pdf, ps, other]
Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421]  arXiv:2512.05663 [pdf, ps, other]
Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422]  arXiv:2512.05651 [pdf, ps, other]
Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423]  arXiv:2512.05635 [pdf, ps, other]
Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424]  arXiv:2512.05613 [pdf, ps, other]
Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425]  arXiv:2512.05610 [pdf, ps, other]
Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections
Comments: 19 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426]  arXiv:2512.05597 [pdf, ps, other]
Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction
Comments: 10 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427]  arXiv:2512.05593 [pdf, ps, other]
Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer
Comments: Accepted to 3DV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428]  arXiv:2512.05571 [pdf, ps, other]
Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429]  arXiv:2512.05564 [pdf, ps, other]
Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430]  arXiv:2512.05557 [pdf, ps, other]
Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431]  arXiv:2512.05546 [pdf, ps, other]
Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models
Comments: 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432]  arXiv:2512.05539 [pdf, ps, other]
Title: Ideal Observer for Segmentation of Dead Leaves Images
Comments: 41 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433]  arXiv:2512.05529 [pdf, ps, other]
Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors
Comments: The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434]  arXiv:2512.05524 [pdf, ps, other]
Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435]  arXiv:2512.05515 [pdf, ps, other]
Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436]  arXiv:2512.05513 [pdf, ps, other]
Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437]  arXiv:2512.05511 [pdf, ps, other]
Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438]  arXiv:2512.05494 [pdf, ps, other]
Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439]  arXiv:2512.05492 [pdf, ps, other]
Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440]  arXiv:2512.05482 [pdf, ps, other]
Title: Concept-based Explainable Data Mining with VLM for 3D Detection
Authors: Mai Tsujimoto
Comments: 28 pages including appendix. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441]  arXiv:2512.05481 [pdf, ps, other]
Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442]  arXiv:2512.05478 [pdf, ps, other]
Title: EmoStyle: Emotion-Driven Image Stylization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443]  arXiv:2512.05468 [pdf, ps, other]
Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444]  arXiv:2512.05446 [pdf, ps, other]
Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445]  arXiv:2512.05422 [pdf, ps, other]
Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446]  arXiv:2512.05418 [pdf, ps, other]
Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447]  arXiv:2512.05415 [pdf, ps, other]
Title: Moving object detection from multi-depth images with an attention-enhanced CNN
Comments: 14 pages, 22 figures, submitted to PASJ
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448]  arXiv:2512.05412 [pdf, ps, other]
Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449]  arXiv:2512.05410 [pdf, ps, other]
Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450]  arXiv:2512.05398 [pdf, ps, other]
Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451]  arXiv:2512.05394 [pdf, ps, other]
Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452]  arXiv:2512.05391 [pdf, ps, other]
Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453]  arXiv:2512.05385 [pdf, ps, other]
Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454]  arXiv:2512.05362 [pdf, ps, other]
Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation
Comments: All code related to this paper can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455]  arXiv:2512.05359 [pdf, ps, other]
Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking
Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2512.05354 [pdf, ps, other]
Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457]  arXiv:2512.05343 [pdf, ps, other]
Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458]  arXiv:2512.05277 [pdf, ps, other]
Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459]  arXiv:2512.05272 [pdf, ps, other]
Title: Inferring Compositional 4D Scenes without Ever Seeing One
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460]  arXiv:2512.05268 [pdf, ps, other]
Title: CARD: Correlation Aware Restoration with Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461]  arXiv:2512.05259 [pdf, ps, other]
Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2512.05240 [pdf, ps, other]
Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463]  arXiv:2512.05209 [pdf, ps, other]
Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464]  arXiv:2512.05198 [pdf, ps, other]
Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465]  arXiv:2512.05172 [pdf, ps, other]
Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466]  arXiv:2512.05152 [pdf, ps, other]
Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models
Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467]  arXiv:2512.05150 [pdf, ps, other]
Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Comments: arxiv v0
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468]  arXiv:2512.05145 [pdf, ps, other]
Title: Self-Improving VLM Judges Without Human Annotations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469]  arXiv:2512.05140 [pdf, other]
Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation
Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)
Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470]  arXiv:2512.05139 [pdf, ps, other]
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471]  arXiv:2512.05137 [pdf, ps, other]
Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472]  arXiv:2512.05136 [pdf, ps, other]
Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473]  arXiv:2512.05134 [pdf, ps, other]
Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models
Authors: Zihao Wu
Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474]  arXiv:2512.05132 [pdf, ps, other]
Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475]  arXiv:2512.05131 [pdf, ps, other]
Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476]  arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]
Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Comments: Preprint
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477]  arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]
Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478]  arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]
Title: Physically-Based Simulation of Automotive LiDAR
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479]  arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]
Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma
Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)
Comments: 4 pages, 2 figures
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480]  arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]
Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation
Comments: This work has been submitted to the IEEE for possible publication
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481]  arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]
Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling
Comments: 11 pages, 6 figures. Code available at this https URL
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482]  arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]
Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality
Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483]  arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]
Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety
Comments: 8 pages, 3 figures, 1 table
Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484]  arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]
Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025 (showing first 16 of 135 entries)

[485]  arXiv:2512.05115 [pdf, ps, other]
Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486]  arXiv:2512.05113 [pdf, ps, other]
Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting
Comments: WACV 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487]  arXiv:2512.05112 [pdf, ps, other]
Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488]  arXiv:2512.05111 [pdf, ps, other]
Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489]  arXiv:2512.05110 [pdf, ps, other]
Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490]  arXiv:2512.05106 [pdf, ps, other]
Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491]  arXiv:2512.05104 [pdf, ps, other]
Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492]  arXiv:2512.05098 [pdf, ps, other]
Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards
Authors: Yuan Gao, Jin Song
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493]  arXiv:2512.05091 [pdf, ps, other]
Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark
Comments: Technical Report; Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494]  arXiv:2512.05081 [pdf, ps, other]
Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495]  arXiv:2512.05079 [pdf, ps, other]
Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496]  arXiv:2512.05076 [pdf, ps, other]
Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497]  arXiv:2512.05060 [pdf, ps, other]
Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer
Comments: Code: this https URL, Webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498]  arXiv:2512.05044 [pdf, ps, other]
Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Comments: 18 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499]  arXiv:2512.05039 [pdf, ps, other]
Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding
Comments: Submitted for review CVPR-2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500]  arXiv:2512.05025 [pdf, ps, other]
Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV)