Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 59

Tue, 9 Dec 2025 (continued, showing last 200 of 259 entries)

[60]  arXiv:2512.07338 [pdf, ps, other]
Title: Generalized Referring Expression Segmentation on Aerial Photos
Comments: Submitted to IEEE J-STARS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61]  arXiv:2512.07331 [pdf, ps, other]
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers
Authors: Kanishk Awadhiya
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62]  arXiv:2512.07328 [pdf, ps, other]
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[63]  arXiv:2512.07305 [pdf, ps, other]
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64]  arXiv:2512.07302 [pdf, ps, other]
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[65]  arXiv:2512.07276 [pdf, ps, other]
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66]  arXiv:2512.07275 [pdf, ps, other]
Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation
Comments: The paper has been accepted by BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[67]  arXiv:2512.07273 [pdf, ps, other]
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68]  arXiv:2512.07269 [pdf, ps, other]
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[69]  arXiv:2512.07253 [pdf, ps, other]
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Comments: 18 pages, 8 figures, and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[70]  arXiv:2512.07251 [pdf, ps, other]
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71]  arXiv:2512.07247 [pdf, ps, other]
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing
Comments: 40 pages, 34 figures, 18 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[72]  arXiv:2512.07245 [pdf, ps, other]
Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features
Comments: 11+6 pages, 8 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73]  arXiv:2512.07241 [pdf, ps, other]
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74]  arXiv:2512.07237 [pdf, ps, other]
Title: Unified Camera Positional Encoding for Controlled Video Generation
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[75]  arXiv:2512.07234 [pdf, ps, other]
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[76]  arXiv:2512.07230 [pdf, ps, other]
Title: STRinGS: Selective Text Refinement in Gaussian Splatting
Comments: Accepted to WACV 2026. Project Page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77]  arXiv:2512.07229 [pdf, ps, other]
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery
Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78]  arXiv:2512.07228 [pdf, ps, other]
Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[79]  arXiv:2512.07215 [pdf, ps, other]
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[80]  arXiv:2512.07211 [pdf, ps, other]
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds
Comments: 8 pages, 8 figures, 5 tables, ICCR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81]  arXiv:2512.07206 [pdf, ps, other]
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[82]  arXiv:2512.07203 [pdf, ps, other]
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83]  arXiv:2512.07201 [pdf, ps, other]
Title: Understanding Diffusion Models via Code Execution
Authors: Cheng Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[84]  arXiv:2512.07198 [pdf, ps, other]
Title: Generating Storytelling Images with Rich Chains-of-Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[85]  arXiv:2512.07197 [pdf, ps, other]
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86]  arXiv:2512.07192 [pdf, ps, other]
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[87]  arXiv:2512.07191 [pdf, ps, other]
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88]  arXiv:2512.07190 [pdf, ps, other]
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[89]  arXiv:2512.07186 [pdf, ps, other]
Title: START: Spatial and Textual Learning for Chart Understanding
Comments: WACV2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[90]  arXiv:2512.07171 [pdf, ps, other]
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration
Comments: 21 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[91]  arXiv:2512.07170 [pdf, ps, other]
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[92]  arXiv:2512.07166 [pdf, ps, other]
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93]  arXiv:2512.07165 [pdf, ps, other]
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2512.07155 [pdf, ps, other]
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95]  arXiv:2512.07141 [pdf, ps, other]
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[96]  arXiv:2512.07136 [pdf, ps, other]
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[97]  arXiv:2512.07135 [pdf, ps, other]
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[98]  arXiv:2512.07128 [pdf, ps, other]
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99]  arXiv:2512.07126 [pdf, ps, other]
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100]  arXiv:2512.07110 [pdf, ps, other]
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2512.07107 [pdf, ps, other]
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102]  arXiv:2512.07078 [pdf, ps, other]
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[103]  arXiv:2512.07076 [pdf, ps, other]
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104]  arXiv:2512.07065 [pdf, ps, other]
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105]  arXiv:2512.07062 [pdf, ps, other]
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[106]  arXiv:2512.07052 [pdf, ps, other]
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[107]  arXiv:2512.07051 [pdf, ps, other]
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[108]  arXiv:2512.07037 [pdf, ps, other]
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[109]  arXiv:2512.07034 [pdf, ps, other]
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110]  arXiv:2512.06981 [pdf, ps, other]
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[111]  arXiv:2512.06949 [pdf, ps, other]
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112]  arXiv:2512.06921 [pdf, ps, other]
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[113]  arXiv:2512.06905 [pdf, ps, other]
Title: Scaling Zero-Shot Reference-to-Video Generation
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114]  arXiv:2512.06888 [pdf, ps, other]
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115]  arXiv:2512.06886 [pdf, ps, other]
Title: Balanced Learning for Domain Adaptive Semantic Segmentation
Comments: Accepted by International Conference on Machine Learning (ICML 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116]  arXiv:2512.06885 [pdf, ps, other]
Title: JoPano: Unified Panorama Generation via Joint Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[117]  arXiv:2512.06882 [pdf, ps, other]
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion
Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118]  arXiv:2512.06877 [pdf, ps, other]
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification
Comments: Accepted and presented at ICSPIS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119]  arXiv:2512.06870 [pdf, ps, other]
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120]  arXiv:2512.06866 [pdf, ps, other]
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Comments: Accepted by NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[121]  arXiv:2512.06865 [pdf, ps, other]
Title: Spatial Retrieval Augmented Autonomous Driving
Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122]  arXiv:2512.06864 [pdf, ps, other]
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[123]  arXiv:2512.06862 [pdf, ps, other]
Title: Omni-Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124]  arXiv:2512.06849 [pdf, ps, other]
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Comments: In submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[125]  arXiv:2512.06845 [pdf, ps, other]
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126]  arXiv:2512.06840 [pdf, ps, other]
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127]  arXiv:2512.06838 [pdf, ps, other]
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2512.06818 [pdf, ps, other]
Title: MeshSplatting: Differentiable Rendering with Opaque Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2512.06811 [pdf, ps, other]
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models
Comments: Accepted by AAAI 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[130]  arXiv:2512.06810 [pdf, ps, other]
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[131]  arXiv:2512.06802 [pdf, ps, other]
Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132]  arXiv:2512.06793 [pdf, ps, other]
Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2512.06783 [pdf, ps, other]
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos
Comments: 16 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134]  arXiv:2512.06774 [pdf, ps, other]
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[135]  arXiv:2512.06769 [pdf, ps, other]
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136]  arXiv:2512.06763 [pdf, ps, other]
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2512.06759 [pdf, ps, other]
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors
Comments: 12 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138]  arXiv:2512.06750 [pdf, ps, other]
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[139]  arXiv:2512.06746 [pdf, ps, other]
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[140]  arXiv:2512.06738 [pdf, ps, other]
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation
Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2512.06736 [pdf, ps, other]
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2512.06726 [pdf, ps, other]
Title: The Role of Entropy in Visual Grounding: Analysis and Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[143]  arXiv:2512.06689 [pdf, ps, other]
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation
Comments: Accepted to ASRU 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[144]  arXiv:2512.06684 [pdf, ps, other]
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145]  arXiv:2512.06674 [pdf, ps, other]
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146]  arXiv:2512.06673 [pdf, ps, other]
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2512.06663 [pdf, ps, other]
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2512.06662 [pdf, ps, other]
Title: Personalized Image Descriptions from Attention Sequences
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2512.06657 [pdf, ps, other]
Title: TextMamba: Scene Text Detector with Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150]  arXiv:2512.06642 [pdf, ps, other]
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution
Comments: 21 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[151]  arXiv:2512.06613 [pdf, ps, other]
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
Authors: Yueying Ke
Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:2512.06612 [pdf, ps, other]
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153]  arXiv:2512.06598 [pdf, ps, other]
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain
Comments: 23 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154]  arXiv:2512.06581 [pdf, ps, other]
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155]  arXiv:2512.06575 [pdf, ps, other]
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
Authors: Fariza Dahes
Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[156]  arXiv:2512.06565 [pdf, ps, other]
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation
Authors: Xiujin Liu
Comments: 1 figure, 2 tables, 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[157]  arXiv:2512.06562 [pdf, ps, other]
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[158]  arXiv:2512.06560 [pdf, ps, other]
Title: Bridging spatial awareness and global context in medical image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2512.06531 [pdf, ps, other]
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[160]  arXiv:2512.06530 [pdf, ps, other]
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[161]  arXiv:2512.06521 [pdf, ps, other]
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images
Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)
Comments: 31 pages + appendix
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[162]  arXiv:2512.06504 [pdf, ps, other]
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[163]  arXiv:2512.06485 [pdf, ps, other]
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2512.06447 [pdf, ps, other]
Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165]  arXiv:2512.06438 [pdf, ps, other]
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2512.06434 [pdf, ps, other]
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening
Comments: 8 pages, 2 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[167]  arXiv:2512.06426 [pdf, ps, other]
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition
Comments: 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[168]  arXiv:2512.06424 [pdf, ps, other]
Title: DragMesh: Interactive 3D Generation Made Easy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2512.06422 [pdf, ps, other]
Title: A Perception CNN for Facial Expression Recognition
Comments: in IEEE Transactions on Image Processing (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170]  arXiv:2512.06421 [pdf, ps, other]
Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[171]  arXiv:2512.06400 [pdf, ps, other]
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement
Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2512.06379 [pdf, ps, other]
Title: OCFER-Net: Recognizing Facial Expression in Online Learning System
Authors: Yi Huo, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173]  arXiv:2512.06377 [pdf, ps, other]
Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System
Authors: Yi Huo, Yun Ge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2512.06376 [pdf, ps, other]
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2512.06373 [pdf, ps, other]
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2512.06368 [pdf, ps, other]
Title: Human3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2512.06363 [pdf, ps, other]
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2512.06358 [pdf, ps, other]
Title: Rectifying Latent Space for Generative Single-Image Reflection Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179]  arXiv:2512.06353 [pdf, ps, other]
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search
Comments: Code and supplementary material can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180]  arXiv:2512.06345 [pdf, ps, other]
Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes
Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:2512.06344 [pdf, ps, other]
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182]  arXiv:2512.06332 [pdf, ps, other]
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2512.06330 [pdf, ps, other]
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2512.06328 [pdf, ps, other]
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models
Comments: Accepted as an Oral presentation at AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2512.06306 [pdf, ps, other]
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[186]  arXiv:2512.06290 [pdf, ps, other]
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Comments: 17 pages, 5 figures
Journal-ref: ICDAR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2512.06282 [pdf, ps, other]
Title: A Sleep Monitoring System Based on Audio, Video and Depth Information
Comments: Accepted at Computer Vision, Graphics and Image Processing (CVGIP 2013)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[188]  arXiv:2512.06281 [pdf, ps, other]
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[189]  arXiv:2512.06276 [pdf, ps, other]
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[190]  arXiv:2512.06275 [pdf, ps, other]
Title: FacePhys: State of the Heart Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2512.06269 [pdf, ps, other]
Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting
Authors: Quan Tran, Tuan Dang
Comments: 10 pages
Journal-ref: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2512.06258 [pdf, ps, other]
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2512.06255 [pdf, ps, other]
Title: Language-driven Fine-grained Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194]  arXiv:2512.06251 [pdf, ps, other]
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:2512.06232 [pdf, ps, other]
Title: Opinion: Learning Intuitive Physics May Require More than Visual Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[196]  arXiv:2512.06230 [pdf, ps, other]
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2512.06221 [pdf, ps, other]
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study
Authors: Alena Makarova
Comments: 15 pages, 13 figures. Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198]  arXiv:2512.06206 [pdf, ps, other]
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning
Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL
Journal-ref: Machine Learning for Biomedical Imaging 3 (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[199]  arXiv:2512.06190 [pdf, ps, other]
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[200]  arXiv:2512.06185 [pdf, ps, other]
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)
Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201]  arXiv:2512.06179 [pdf, ps, other]
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202]  arXiv:2512.06174 [pdf, ps, other]
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[203]  arXiv:2512.06171 [pdf, ps, other]
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204]  arXiv:2512.06158 [pdf, ps, other]
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation
Comments: 15 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205]  arXiv:2512.06105 [pdf, ps, other]
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation
Comments: AAAI-26-AIA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[206]  arXiv:2512.06103 [pdf, ps, other]
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection
Comments: Accepted in IEEE T-BIOM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[207]  arXiv:2512.06096 [pdf, ps, other]
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2512.06080 [pdf, ps, other]
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light
Comments: SIGGRAPH Asia 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209]  arXiv:2512.06065 [pdf, ps, other]
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[210]  arXiv:2512.06058 [pdf, ps, other]
Title: Representation Learning for Point Cloud Understanding
Authors: Siming Yan
Comments: 181 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211]  arXiv:2512.06032 [pdf, ps, other]
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[212]  arXiv:2512.06024 [pdf, ps, other]
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[213]  arXiv:2512.06020 [pdf, ps, other]
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation
Comments: Project Page: https://prefgen.github.io/
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[214]  arXiv:2512.06014 [pdf, ps, other]
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2512.06013 [pdf, ps, other]
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[216]  arXiv:2512.06012 [pdf, ps, other]
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2512.06010 [pdf, other]
Title: Fast and Flexible Robustness Certificates for Semantic Segmentation
Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218]  arXiv:2512.06006 [pdf, ps, other]
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[219]  arXiv:2512.06003 [pdf, ps, other]
Title: PrunedCaps: A Case For Primary Capsules Discrimination
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2512.05996 [pdf, ps, other]
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting
Comments: 18 pages, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[221]  arXiv:2512.05993 [pdf, ps, other]
[222]  arXiv:2512.05991 [pdf, ps, other]
Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223]  arXiv:2512.05988 [pdf, ps, other]
Title: VG3T: Visual Geometry Grounded Gaussian Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[224]  arXiv:2512.05987 [pdf, ps, other]
Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning
Authors: Chenyue Yu, Jianyu Yu
Comments: Accepted by ICCPR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[225]  arXiv:2512.05969 [pdf, ps, other]
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' Matrices
Authors: Hokin Deng
Comments: See results (this https URL) and code (this https URL)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[226]  arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[227]  arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[228]  arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework
Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[229]  arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[230]  arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces
Authors: Nikita Gabdullin
Comments: 9 pages, 5 figures, 1 table, 4 equations
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
Title: Human Geometry Distribution for 3D Animation Generation
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[232]  arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models
Comments: 23 pages, 8 figures, 3 tables
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[233]  arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[234]  arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood
Comments: Accepted to WACV 2026
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[235]  arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
Title: A Geometric Unification of Concept Learning with Concept Cones
Comments: 22 pages
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[236]  arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising
Comments: Asilomar Conference on Signals, Systems, and Computers 2025
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[237]  arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[238]  arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[239]  arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search
Comments: This work is planned for submission to the IEEE for possible publication
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[240]  arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning
Comments: Code: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[241]  arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[242]  arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[244]  arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Comments: Project page: this https URL
Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
Comments: 2025 NeurIPS Behavior Challenge 1st place solution
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[246]  arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
Title: Dynamic Visual SLAM using a General 3D Prior
Comments: 8 pages
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[247]  arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices
Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[248]  arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
Comments: FAME 2026 Technical Report
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[250]  arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[251]  arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[252]  arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning
Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[253]  arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
Authors: Xiao Li
Comments: in Chinese language
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[255]  arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
Title: Vector Quantization using Gaussian Variational Autoencoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[256]  arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[258]  arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
Title: Semantic Temporal Single-photon LiDAR
Comments: 14 pages, 5 figures. Comments are welcome
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[259]  arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
Comments: NeurIPS Black in AI workshop - 2022
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025 (showing first 50 of 94 entries)

[260]  arXiv:2512.05965 [pdf, ps, other]
Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261]  arXiv:2512.05960 [pdf, ps, other]
Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[262]  arXiv:2512.05941 [pdf, ps, other]
Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[263]  arXiv:2512.05937 [pdf, ps, other]
Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception
Comments: 8 pages, 2 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[264]  arXiv:2512.05936 [pdf, ps, other]
Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition
Comments: 8 pages, 8 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[265]  arXiv:2512.05928 [pdf, ps, other]
Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition
Comments: 18 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[266]  arXiv:2512.05927 [pdf, ps, other]
Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[267]  arXiv:2512.05922 [pdf, ps, other]
Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation
Comments: Note: Khang Le and Anh Mai Vu contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2512.05920 [pdf, ps, other]
Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[269]  arXiv:2512.05905 [pdf, ps, other]
Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:2512.05866 [pdf, ps, other]
Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator
Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2512.05859 [pdf, ps, other]
Title: Edit-aware RAW Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2512.05853 [pdf, ps, other]
Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2512.05830 [pdf, ps, other]
Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning
Comments: 22 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[274]  arXiv:2512.05814 [pdf, ps, other]
Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection
Comments: The code is already available on GitHub: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[275]  arXiv:2512.05809 [pdf, ps, other]
Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling
Comments: Extended abstract at World Modeling Workshop 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[276]  arXiv:2512.05802 [pdf, ps, other]
Title: Bring Your Dreams to Life: Continual Text-to-Video Customization
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2512.05783 [pdf, ps, other]
Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[278]  arXiv:2512.05774 [pdf, ps, other]
Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[279]  arXiv:2512.05762 [pdf, ps, other]
Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators
Comments: Accepted for WACV
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[280]  arXiv:2512.05759 [pdf, ps, other]
Title: Label-Efficient Point Cloud Segmentation with Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[281]  arXiv:2512.05754 [pdf, ps, other]
Title: USV: Unified Sparsification for Accelerating Video Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2512.05746 [pdf, ps, other]
Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2512.05740 [pdf, ps, other]
Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.05710 [pdf, ps, other]
Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2512.05698 [pdf, ps, other]
Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
Comments: The 40th Annual AAAI Conference on Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.05683 [pdf, ps, other]
Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[287]  arXiv:2512.05674 [pdf, ps, other]
Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2512.05672 [pdf, ps, other]
Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[289]  arXiv:2512.05669 [pdf, ps, other]
Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2512.05663 [pdf, ps, other]
Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2512.05651 [pdf, ps, other]
Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2512.05635 [pdf, ps, other]
Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[293]  arXiv:2512.05613 [pdf, ps, other]
Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2512.05610 [pdf, ps, other]
Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections
Comments: 19 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.05597 [pdf, ps, other]
Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction
Comments: 10 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.05593 [pdf, ps, other]
Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer
Comments: Accepted to 3DV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2512.05571 [pdf, ps, other]
Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[298]  arXiv:2512.05564 [pdf, ps, other]
Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299]  arXiv:2512.05557 [pdf, ps, other]
Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[300]  arXiv:2512.05546 [pdf, ps, other]
Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models
Comments: 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[301]  arXiv:2512.05539 [pdf, ps, other]
Title: Ideal Observer for Segmentation of Dead Leaves Images
Comments: 41 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[302]  arXiv:2512.05529 [pdf, ps, other]
Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors
Comments: The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[303]  arXiv:2512.05524 [pdf, ps, other]
Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.05515 [pdf, ps, other]
Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[305]  arXiv:2512.05513 [pdf, ps, other]
Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2512.05511 [pdf, ps, other]
Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.05494 [pdf, ps, other]
Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.05492 [pdf, ps, other]
Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309]  arXiv:2512.05482 [pdf, ps, other]
Title: Concept-based Explainable Data Mining with VLM for 3D Detection
Authors: Mai Tsujimoto
Comments: 28 pages including appendix. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
