Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 84

Wed, 10 Dec 2025 (continued, showing last 47 of 131 entries)

[85]  arXiv:2512.08253 [pdf, ps, other]
Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86]  arXiv:2512.08247 [pdf, ps, other]
Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
Comments: AAAI-26
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[87]  arXiv:2512.08243 [pdf, ps, other]
Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
Comments: 26 Pages, 10 Figures, 4 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[88]  arXiv:2512.08240 [pdf, ps, other]
Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[89]  arXiv:2512.08237 [pdf, ps, other]
Title: FastBEV++: Fast by Algorithm, Deployable by Design
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90]  arXiv:2512.08229 [pdf, ps, other]
Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[91]  arXiv:2512.08228 [pdf, ps, other]
Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[92]  arXiv:2512.08227 [pdf, ps, other]
Title: New VVC profiles targeting Feature Coding for Machines
Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93]  arXiv:2512.08223 [pdf, ps, other]
Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2512.08221 [pdf, ps, other]
Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
Comments: 16 pages, 12 figures, 7 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95]  arXiv:2512.08215 [pdf, ps, other]
Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96]  arXiv:2512.08198 [pdf, ps, other]
Title: Animal Re-Identification on Microcontrollers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97]  arXiv:2512.08180 [pdf, ps, other]
Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98]  arXiv:2512.08163 [pdf, ps, other]
Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Comments: 22 pages, 12 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99]  arXiv:2512.08161 [pdf, ps, other]
Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100]  arXiv:2512.08135 [pdf, ps, other]
Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2512.08075 [pdf, ps, other]
Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102]  arXiv:2512.08048 [pdf, ps, other]
Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
Comments: ongoing work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103]  arXiv:2512.08042 [pdf, ps, other]
Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104]  arXiv:2512.08040 [pdf, ps, other]
Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105]  arXiv:2512.08038 [pdf, ps, other]
Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
Comments: 20 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106]  arXiv:2512.08016 [pdf, ps, other]
Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107]  arXiv:2512.07984 [pdf, ps, other]
Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
Comments: 13 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108]  arXiv:2512.07951 [pdf, ps, other]
Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109]  arXiv:2512.07925 [pdf, ps, other]
Title: Near-real time fires detection using satellite imagery in Sudan conflict
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110]  arXiv:2512.07838 [pdf, ps, other]
Title: Detection of Cyberbullying in GIF using AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[111]  arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]
Title: Multi-domain performance analysis with scores tailored to user preferences
Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[112]  arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]
Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[113]  arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]
Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
Comments: 22 pages, 2 tables, 9 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[114]  arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]
Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[115]  arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]
Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
Authors: Ali Sakour
Comments: 13 pages, 5 figures. Code available at: this https URL
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[116]  arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]
Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion
Comments: Submitted to GEOPHYSICS
Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
[117]  arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]
Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation
Comments: Published and Presented at 3rd Workshop on Human-Centric Multilateral Teleoperation in ICRA 2025
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[118]  arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]
Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[119]  arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]
Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model
Comments: Website at this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[120]  arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]
Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[121]  arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]
Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models
Authors: Zheng Ding, Weirui Ye
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[122]  arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]
Title: FlowSteer: Conditioning Flow Field for Consistent Image Restoration
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[123]  arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]
Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition
Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[124]  arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]
Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125]  arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]
Title: DIJIT: A Robotic Head for an Active Observer
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[126]  arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]
Title: CIP-Net: Continual Interpretable Prototype-based Network
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[127]  arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]
Title: VLD: Visual Language Goal Distance for Reinforcement Learning Navigation
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]
Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization
Comments: 8 pages, submitted for review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]
Title: GSPN-2: Efficient Parallel Sequence Modeling
Comments: NeurIPS 2025
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[130]  arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]
Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[131]  arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]
Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin Algorithm
Comments: Submitted to Magnetic Resonance in Medicine
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)

Tue, 9 Dec 2025 (showing first 203 of 259 entries)

[132]  arXiv:2512.07834 [pdf, ps, other]
Title: Voxify3D: Pixel Art Meets Volumetric Rendering
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2512.07833 [pdf, ps, other]
Title: Relational Visual Similarity
Comments: Project page, data, and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[134]  arXiv:2512.07831 [pdf, ps, other]
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Comments: Project Website this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2512.07829 [pdf, ps, other]
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136]  arXiv:2512.07826 [pdf, ps, other]
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
Comments: 38 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2512.07821 [pdf, ps, other]
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138]  arXiv:2512.07807 [pdf, ps, other]
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes
Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[139]  arXiv:2512.07806 [pdf, ps, other]
Title: Multi-view Pyramid Transformer: Look Coarser to See Broader
Comments: Project page: see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2512.07802 [pdf, ps, other]
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2512.07778 [pdf, ps, other]
Title: Distribution Matching Variational AutoEncoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2512.07776 [pdf, ps, other]
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143]  arXiv:2512.07760 [pdf, ps, other]
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144]  arXiv:2512.07756 [pdf, ps, other]
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145]  arXiv:2512.07747 [pdf, ps, other]
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146]  arXiv:2512.07745 [pdf, ps, other]
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2512.07738 [pdf, ps, other]
Title: HLTCOE Evaluation Team at TREC 2025: VQA Track
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2512.07733 [pdf, ps, other]
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2512.07730 [pdf, ps, other]
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150]  arXiv:2512.07729 [pdf, ps, other]
Title: Improving action classification with brain-inspired deep networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151]  arXiv:2512.07720 [pdf, ps, other]
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:2512.07712 [pdf, ps, other]
Title: UnCageNet: Tracking and Pose Estimation of Caged Animal
Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153]  arXiv:2512.07703 [pdf, ps, other]
Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[154]  arXiv:2512.07702 [pdf, ps, other]
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155]  arXiv:2512.07698 [pdf, ps, other]
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156]  arXiv:2512.07674 [pdf, ps, other]
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157]  arXiv:2512.07668 [pdf, ps, other]
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2512.07661 [pdf, ps, other]
Title: Optimization-Guided Diffusion for Interactive Scene Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2512.07652 [pdf, ps, other]
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160]  arXiv:2512.07651 [pdf, ps, other]
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161]  arXiv:2512.07628 [pdf, ps, other]
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162]  arXiv:2512.07606 [pdf, ps, other]
Title: Decomposition Sampling for Efficient Region Annotations in Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163]  arXiv:2512.07599 [pdf, ps, other]
Title: Online Segment Any 3D Thing as Instance Tracking
Comments: NeurIPS 2025, Code is at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2512.07596 [pdf, ps, other]
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[165]  arXiv:2512.07590 [pdf, ps, other]
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2512.07584 [pdf, ps, other]
Title: LongCat-Image Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167]  arXiv:2512.07580 [pdf, ps, other]
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168]  arXiv:2512.07568 [pdf, ps, other]
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[169]  arXiv:2512.07564 [pdf, ps, other]
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[170]  arXiv:2512.07527 [pdf, ps, other]
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[171]  arXiv:2512.07514 [pdf, ps, other]
Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2512.07504 [pdf, ps, other]
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173]  arXiv:2512.07503 [pdf, ps, other]
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2512.07500 [pdf, ps, other]
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2512.07498 [pdf, ps, other]
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior
Comments: 16 pages (including appendix)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2512.07480 [pdf, ps, other]
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2512.07469 [pdf, ps, other]
Title: Unified Video Editing with Temporal Reasoner
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2512.07426 [pdf, ps, other]
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing
Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[179]  arXiv:2512.07415 [pdf, ps, other]
Title: Data-driven Exploration of Mobility Interaction Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180]  arXiv:2512.07410 [pdf, ps, other]
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:2512.07394 [pdf, ps, other]
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video
Comments: webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182]  arXiv:2512.07391 [pdf, ps, other]
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2512.07385 [pdf, ps, other]
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2512.07383 [pdf, ps, other]
Title: LogicCBMs: Logic-Enhanced Concept-Based Learning
Comments: 18 pages, 19 figures, WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2512.07381 [pdf, ps, other]
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186]  arXiv:2512.07379 [pdf, ps, other]
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency
Comments: 22 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2512.07360 [pdf, ps, other]
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[188]  arXiv:2512.07351 [pdf, ps, other]
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[189]  arXiv:2512.07348 [pdf, ps, other]
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190]  arXiv:2512.07345 [pdf, ps, other]
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting
Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2512.07338 [pdf, ps, other]
Title: Generalized Referring Expression Segmentation on Aerial Photos
Comments: Submitted to IEEE J-STARS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2512.07331 [pdf, ps, other]
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers
Authors: Kanishk Awadhiya
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2512.07328 [pdf, ps, other]
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[194]  arXiv:2512.07305 [pdf, ps, other]
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:2512.07302 [pdf, ps, other]
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[196]  arXiv:2512.07276 [pdf, ps, other]
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2512.07275 [pdf, ps, other]
Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation
Comments: The paper has been accepted by BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198]  arXiv:2512.07273 [pdf, ps, other]
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199]  arXiv:2512.07269 [pdf, ps, other]
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[200]  arXiv:2512.07253 [pdf, ps, other]
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Comments: 18 pages, 8 figures, and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201]  arXiv:2512.07251 [pdf, ps, other]
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202]  arXiv:2512.07247 [pdf, ps, other]
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing
Comments: 40 pages, 34 figures, 18 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[203]  arXiv:2512.07245 [pdf, ps, other]
Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features
Comments: 11+6 pages, 8 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204]  arXiv:2512.07241 [pdf, ps, other]
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205]  arXiv:2512.07237 [pdf, ps, other]
Title: Unified Camera Positional Encoding for Controlled Video Generation
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2512.07234 [pdf, ps, other]
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[207]  arXiv:2512.07230 [pdf, ps, other]
Title: STRinGS: Selective Text Refinement in Gaussian Splatting
Comments: Accepted to WACV 2026. Project Page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2512.07229 [pdf, ps, other]
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery
Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209]  arXiv:2512.07228 [pdf, ps, other]
Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[210]  arXiv:2512.07215 [pdf, ps, other]
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[211]  arXiv:2512.07211 [pdf, ps, other]
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds
Comments: 8 pages, 8 figures, 5 tables, ICCR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212]  arXiv:2512.07206 [pdf, ps, other]
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213]  arXiv:2512.07203 [pdf, ps, other]
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214]  arXiv:2512.07201 [pdf, ps, other]
Title: Understanding Diffusion Models via Code Execution
Authors: Cheng Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[215]  arXiv:2512.07198 [pdf, ps, other]
Title: Generating Storytelling Images with Rich Chains-of-Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[216]  arXiv:2512.07197 [pdf, ps, other]
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2512.07192 [pdf, ps, other]
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218]  arXiv:2512.07191 [pdf, ps, other]
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219]  arXiv:2512.07190 [pdf, ps, other]
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2512.07186 [pdf, ps, other]
Title: START: Spatial and Textual Learning for Chart Understanding
Comments: WACV2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[221]  arXiv:2512.07171 [pdf, ps, other]
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration
Comments: 21 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222]  arXiv:2512.07170 [pdf, ps, other]
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223]  arXiv:2512.07166 [pdf, ps, other]
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224]  arXiv:2512.07165 [pdf, ps, other]
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225]  arXiv:2512.07155 [pdf, ps, other]
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226]  arXiv:2512.07141 [pdf, ps, other]
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[227]  arXiv:2512.07136 [pdf, ps, other]
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228]  arXiv:2512.07135 [pdf, ps, other]
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229]  arXiv:2512.07128 [pdf, ps, other]
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230]  arXiv:2512.07126 [pdf, ps, other]
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2512.07110 [pdf, ps, other]
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232]  arXiv:2512.07107 [pdf, ps, other]
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233]  arXiv:2512.07078 [pdf, ps, other]
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234]  arXiv:2512.07076 [pdf, ps, other]
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235]  arXiv:2512.07065 [pdf, ps, other]
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2512.07062 [pdf, ps, other]
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[237]  arXiv:2512.07052 [pdf, ps, other]
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238]  arXiv:2512.07051 [pdf, ps, other]
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[239]  arXiv:2512.07037 [pdf, ps, other]
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240]  arXiv:2512.07034 [pdf, ps, other]
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241]  arXiv:2512.06981 [pdf, ps, other]
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[242]  arXiv:2512.06949 [pdf, ps, other]
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2512.06921 [pdf, ps, other]
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[244]  arXiv:2512.06905 [pdf, ps, other]
Title: Scaling Zero-Shot Reference-to-Video Generation
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2512.06888 [pdf, ps, other]
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246]  arXiv:2512.06886 [pdf, ps, other]
Title: Balanced Learning for Domain Adaptive Semantic Segmentation
Comments: Accepted by International Conference on Machine Learning (ICML 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247]  arXiv:2512.06885 [pdf, ps, other]
Title: JoPano: Unified Panorama Generation via Joint Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[248]  arXiv:2512.06882 [pdf, ps, other]
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion
Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2512.06877 [pdf, ps, other]
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification
Comments: Accepted and presented in ICSPIS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250]  arXiv:2512.06870 [pdf, ps, other]
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251]  arXiv:2512.06866 [pdf, ps, other]
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
Comments: Accepted by NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[252]  arXiv:2512.06865 [pdf, ps, other]
Title: Spatial Retrieval Augmented Autonomous Driving
Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2512.06864 [pdf, ps, other]
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training
Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:2512.06862 [pdf, ps, other]
Title: Omni-Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255]  arXiv:2512.06849 [pdf, ps, other]
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT
Comments: In submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256]  arXiv:2512.06845 [pdf, ps, other]
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2512.06840 [pdf, ps, other]
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2512.06838 [pdf, ps, other]
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2512.06818 [pdf, ps, other]
Title: MeshSplatting: Differentiable Rendering with Opaque Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2512.06811 [pdf, ps, other]
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models
Comments: Accepted by AAAI 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[261]  arXiv:2512.06810 [pdf, ps, other]
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[262]  arXiv:2512.06802 [pdf, ps, other]
Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263]  arXiv:2512.06793 [pdf, ps, other]
Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2512.06783 [pdf, ps, other]
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos
Comments: 16 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265]  arXiv:2512.06774 [pdf, ps, other]
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[266]  arXiv:2512.06769 [pdf, ps, other]
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[267]  arXiv:2512.06763 [pdf, ps, other]
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2512.06759 [pdf, ps, other]
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors
Comments: 12 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269]  arXiv:2512.06750 [pdf, ps, other]
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:2512.06746 [pdf, ps, other]
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[271]  arXiv:2512.06738 [pdf, ps, other]
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation
Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2512.06736 [pdf, ps, other]
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2512.06726 [pdf, ps, other]
Title: The Role of Entropy in Visual Grounding: Analysis and Optimization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[274]  arXiv:2512.06689 [pdf, ps, other]
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation
Comments: Accepted to ASRU 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[275]  arXiv:2512.06684 [pdf, ps, other]
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276]  arXiv:2512.06674 [pdf, ps, other]
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2512.06673 [pdf, ps, other]
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2512.06663 [pdf, ps, other]
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2512.06662 [pdf, ps, other]
Title: Personalized Image Descriptions from Attention Sequences
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2512.06657 [pdf, ps, other]
Title: TextMamba: Scene Text Detector with Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[281]  arXiv:2512.06642 [pdf, ps, other]
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution
Comments: 21 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[282]  arXiv:2512.06613 [pdf, ps, other]
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach
Authors: Yueying Ke
Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2512.06612 [pdf, ps, other]
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.06598 [pdf, ps, other]
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain
Comments: 23 pages, 15 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2512.06581 [pdf, ps, other]
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.06575 [pdf, ps, other]
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
Authors: Fariza Dahes
Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287]  arXiv:2512.06565 [pdf, ps, other]
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation
Authors: Xiujin Liu
Comments: 1 figure, 2 tables, 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2512.06562 [pdf, ps, other]
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[289]  arXiv:2512.06560 [pdf, ps, other]
Title: Bridging spatial awareness and global context in medical image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2512.06531 [pdf, ps, other]
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images
Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[291]  arXiv:2512.06530 [pdf, ps, other]
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292]  arXiv:2512.06521 [pdf, ps, other]
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images
Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)
Comments: 31 pages + appendix
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293]  arXiv:2512.06504 [pdf, ps, other]
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[294]  arXiv:2512.06485 [pdf, ps, other]
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.06447 [pdf, ps, other]
Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.06438 [pdf, ps, other]
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2512.06434 [pdf, ps, other]
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening
Comments: 8 pages, 2 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298]  arXiv:2512.06426 [pdf, ps, other]
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition
Comments: 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299]  arXiv:2512.06424 [pdf, ps, other]
Title: DragMesh: Interactive 3D Generation Made Easy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2512.06422 [pdf, ps, other]
Title: A Perception CNN for Facial Expression Recognition
Comments: in IEEE Transactions on Image Processing (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2512.06421 [pdf, ps, other]
Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[302]  arXiv:2512.06400 [pdf, ps, other]
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement
Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2512.06379 [pdf, ps, other]
Title: OCFER-Net: Recognizing Facial Expression in Online Learning System
Authors: Yi Huo, Lei Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.06377 [pdf, ps, other]
Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System
Authors: Yi Huo, Yun Ge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2512.06376 [pdf, ps, other]
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2512.06373 [pdf, ps, other]
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning
Comments: The project page is this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.06368 [pdf, ps, other]
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.06363 [pdf, ps, other]
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309]  arXiv:2512.06358 [pdf, ps, other]
Title: Rectifying Latent Space for Generative Single-Image Reflection Removal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2512.06353 [pdf, ps, other]
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search
Comments: Code and Supplementary Material could be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2512.06345 [pdf, ps, other]
Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes
Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2512.06344 [pdf, ps, other]
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2512.06332 [pdf, ps, other]
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2512.06330 [pdf, ps, other]
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315]  arXiv:2512.06328 [pdf, ps, other]
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models
Comments: Accepted as an Oral presentation at AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2512.06306 [pdf, ps, other]
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317]  arXiv:2512.06290 [pdf, ps, other]
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification
Comments: 17 pages, 5 figures
Journal-ref: ICDAR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2512.06282 [pdf, ps, other]
Title: A Sleep Monitoring System Based on Audio, Video and Depth Information
Comments: Accepted in the Computer Vision, Graphics and Image Processing (CVGIP 2013)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[319]  arXiv:2512.06281 [pdf, ps, other]
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320]  arXiv:2512.06276 [pdf, ps, other]
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[321]  arXiv:2512.06275 [pdf, ps, other]
Title: FacePhys: State of the Heart Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2512.06269 [pdf, ps, other]
Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting
Authors: Quan Tran, Tuan Dang
Comments: 10 pages
Journal-ref: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2512.06258 [pdf, ps, other]
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2512.06255 [pdf, ps, other]
Title: Language-driven Fine-grained Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2512.06251 [pdf, ps, other]
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2512.06232 [pdf, ps, other]
Title: Opinion: Learning Intuitive Physics May Require More than Visual Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[327]  arXiv:2512.06230 [pdf, ps, other]
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328]  arXiv:2512.06221 [pdf, ps, other]
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study
Authors: Alena Makarova
Comments: 15 pages, 13 figures. Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329]  arXiv:2512.06206 [pdf, ps, other]
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning
Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL
Journal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330]  arXiv:2512.06190 [pdf, ps, other]
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[331]  arXiv:2512.06185 [pdf, ps, other]
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling
Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)
Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332]  arXiv:2512.06179 [pdf, ps, other]
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333]  arXiv:2512.06174 [pdf, ps, other]
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2512.06171 [pdf, ps, other]
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)