Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 378

[ total of 749 entries: 1-100 | 79-178 | 179-278 | 279-378 | 379-478 | 479-578 | 579-678 | 679-749 ]

Tue, 9 Dec 2025 (continued, showing last 12 of 259 entries)

[379]  arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
Comments: FAME 2026 Technical Report
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[380]  arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[381]  arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[382]  arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[383]  arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning
Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[384]  arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network
Authors: Xiao Li
Comments: in Chinese language
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[385]  arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[386]  arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
Title: Vector Quantization using Gaussian Variational Autoencoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[388]  arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[389]  arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
Title: Semantic Temporal Single-photon LiDAR
Comments: 14 pages, 5 figures. Any comments are welcome
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[390]  arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
Comments: NeurIPS Black in AI workshop - 2022
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025 (showing first 88 of 94 entries)

[391]  arXiv:2512.05965 [pdf, ps, other]
Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392]  arXiv:2512.05960 [pdf, ps, other]
Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393]  arXiv:2512.05941 [pdf, ps, other]
Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394]  arXiv:2512.05937 [pdf, ps, other]
Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception
Comments: 8 pages, 2 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395]  arXiv:2512.05936 [pdf, ps, other]
Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition
Comments: 8 pages, 8 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396]  arXiv:2512.05928 [pdf, ps, other]
Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition
Comments: 18 pages, 17 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397]  arXiv:2512.05927 [pdf, ps, other]
Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398]  arXiv:2512.05922 [pdf, ps, other]
Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation
Comments: Note: Khang Le and Anh Mai Vu contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399]  arXiv:2512.05920 [pdf, ps, other]
Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400]  arXiv:2512.05905 [pdf, ps, other]
Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401]  arXiv:2512.05866 [pdf, ps, other]
Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator
Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402]  arXiv:2512.05859 [pdf, ps, other]
Title: Edit-aware RAW Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403]  arXiv:2512.05853 [pdf, ps, other]
Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404]  arXiv:2512.05830 [pdf, ps, other]
Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning
Comments: 22 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405]  arXiv:2512.05814 [pdf, ps, other]
Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection
Comments: The code is already available on GitHub: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406]  arXiv:2512.05809 [pdf, ps, other]
Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling
Comments: Extended abstract at World Modeling Workshop 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407]  arXiv:2512.05802 [pdf, ps, other]
Title: Bring Your Dreams to Life: Continual Text-to-Video Customization
Comments: Accepted to AAAI2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408]  arXiv:2512.05783 [pdf, ps, other]
Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409]  arXiv:2512.05774 [pdf, ps, other]
Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410]  arXiv:2512.05762 [pdf, ps, other]
Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators
Comments: Accepted for WACV
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411]  arXiv:2512.05759 [pdf, ps, other]
Title: Label-Efficient Point Cloud Segmentation with Active Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412]  arXiv:2512.05754 [pdf, ps, other]
Title: USV: Unified Sparsification for Accelerating Video Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413]  arXiv:2512.05746 [pdf, ps, other]
Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414]  arXiv:2512.05740 [pdf, ps, other]
Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415]  arXiv:2512.05710 [pdf, ps, other]
Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416]  arXiv:2512.05698 [pdf, ps, other]
Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
Comments: The 40th Annual AAAI Conference on Artificial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417]  arXiv:2512.05683 [pdf, ps, other]
Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418]  arXiv:2512.05674 [pdf, ps, other]
Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419]  arXiv:2512.05672 [pdf, ps, other]
Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420]  arXiv:2512.05669 [pdf, ps, other]
Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421]  arXiv:2512.05663 [pdf, ps, other]
Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422]  arXiv:2512.05651 [pdf, ps, other]
Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423]  arXiv:2512.05635 [pdf, ps, other]
Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424]  arXiv:2512.05613 [pdf, ps, other]
Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425]  arXiv:2512.05610 [pdf, ps, other]
Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections
Comments: 19 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426]  arXiv:2512.05597 [pdf, ps, other]
Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction
Comments: 10 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427]  arXiv:2512.05593 [pdf, ps, other]
Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer
Comments: Accepted to 3DV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428]  arXiv:2512.05571 [pdf, ps, other]
Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429]  arXiv:2512.05564 [pdf, ps, other]
Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430]  arXiv:2512.05557 [pdf, ps, other]
Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431]  arXiv:2512.05546 [pdf, ps, other]
Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models
Comments: 6 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432]  arXiv:2512.05539 [pdf, ps, other]
Title: Ideal Observer for Segmentation of Dead Leaves Images
Comments: 41 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433]  arXiv:2512.05529 [pdf, ps, other]
Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors
Comments: The first two authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434]  arXiv:2512.05524 [pdf, ps, other]
Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435]  arXiv:2512.05515 [pdf, ps, other]
Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis
Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436]  arXiv:2512.05513 [pdf, ps, other]
Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437]  arXiv:2512.05511 [pdf, ps, other]
Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438]  arXiv:2512.05494 [pdf, ps, other]
Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439]  arXiv:2512.05492 [pdf, ps, other]
Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440]  arXiv:2512.05482 [pdf, ps, other]
Title: Concept-based Explainable Data Mining with VLM for 3D Detection
Authors: Mai Tsujimoto
Comments: 28 pages including appendix. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441]  arXiv:2512.05481 [pdf, ps, other]
Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442]  arXiv:2512.05478 [pdf, ps, other]
Title: EmoStyle: Emotion-Driven Image Stylization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443]  arXiv:2512.05468 [pdf, ps, other]
Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444]  arXiv:2512.05446 [pdf, ps, other]
Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445]  arXiv:2512.05422 [pdf, ps, other]
Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446]  arXiv:2512.05418 [pdf, ps, other]
Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447]  arXiv:2512.05415 [pdf, ps, other]
Title: Moving object detection from multi-depth images with an attention-enhanced CNN
Comments: 14 pages, 22 figures, submitted to PASJ
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448]  arXiv:2512.05412 [pdf, ps, other]
Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449]  arXiv:2512.05410 [pdf, ps, other]
Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450]  arXiv:2512.05398 [pdf, ps, other]
Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451]  arXiv:2512.05394 [pdf, ps, other]
Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452]  arXiv:2512.05391 [pdf, ps, other]
Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453]  arXiv:2512.05385 [pdf, ps, other]
Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454]  arXiv:2512.05362 [pdf, ps, other]
Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation
Comments: All code related to this paper can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455]  arXiv:2512.05359 [pdf, ps, other]
Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking
Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2512.05354 [pdf, ps, other]
Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457]  arXiv:2512.05343 [pdf, ps, other]
Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458]  arXiv:2512.05277 [pdf, ps, other]
Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459]  arXiv:2512.05272 [pdf, ps, other]
Title: Inferring Compositional 4D Scenes without Ever Seeing One
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460]  arXiv:2512.05268 [pdf, ps, other]
Title: CARD: Correlation Aware Restoration with Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461]  arXiv:2512.05259 [pdf, ps, other]
Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2512.05240 [pdf, ps, other]
Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463]  arXiv:2512.05209 [pdf, ps, other]
Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464]  arXiv:2512.05198 [pdf, ps, other]
Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465]  arXiv:2512.05172 [pdf, ps, other]
Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466]  arXiv:2512.05152 [pdf, ps, other]
Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models
Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467]  arXiv:2512.05150 [pdf, ps, other]
Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Comments: arxiv v0
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468]  arXiv:2512.05145 [pdf, ps, other]
Title: Self-Improving VLM Judges Without Human Annotations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469]  arXiv:2512.05140 [pdf, other]
Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation
Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)
Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470]  arXiv:2512.05139 [pdf, ps, other]
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471]  arXiv:2512.05137 [pdf, ps, other]
Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472]  arXiv:2512.05136 [pdf, ps, other]
Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473]  arXiv:2512.05134 [pdf, ps, other]
Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models
Authors: Zihao Wu
Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474]  arXiv:2512.05132 [pdf, ps, other]
Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475]  arXiv:2512.05131 [pdf, ps, other]
Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476]  arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]
Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Comments: Preprint
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477]  arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]
Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478]  arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]
Title: Physically-Based Simulation of Automotive LiDAR
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)