Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 450

[ total of 749 entries: 1-100 | ... | 151-250 | 251-350 | 351-450 | 451-550 | 551-650 | 651-749 ]

Mon, 8 Dec 2025 (continued, showing last 34 of 94 entries)

[451]  arXiv:2512.05394 [pdf, ps, other]
Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452]  arXiv:2512.05391 [pdf, ps, other]
Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453]  arXiv:2512.05385 [pdf, ps, other]
Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454]  arXiv:2512.05362 [pdf, ps, other]
Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation
Comments: All code related to this paper can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455]  arXiv:2512.05359 [pdf, ps, other]
Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking
Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456]  arXiv:2512.05354 [pdf, ps, other]
Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
Comments: project page this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457]  arXiv:2512.05343 [pdf, ps, other]
Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458]  arXiv:2512.05277 [pdf, ps, other]
Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459]  arXiv:2512.05272 [pdf, ps, other]
Title: Inferring Compositional 4D Scenes without Ever Seeing One
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460]  arXiv:2512.05268 [pdf, ps, other]
Title: CARD: Correlation Aware Restoration with Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461]  arXiv:2512.05259 [pdf, ps, other]
Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462]  arXiv:2512.05240 [pdf, ps, other]
Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463]  arXiv:2512.05209 [pdf, ps, other]
Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464]  arXiv:2512.05198 [pdf, ps, other]
Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models
Comments: 16 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465]  arXiv:2512.05172 [pdf, ps, other]
Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466]  arXiv:2512.05152 [pdf, ps, other]
Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models
Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467]  arXiv:2512.05150 [pdf, ps, other]
Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
Comments: arxiv v0
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468]  arXiv:2512.05145 [pdf, ps, other]
Title: Self-Improving VLM Judges Without Human Annotations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469]  arXiv:2512.05140 [pdf, other]
Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation
Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)
Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470]  arXiv:2512.05139 [pdf, ps, other]
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471]  arXiv:2512.05137 [pdf, ps, other]
Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472]  arXiv:2512.05136 [pdf, ps, other]
Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473]  arXiv:2512.05134 [pdf, ps, other]
Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models
Authors: Zihao Wu
Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474]  arXiv:2512.05132 [pdf, ps, other]
Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475]  arXiv:2512.05131 [pdf, ps, other]
Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476]  arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]
Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Comments: Preprint
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477]  arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]
Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478]  arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]
Title: Physically-Based Simulation of Automotive LiDAR
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479]  arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]
Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma
Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)
Comments: 4 pages, 2 figures
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480]  arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]
Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation
Comments: This work has been submitted to the IEEE for possible publication
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481]  arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]
Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling
Comments: 11 pages, 6 figures. Code available at this https URL
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482]  arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]
Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality
Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483]  arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]
Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety
Comments: 8 pages, 3 figures, 1 table
Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484]  arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]
Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025 (showing first 66 of 135 entries)

[485]  arXiv:2512.05115 [pdf, ps, other]
Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486]  arXiv:2512.05113 [pdf, ps, other]
Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting
Comments: WACV 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487]  arXiv:2512.05112 [pdf, ps, other]
Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488]  arXiv:2512.05111 [pdf, ps, other]
Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489]  arXiv:2512.05110 [pdf, ps, other]
Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490]  arXiv:2512.05106 [pdf, ps, other]
Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491]  arXiv:2512.05104 [pdf, ps, other]
Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492]  arXiv:2512.05098 [pdf, ps, other]
Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards
Authors: Yuan Gao, Jin Song
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493]  arXiv:2512.05091 [pdf, ps, other]
Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark
Comments: Technical Report; Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494]  arXiv:2512.05081 [pdf, ps, other]
Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495]  arXiv:2512.05079 [pdf, ps, other]
Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496]  arXiv:2512.05076 [pdf, ps, other]
Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497]  arXiv:2512.05060 [pdf, ps, other]
Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer
Comments: Code: this https URL, Webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498]  arXiv:2512.05044 [pdf, ps, other]
Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Comments: 18 Pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499]  arXiv:2512.05039 [pdf, ps, other]
Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding
Comments: Submitted for review CVPR-2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500]  arXiv:2512.05025 [pdf, ps, other]
Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501]  arXiv:2512.05021 [pdf, ps, other]
Title: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502]  arXiv:2512.05016 [pdf, ps, other]
Title: Generative Neural Video Compression via Video Diffusion Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503]  arXiv:2512.05006 [pdf, ps, other]
Title: Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects
Comments: conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504]  arXiv:2512.05000 [pdf, ps, other]
Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505]  arXiv:2512.04996 [pdf, ps, other]
Title: A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506]  arXiv:2512.04981 [pdf, ps, other]
Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507]  arXiv:2512.04970 [pdf, ps, other]
Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks
Comments: UniReps Workshop 2025, 12 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508]  arXiv:2512.04969 [pdf, ps, other]
Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[509]  arXiv:2512.04967 [pdf, ps, other]
Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[510]  arXiv:2512.04963 [pdf, ps, other]
Title: GeoPE: A Unified Geometric Positional Embedding for Structured Tensors
Authors: Yupu Yao, Bowen Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[511]  arXiv:2512.04952 [pdf, ps, other]
Title: FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[512]  arXiv:2512.04943 [pdf, ps, other]
Title: Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513]  arXiv:2512.04939 [pdf, ps, other]
Title: LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514]  arXiv:2512.04927 [pdf, ps, other]
Title: Virtually Unrolling the Herculaneum Papyri by Diffeomorphic Spiral Fitting
Authors: Paul Henderson
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515]  arXiv:2512.04926 [pdf, ps, other]
Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516]  arXiv:2512.04904 [pdf, ps, other]
Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517]  arXiv:2512.04890 [pdf, ps, other]
Title: Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518]  arXiv:2512.04888 [pdf, ps, other]
Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework
Comments: This manuscript was first submitted to Engineering (an Elsevier journal). The preprint version was posted to arXiv afterwards to facilitate open access and community feedback
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519]  arXiv:2512.04883 [pdf, ps, other]
Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520]  arXiv:2512.04875 [pdf, ps, other]
Title: SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521]  arXiv:2512.04862 [pdf, ps, other]
Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing
Comments: * Equal contribution. Minor figure corrections compared to the ICCV 2025 version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522]  arXiv:2512.04857 [pdf, ps, other]
Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523]  arXiv:2512.04837 [pdf, ps, other]
Title: A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524]  arXiv:2512.04832 [pdf, ps, other]
Title: Tokenizing Buildings: A Transformer for Layout Synthesis
Comments: 8 pages, 1 page References, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[525]  arXiv:2512.04830 [pdf, ps, other]
Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis
Comments: Novel View Synthesis, Driving Scene, Free Trajectory, Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526]  arXiv:2512.04821 [pdf, ps, other]
Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527]  arXiv:2512.04815 [pdf, ps, other]
Title: RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS
Comments: arXiv admin note: substantial text overlap with arXiv:2506.02751
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528]  arXiv:2512.04810 [pdf, ps, other]
Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529]  arXiv:2512.04786 [pdf, ps, other]
Title: LaFiTe: A Generative Latent Field for 3D Native Texturing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530]  arXiv:2512.04784 [pdf, ps, other]
Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531]  arXiv:2512.04761 [pdf, ps, other]
Title: Order Matters: 3D Shape Generation from Sequential VR Sketches
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532]  arXiv:2512.04734 [pdf, ps, other]
Title: MT-Depth: Multi-task Instance feature analysis for the Depth Completion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533]  arXiv:2512.04733 [pdf, ps, other]
Title: E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[534]  arXiv:2512.04728 [pdf, ps, other]
Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[535]  arXiv:2512.04699 [pdf, ps, other]
Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution
Comments: Accepted by TCSVT, 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536]  arXiv:2512.04686 [pdf, ps, other]
Title: Towards Cross-View Point Correspondence in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537]  arXiv:2512.04678 [pdf, ps, other]
Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538]  arXiv:2512.04677 [pdf, ps, other]
Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539]  arXiv:2512.04660 [pdf, ps, other]
Title: I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540]  arXiv:2512.04643 [pdf, ps, other]
Title: SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541]  arXiv:2512.04619 [pdf, ps, other]
Title: Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542]  arXiv:2512.04599 [pdf, ps, other]
Title: Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543]  arXiv:2512.04597 [pdf, ps, other]
Title: When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[544]  arXiv:2512.04585 [pdf, ps, other]
Title: SAM3-I: Segment Anything with Instructions
Comments: Preliminary results; work in progress
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545]  arXiv:2512.04581 [pdf, ps, other]
Title: Infrared UAV Target Tracking with Dynamic Feature Refinement and Global Contextual Attention Knowledge Distillation
Comments: Accepted by IEEE TMM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546]  arXiv:2512.04576 [pdf, ps, other]
Title: TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547]  arXiv:2512.04568 [pdf, ps, other]
Title: Prompt2Craft: Generating Functional Craft Assemblies with LLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548]  arXiv:2512.04564 [pdf, ps, other]
Title: Dataset creation for supervised deep learning-based analysis of microscopic images -- review of important considerations and recommendations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549]  arXiv:2512.04563 [pdf, ps, other]
Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550]  arXiv:2512.04554 [pdf, ps, other]
Title: Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV)