Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 594

[ total of 759 entries: 1-100 | ... | 295-394 | 395-494 | 495-594 | 595-694 | 695-759 ]
[ showing 100 entries per page: fewer | more | all ]

Thu, 4 Dec 2025 (continued, showing last 24 of 130 entries)

[595]  arXiv:2512.03257 [pdf, ps, other]
Title: PyroFocus: A Deep Learning Approach to Real-Time Wildfire Detection in Multispectral Remote Sensing Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[596]  arXiv:2512.03247 [pdf, ps, other]
Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
Comments: Published in the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[597]  arXiv:2512.03245 [pdf, ps, other]
Title: 2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598]  arXiv:2512.03237 [pdf, ps, other]
Title: LLM-Guided Material Inference for 3D Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[599]  arXiv:2512.03233 [pdf, ps, other]
Title: Object Counting with GPT-4o and GPT-5: A Comparative Study
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600]  arXiv:2512.03210 [pdf, ps, other]
Title: Flux4D: Flow-based Unsupervised 4D Reconstruction
Comments: NeurIPS 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[601]  arXiv:2512.03199 [pdf, ps, other]
Title: Does Head Pose Correction Improve Biometric Facial Recognition?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602]  arXiv:2512.03182 [pdf, ps, other]
Title: Drainage: A Unifying Framework for Addressing Class Uncertainty
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[603]  arXiv:2512.03126 [pdf, ps, other]
Title: Hierarchical Process Reward Models are Symbolic Vision Learners
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[604]  arXiv:2512.04076 (cross-list from cs.GR) [pdf, ps, other]
Title: Radiance Meshes for Volumetric Reconstruction
Comments: Website: half-potato.gitlab.io/rm
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[605]  arXiv:2512.04032 (cross-list from cs.CL) [pdf, ps, other]
Title: Jina-VLM: Small Multilingual Vision Language Model
Comments: 18 pages, 1-7 main content, 13-18 appendix for tables and dataset
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606]  arXiv:2512.03995 (cross-list from cs.RO) [pdf, ps, other]
Title: Artificial Microsaccade Compensation: Stable Vision for an Ornithopter
Comments: 29 pages, 5 figures, 2 tables, under review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[607]  arXiv:2512.03962 (cross-list from eess.IV) [pdf, ps, other]
Title: Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction
Comments: 6 pages, 8 figures, 2025 Asilomar Conference on Signals, Systems, and Computers. Code is available at github.com/evanbell02/Tada-DIP/
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[608]  arXiv:2512.03656 (cross-list from cs.LG) [pdf, ps, other]
Title: Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[609]  arXiv:2512.03556 (cross-list from cs.RO) [pdf, ps, other]
Title: RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610]  arXiv:2512.03522 (cross-list from cs.RO) [pdf, ps, other]
Title: MSG-Loc: Multi-Label Likelihood-based Semantic Graph Matching for Object-Level Global Localization
Comments: Accepted in IEEE Robotics and Automation Letters (2025)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[611]  arXiv:2512.03514 (cross-list from cs.IR) [pdf, ps, other]
Title: M3DR: Towards Universal Multilingual Multimodal Document Retrieval
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612]  arXiv:2512.03422 (cross-list from cs.RO) [pdf, ps, other]
Title: What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[613]  arXiv:2512.03216 (cross-list from physics.ins-det) [pdf, ps, other]
Title: Kaleidoscopic Scintillation Event Imaging
Subjects: Instrumentation and Detectors (physics.ins-det); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[614]  arXiv:2512.03173 (cross-list from cs.CY) [pdf, ps, other]
Title: Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping
Journal-ref: AAAI 2026 Social Impact Track
Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[615]  arXiv:2512.03166 (cross-list from cs.RO) [pdf, ps, other]
Title: Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[616]  arXiv:2512.03111 (cross-list from q-bio.GN) [pdf, ps, other]
Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer
Comments: Accepted by AAAI 2026
Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617]  arXiv:2512.03054 (cross-list from cs.LG) [pdf, ps, other]
Title: Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research
Comments: 22 pages, 13 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)
[618]  arXiv:2512.03052 (cross-list from cs.GR) [pdf, ps, other]
Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale
Comments: Technical Report
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

Wed, 3 Dec 2025 (showing first 76 of 141 entries)

[619]  arXiv:2512.03046 [pdf, ps, other]
Title: MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues
Comments: Code and demo available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[620]  arXiv:2512.03045 [pdf, ps, other]
Title: CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621]  arXiv:2512.03043 [pdf, ps, other]
Title: OneThinker: All-in-one Reasoning Model for Image and Video
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622]  arXiv:2512.03042 [pdf, ps, other]
Title: PPTArena: A Benchmark for Agentic PowerPoint Editing
Comments: Project webpage: this https URL GitHub: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[623]  arXiv:2512.03041 [pdf, ps, other]
Title: MultiShotMaster: A Controllable Multi-Shot Video Generation Framework
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[624]  arXiv:2512.03040 [pdf, ps, other]
Title: Video4Spatial: Towards Visuospatial Intelligence with Context-Guided Video Generation
Comments: Project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[625]  arXiv:2512.03036 [pdf, ps, other]
Title: ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[626]  arXiv:2512.03034 [pdf, ps, other]
Title: MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation
Comments: Our project website is this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[627]  arXiv:2512.03020 [pdf, ps, other]
Title: Unrolled Networks are Conditional Probability Flows in MRI Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[628]  arXiv:2512.03018 [pdf, ps, other]
Title: AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry
Comments: Accepted to Siggraph Asia 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629]  arXiv:2512.03014 [pdf, ps, other]
Title: Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks
Comments: NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[630]  arXiv:2512.03013 [pdf, ps, other]
Title: In-Context Sync-LoRA for Portrait Video Editing
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[631]  arXiv:2512.03010 [pdf, ps, other]
Title: SurfFill: Completion of LiDAR Point Clouds via Gaussian Surfel Splatting
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)
[632]  arXiv:2512.03004 [pdf, ps, other]
Title: DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[633]  arXiv:2512.03000 [pdf, ps, other]
Title: DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[634]  arXiv:2512.02993 [pdf, ps, other]
Title: TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[635]  arXiv:2512.02991 [pdf, ps, other]
Title: GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[636]  arXiv:2512.02982 [pdf, ps, other]
Title: U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
Comments: Preprint; 19 pages, 7 figures, 8 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[637]  arXiv:2512.02981 [pdf, ps, other]
Title: InEx: Hallucination Mitigation via Introspection and Cross-Modal Multi-Agent Collaboration
Comments: Published in AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[638]  arXiv:2512.02973 [pdf, ps, other]
Title: Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Cryptography and Security (cs.CR)
[639]  arXiv:2512.02972 [pdf, ps, other]
Title: BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection
Comments: Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[640]  arXiv:2512.02965 [pdf, ps, other]
Title: A Lightweight Real-Time Low-Light Enhancement Network for Embedded Automotive Vision Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[641]  arXiv:2512.02952 [pdf, ps, other]
Title: Layout Anything: One Transformer for Universal Room Layout Estimation
Comments: Published at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[642]  arXiv:2512.02942 [pdf, ps, other]
Title: Benchmarking Scientific Understanding and Reasoning for Video Generation using VideoScience-Bench
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[643]  arXiv:2512.02933 [pdf, ps, other]
Title: LoVoRA: Text-guided and Mask-free Video Object Removal and Addition with Learnable Object-aware Localization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[644]  arXiv:2512.02932 [pdf, ps, other]
Title: EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[645]  arXiv:2512.02931 [pdf, ps, other]
Title: DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation
Comments: 23 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[646]  arXiv:2512.02906 [pdf, ps, other]
Title: MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[647]  arXiv:2512.02899 [pdf, ps, other]
Title: Glance: Accelerating Diffusion Models with 1 Sample
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[648]  arXiv:2512.02897 [pdf, ps, other]
Title: Polar Perspectives: Evaluating 2-D LiDAR Projections for Robust Place Recognition with Visual Foundation Models
Comments: 13 pages, 5 figures, 2 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[649]  arXiv:2512.02895 [pdf, ps, other]
Title: MindGPT-4ov: An Enhanced MLLM via a Multi-Stage Post-Training Paradigm
Comments: 33 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[650]  arXiv:2512.02870 [pdf, ps, other]
Title: Taming Camera-Controlled Video Generation with Verifiable Geometry Reward
Comments: 11 pages, 4 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[651]  arXiv:2512.02867 [pdf, ps, other]
Title: MICCAI STSR 2025 Challenge: Semi-Supervised Teeth and Pulp Segmentation and CBCT-IOS Registration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[652]  arXiv:2512.02860 [pdf, ps, other]
Title: RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association
Comments: Ranked 3rd in Fame 2026 Challenge, ICASSP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[653]  arXiv:2512.02850 [pdf, ps, other]
Title: Are Detectors Fair to Indian IP-AIGC? A Cross-Generator Study
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[654]  arXiv:2512.02846 [pdf, ps, other]
Title: Action Anticipation at a Glimpse: To What Extent Can Multimodal Cues Replace Video?
Comments: Accepted in WACV 2026 - Applications Track
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[655]  arXiv:2512.02835 [pdf, ps, other]
Title: ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[656]  arXiv:2512.02830 [pdf, ps, other]
Title: Defense That Attacks: How Robust Models Become Better Attackers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[657]  arXiv:2512.02794 [pdf, ps, other]
Title: PhyCustom: Towards Realistic Physical Customization in Text-to-Image Generation
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[658]  arXiv:2512.02793 [pdf, ps, other]
Title: IC-World: In-Context Generation for Shared World Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[659]  arXiv:2512.02792 [pdf, ps, other]
Title: HUD: Hierarchical Uncertainty-Aware Disambiguation Network for Composed Video Retrieval
Comments: Accepted by ACM MM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[660]  arXiv:2512.02790 [pdf, ps, other]
Title: UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits
Comments: 31 pages, 15 figures, 12 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661]  arXiv:2512.02789 [pdf, ps, other]
Title: TrackNetV5: Residual-Driven Spatio-Temporal Refinement and Motion Direction Decoupling for Fast Object Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[662]  arXiv:2512.02781 [pdf, ps, other]
Title: LumiX: Structured and Coherent Text-to-Intrinsic Generation
Comments: The code will be available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[663]  arXiv:2512.02780 [pdf, ps, other]
Title: Rethinking Surgical Smoke: A Smoke-Type-Aware Laparoscopic Video Desmoking Method and Dataset
Comments: 12 pages, 15 figures. Accepted to AAAI-26 (Main Technical Track)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664]  arXiv:2512.02751 [pdf, ps, other]
Title: AttMetNet: Attention-Enhanced Deep Neural Network for Methane Plume Detection in Sentinel-2 Satellite Imagery
Comments: 15 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665]  arXiv:2512.02743 [pdf, ps, other]
Title: Reasoning-Aware Multimodal Fusion for Hateful Video Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[666]  arXiv:2512.02737 [pdf, ps, other]
Title: Beyond Paired Data: Self-Supervised UAV Geo-Localization from Reference Imagery Alone
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[667]  arXiv:2512.02727 [pdf, ps, other]
Title: DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions
Comments: Accepted to WACV 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[668]  arXiv:2512.02715 [pdf, ps, other]
Title: GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669]  arXiv:2512.02702 [pdf, ps, other]
Title: Tissue-mask supported inter-subject whole-body image registration in the UK Biobank -- A method benchmarking study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[670]  arXiv:2512.02700 [pdf, ps, other]
Title: VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[671]  arXiv:2512.02697 [pdf, ps, other]
Title: GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[672]  arXiv:2512.02696 [pdf, ps, other]
Title: ALDI-ray: Adapting the ALDI Framework for Security X-ray Object Detection
Comments: Submitted to ICASSP 2026 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[673]  arXiv:2512.02686 [pdf, ps, other]
Title: ClimaOoD: Improving Anomaly Segmentation via Physically Realistic Synthetic Data
Authors: Yuxing Liu, Yong Liu
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674]  arXiv:2512.02685 [pdf, ps, other]
Title: Unsupervised Structural Scene Decomposition via Foreground-Aware Slot Attention with Pseudo-Mask Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675]  arXiv:2512.02681 [pdf, ps, other]
Title: PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676]  arXiv:2512.02668 [pdf, ps, other]
Title: UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677]  arXiv:2512.02664 [pdf, ps, other]
Title: PolarGuide-GSDR: 3D Gaussian Splatting Driven by Polarization Priors and Deferred Reflection for Real-World Reflective Scenes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678]  arXiv:2512.02660 [pdf, ps, other]
Title: Spatially-Grounded Document Retrieval via Patch-to-Region Relevance Propagation
Comments: 13 pages, 1 figure, 2 tables. Open-source implementation available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[679]  arXiv:2512.02650 [pdf, ps, other]
Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[680]  arXiv:2512.02648 [pdf, ps, other]
Title: PoreTrack3D: A Benchmark for Dynamic 3D Gaussian Splatting in Pore-Scale Facial Trajectory Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681]  arXiv:2512.02643 [pdf, ps, other]
Title: Leveraging Large-Scale Pretrained Spatial-Spectral Priors for General Zero-Shot Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682]  arXiv:2512.02624 [pdf, ps, other]
Title: PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[683]  arXiv:2512.02622 [pdf, ps, other]
Title: RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684]  arXiv:2512.02621 [pdf, ps, other]
Title: Content-Aware Texturing for Gaussian Splatting
Comments: Project Page: this https URL
Journal-ref: Eurographics Symposium on Rendering (Symposium Track), 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[685]  arXiv:2512.02576 [pdf, ps, other]
Title: Co-speech Gesture Video Generation via Motion-Based Graph Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686]  arXiv:2512.02566 [pdf, ps, other]
Title: From Panel to Pixel: Zoom-In Vision-Language Pretraining from Biomedical Scientific Literature
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[687]  arXiv:2512.02554 [pdf, ps, other]
Title: OmniPerson: Unified Identity-Preserving Pedestrian Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[688]  arXiv:2512.02541 [pdf, ps, other]
Title: AVGGT: Rethinking Global Attention for Accelerating VGGT
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[689]  arXiv:2512.02536 [pdf, ps, other]
Title: WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[690]  arXiv:2512.02520 [pdf, ps, other]
Title: On the Problem of Consistent Anomalies in Zero-Shot Anomaly Detection
Authors: Tai Le-Gia
Comments: PhD Dissertation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[691]  arXiv:2512.02517 [pdf, ps, other]
Title: SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[692]  arXiv:2512.02512 [pdf, ps, other]
Title: Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling
Comments: Accepted as a Tiny Paper at the 13th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2025), IIT Mandi, India. 3 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693]  arXiv:2512.02505 [pdf, ps, other]
Title: GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694]  arXiv:2512.02498 [pdf, ps, other]
Title: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)