
Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 500

[ total of 778 entries: 1-50 | ... | 351-400 | 401-450 | 451-500 | 501-550 | 551-600 | 601-650 | 651-700 | ... | 751-778 ]

Tue, 2 Dec 2025 (showing first 50 of 278 entries)

[501]  arXiv:2512.02018 [pdf, ps, other]
Title: Data-Centric Visual Development for Self-Driving Labs
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[502]  arXiv:2512.02017 [pdf, ps, other]
Title: Visual Sync: Multi-Camera Synchronization via Cross-View Object Motion
Comments: Accepted to NeurIPS 2025. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[503]  arXiv:2512.02016 [pdf, ps, other]
Title: Objects in Generated Videos Are Slower Than They Appear: Models Suffer Sub-Earth Gravity and Don't Know Galileo's Principle...for now
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504]  arXiv:2512.02015 [pdf, ps, other]
Title: Generative Video Motion Editing with 3D Point Tracks
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[505]  arXiv:2512.02014 [pdf, ps, other]
Title: TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506]  arXiv:2512.02012 [pdf, ps, other]
Title: Improved Mean Flows: On the Challenges of Fastforward Generative Models
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507]  arXiv:2512.02009 [pdf, ps, other]
Title: AirSim360: A Panoramic Simulation Platform within Drone View
Comments: Project Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508]  arXiv:2512.02006 [pdf, ps, other]
Title: MV-TAP: Tracking Any Point in Multi-View Videos
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[509]  arXiv:2512.02005 [pdf, ps, other]
Title: Learning Visual Affordance from Audio
Comments: 15 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[510]  arXiv:2512.01989 [pdf, ps, other]
Title: PAI-Bench: A Comprehensive Benchmark For Physical AI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[511]  arXiv:2512.01988 [pdf, ps, other]
Title: Artemis: Structured Visual Reasoning for Perception Policy Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[512]  arXiv:2512.01975 [pdf, ps, other]
Title: SGDiff: Scene Graph Guided Diffusion Model for Image Collaborative SegCaptioning
Comments: Accepted by AAAI-2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513]  arXiv:2512.01960 [pdf, ps, other]
Title: SpriteHand: Real-Time Versatile Hand-Object Interaction with Autoregressive Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[514]  arXiv:2512.01952 [pdf, ps, other]
Title: GrndCtrl: Grounding World Models via Self-Supervised Reward Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[515]  arXiv:2512.01949 [pdf, ps, other]
Title: Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models
Comments: Published in Transactions on Machine Learning Research, Project in this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516]  arXiv:2512.01934 [pdf, ps, other]
Title: Physical ID-Transfer Attacks against Multi-Object Tracking via Adversarial Trajectory
Comments: Accepted to Annual Computer Security Applications Conference (ACSAC) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[517]  arXiv:2512.01922 [pdf, ps, other]
Title: Med-VCD: Mitigating Hallucination for Medical Large Vision Language Models through Visual Contrastive Decoding
Journal-ref: Computers in Biology and Medicine (2026)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518]  arXiv:2512.01908 [pdf, ps, other]
Title: SARL: Spatially-Aware Self-Supervised Representation Learning for Visuo-Tactile Perception
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519]  arXiv:2512.01895 [pdf, ps, other]
Title: StyleYourSmile: Cross-Domain Face Retargeting Without Paired Multi-Style Data
Comments: 15 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520]  arXiv:2512.01889 [pdf, ps, other]
Title: KM-ViPE: Online Tightly Coupled Vision-Language-Geometry Fusion for Open-Vocabulary Semantic SLAM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521]  arXiv:2512.01885 [pdf, ps, other]
Title: TransientTrack: Advanced Multi-Object Tracking and Classification of Cancer Cells with Transient Fluorescent Signals
Comments: 13 pages, 7 figures, 2 tables. This work has been submitted to IEEE Transactions on Medical Imaging
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cell Behavior (q-bio.CB); Quantitative Methods (q-bio.QM)
[522]  arXiv:2512.01853 [pdf, ps, other]
Title: COACH: Collaborative Agents for Contextual Highlighting -- A Multi-Agent Framework for Sports Video Analysis
Comments: Accepted by AAAI 2026 Workshop LaMAS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523]  arXiv:2512.01850 [pdf, ps, other]
Title: Register Any Point: Scaling 3D Point Cloud Registration by Flow Matching
Comments: 22 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[524]  arXiv:2512.01843 [pdf, ps, other]
Title: PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
Comments: 17 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[525]  arXiv:2512.01830 [pdf, ps, other]
Title: OpenREAD: Reinforced Open-Ended Reasoning for End-to-End Autonomous Driving with LLM-as-Critic
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526]  arXiv:2512.01827 [pdf, ps, other]
Title: CauSight: Learning to Supersense for Visual Causal Discovery
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527]  arXiv:2512.01821 [pdf, ps, other]
Title: Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528]  arXiv:2512.01816 [pdf, ps, other]
Title: Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights
Comments: 35 pages, 12 figures, 10 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[529]  arXiv:2512.01803 [pdf, ps, other]
Title: Generative Action Tell-Tales: Assessing Human Motion in Synthesized Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530]  arXiv:2512.01789 [pdf, ps, other]
Title: SAM3-UNet: Simplified Adaptation of Segment Anything Model 3
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531]  arXiv:2512.01788 [pdf, ps, other]
Title: Learned Image Compression for Earth Observation: Implications for Downstream Segmentation Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532]  arXiv:2512.01774 [pdf, ps, other]
Title: Evaluating SAM2 for Video Semantic Segmentation
Comments: 17 pages, 3 figures and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533]  arXiv:2512.01771 [pdf, ps, other]
Title: Robust Rigid and Non-Rigid Medical Image Registration Using Learnable Edge Kernels
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[534]  arXiv:2512.01769 [pdf, ps, other]
Title: VideoScoop: A Non-Traditional Domain-Independent Framework For Video Analysis
Authors: Hafsa Billah
Comments: This is a report submitted as part of PhD proposal defense of Hafsa Billah
Subjects: Computer Vision and Pattern Recognition (cs.CV); Databases (cs.DB)
[535]  arXiv:2512.01763 [pdf, ps, other]
Title: HiconAgent: History Context-aware Policy Optimization for GUI Agents
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536]  arXiv:2512.01755 [pdf, ps, other]
Title: FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537]  arXiv:2512.01707 [pdf, ps, other]
Title: StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[538]  arXiv:2512.01701 [pdf, ps, other]
Title: SSR: Semantic and Spatial Rectification for CLIP-based Weakly Supervised Segmentation
Comments: Accepted in AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539]  arXiv:2512.01686 [pdf, ps, other]
Title: DreamingComics: A Story Visualization Pipeline via Subject and Layout Customized Generation using Video Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540]  arXiv:2512.01681 [pdf, ps, other]
Title: Cross-Domain Validation of a Resection-Trained Self-Supervised Model on Multicentre Mesothelioma Biopsies
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[541]  arXiv:2512.01677 [pdf, ps, other]
Title: Open-world Hand-Object Interaction Video Generation Based on Structure and Contact-aware Representation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542]  arXiv:2512.01675 [pdf, ps, other]
Title: GRASP: Guided Residual Adapters with Sample-wise Partitioning
Comments: 10 pages, 4 figures, 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543]  arXiv:2512.01665 [pdf, ps, other]
Title: Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[544]  arXiv:2512.01657 [pdf, ps, other]
Title: DB-KAUNet: An Adaptive Dual Branch Kolmogorov-Arnold UNet for Retinal Vessel Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545]  arXiv:2512.01643 [pdf, ps, other]
Title: ViT$^3$: Unlocking Test-Time Training in Vision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546]  arXiv:2512.01636 [pdf, ps, other]
Title: Generative Editing in the Joint Vision-Language Space for Zero-Shot Composed Image Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547]  arXiv:2512.01629 [pdf, ps, other]
Title: SPARK: Sim-ready Part-level Articulated Reconstruction with VLM Knowledge
Comments: Project page: this https URL. 17 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[548]  arXiv:2512.01611 [pdf, ps, other]
Title: Depth Matching Method Based on ShapeDTW for Oil-Based Mud Imager
Subjects: Computer Vision and Pattern Recognition (cs.CV); Geophysics (physics.geo-ph)
[549]  arXiv:2512.01589 [pdf, ps, other]
Title: Toward Content-based Indexing and Retrieval of Head and Neck CT with Abscess Segmentation
Comments: The 2025 IEEE International Conference on Content-Based Multimedia Indexing (IEEE CBMI)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550]  arXiv:2512.01582 [pdf, ps, other]
Title: RoleMotion: A Large-Scale Dataset towards Robust Scene-Specific Role-Playing Motion Synthesis with Fine-grained Descriptions
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
