Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 279

[ total of 778 entries: 1-50 | ... | 130-179 | 180-229 | 230-279 | 280-329 | 330-379 | 380-429 | 430-479 | ... | 730-778 ]

Thu, 4 Dec 2025 (continued, showing 50 of 130 entries)

[280]  arXiv:2512.03663 [pdf, ps, other]
Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification
Authors: Salim Khazem
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[281]  arXiv:2512.03643 [pdf, ps, other]
Title: Optical Context Compression Is Just (Bad) Autoencoding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[282]  arXiv:2512.03640 [pdf, ps, other]
Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms
Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[283]  arXiv:2512.03625 [pdf, ps, other]
Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2512.03621 [pdf, ps, other]
Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2512.03619 [pdf, ps, other]
Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2512.03601 [pdf, ps, other]
Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
Comments: Accepted to NeurIPS 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[287]  arXiv:2512.03598 [pdf, ps, other]
Title: Memory-Guided Point Cloud Completion for Dental Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2512.03597 [pdf, ps, other]
Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation
Comments: 6 pages, 4 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2512.03593 [pdf, ps, other]
Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2512.03592 [pdf, ps, other]
Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding
Authors: Guang Yang, Lei Fan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2512.03590 [pdf, ps, other]
Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2512.03580 [pdf, ps, other]
Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[293]  arXiv:2512.03577 [pdf, ps, other]
Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning
Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2512.03575 [pdf, ps, other]
Title: UniComp: Rethinking Video Compression Through Informational Uniqueness
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2512.03574 [pdf, ps, other]
Title: Global-Local Aware Scene Text Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2512.03566 [pdf, ps, other]
Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models
Comments: Accepted by ACM MM Asia2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[297]  arXiv:2512.03558 [pdf, ps, other]
Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding
Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[298]  arXiv:2512.03553 [pdf, ps, other]
Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching
Comments: Accepted at KDD 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299]  arXiv:2512.03542 [pdf, ps, other]
Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[300]  arXiv:2512.03540 [pdf, ps, other]
Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation
Comments: Accepted by ACM Multimedia 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[301]  arXiv:2512.03534 [pdf, ps, other]
Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
Comments: Visualizations are available at the website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[302]  arXiv:2512.03532 [pdf, ps, other]
Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2512.03520 [pdf, ps, other]
Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation
Comments: 15 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2512.03510 [pdf, ps, other]
Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[305]  arXiv:2512.03509 [pdf, ps, other]
Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2512.03508 [pdf, ps, other]
Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation
Comments: ICCV 2025 (poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2512.03500 [pdf, ps, other]
Title: EEA: Exploration-Exploitation Agent for Long Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2512.03499 [pdf, ps, other]
Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[309]  arXiv:2512.03479 [pdf, ps, other]
Title: Towards Object-centric Understanding for Instructional Videos
Authors: Wenliang Guo, Yu Kong
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2512.03477 [pdf, ps, other]
Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis
Comments: 10 pages, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[311]  arXiv:2512.03474 [pdf, ps, other]
Title: Procedural Mistake Detection via Action Effect Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2512.03470 [pdf, ps, other]
Title: Difference Decomposition Networks for Infrared Small Target Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2512.03463 [pdf, ps, other]
Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[314]  arXiv:2512.03454 [pdf, ps, other]
Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[315]  arXiv:2512.03453 [pdf, ps, other]
Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2512.03451 [pdf, ps, other]
Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[317]  arXiv:2512.03450 [pdf, ps, other]
Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[318]  arXiv:2512.03449 [src]
Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis
Authors: Tongxu Zhang
Comments: The manuscript represents only a preliminary and substantially incompleted exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore this version is withdrawn and should not be cited
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[319]  arXiv:2512.03445 [pdf, ps, other]
Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation
Comments: 10 pages. Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320]  arXiv:2512.03430 [pdf, ps, other]
Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features
Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321]  arXiv:2512.03427 [pdf, ps, other]
Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2512.03424 [pdf, ps, other]
Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2512.03418 [pdf, ps, other]
Title: YOLOA: Real-Time Affordance Detection via LLM Adapter
Comments: 13 pages, 9 figures, conference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[324]  arXiv:2512.03405 [pdf, ps, other]
Title: ViDiC: Video Difference Captioning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2512.03404 [pdf, ps, other]
Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2512.03370 [pdf, ps, other]
Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[327]  arXiv:2512.03369 [pdf, ps, other]
Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[328]  arXiv:2512.03359 [pdf, ps, other]
Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329]  arXiv:2512.03350 [pdf, ps, other]
Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)