Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 421

Total of 778 entries, showing entries 422-471 (50 per page).

Wed, 3 Dec 2025 (continued, showing 50 of 141 entries)

[422] arXiv:2512.02643 [pdf, ps, other]: Title: Leveraging Large-Scale Pretrained Spatial-Spectral Priors for General Zero-Shot Pansharpening

Authors: Yongchuan Cui, Peng Liu, Yi Zeng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2512.02624 [pdf, ps, other]: Title: PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding

Authors: Zheng Huang, Xukai Liu, Tianyu Hu, Kai Zhang, Ye Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2512.02622 [pdf, ps, other]: Title: RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence

Authors: Xuming He, Zehao Fan, Hengjia Li, Fan Zhuo, Hankun Xu, Senlin Cheng, Di Weng, Haifeng Liu, Can Ye, Boxi Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2512.02621 [pdf, ps, other]: Title: Content-Aware Texturing for Gaussian Splatting

Authors: Panagiotis Papantonakis, Georgios Kopanas, Fredo Durand, George Drettakis

Comments: Project Page: this https URL

Journal-ref: Eurographics Symposium on Rendering (Symposium Track), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[426] arXiv:2512.02576 [pdf, ps, other]: Title: Co-speech Gesture Video Generation via Motion-Based Graph Retrieval

Authors: Yafei Song, Peng Zhang, Bang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2512.02566 [pdf, ps, other]: Title: From Panel to Pixel: Zoom-In Vision-Language Pretraining from Biomedical Scientific Literature

Authors: Kun Yuan, Min Woo Sun, Zhen Chen, Alejandro Lozano, Xiangteng He, Shi Li, Nassir Navab, Xiaoxiao Sun, Nicolas Padoy, Serena Yeung-Levy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[428] arXiv:2512.02554 [pdf, ps, other]: Title: OmniPerson: Unified Identity-Preserving Pedestrian Generation

Authors: Changxiao Ma, Chao Yuan, Xincheng Shi, Yuzhuo Ma, Yongfei Zhang, Longkun Zhou, Yujia Zhang, Shangze Li, Yifan Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429] arXiv:2512.02541 [pdf, ps, other]: Title: AVGGT: Rethinking Global Attention for Accelerating VGGT

Authors: Xianbing Sun, Zhikai Zhu, Zhengyu Lou, Bo Yang, Jinyang Tang, Liqing Zhang, He Wang, Jianfu Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430] arXiv:2512.02536 [pdf, ps, other]: Title: WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens

Authors: Jian Yang, Dacheng Yin, Xiaoxuan He, Yong Li, Fengyun Rao, Jing Lyu, Wei Zhai, Yang Cao, Zheng-Jun Zha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[431] arXiv:2512.02520 [pdf, ps, other]: Title: On the Problem of Consistent Anomalies in Zero-Shot Anomaly Detection

Authors: Tai Le-Gia

Comments: PhD Dissertation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[432] arXiv:2512.02517 [pdf, ps, other]: Title: SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts

Authors: Jiaqi Liu, Ronghao Fu, Lang Sun, Haoran Liu, Xiao Yang, Weipeng Zhang, Xu Na, Zhuoran Duan, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[433] arXiv:2512.02512 [pdf, ps, other]: Title: Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling

Authors: Aditya Chaudhary, Prachet Dev Singh, Ankit Jha

Comments: Accepted as a Tiny Paper at the 13th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2025), IIT Mandi, India. 3 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[434] arXiv:2512.02505 [pdf, ps, other]: Title: GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding

Authors: Jiaqi Liu, Ronghao Fu, Haoran Liu, Lang Sun, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435] arXiv:2512.02498 [pdf, ps, other]: Title: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model

Authors: Yumeng Li, Guang Yang, Hao Liu, Bowen Wang, Colin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[436] arXiv:2512.02497 [pdf, ps, other]: Title: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation

Authors: Wenjing Yu, Shuo Jiang, Yifei Chen, Shuo Chang, Yuanhan Wang, Beining Wu, Jie Dong, Mingxuan Liu, Shenghao Zhu, Feiwei Qin, Changmiao Wang, Qiyuan Tian

Comments: 45 pages, 18 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437] arXiv:2512.02496 [pdf, ps, other]: Title: Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration

Authors: Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki

Comments: 16 pages, 9 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[438] arXiv:2512.02492 [pdf, ps, other]: Title: YingVideo-MV: Music-Driven Multi-Stage Video Generation

Authors: Jiahui Chen, Weida Wang, Runhua Shi, Huan Yang, Chaofan Ding, Zihao Chen

Comments: 18 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439] arXiv:2512.02487 [pdf, ps, other]: Title: Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding

Authors: Yerim Jeon, Miso Lee, WonJun Moon, Jae-Pil Heo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[440] arXiv:2512.02485 [pdf, ps, other]: Title: UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making

Authors: Qianhan Feng, Zhongzhen Huang, Yakun Zhu, Xiaofan Zhang, Qi Dou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[441] arXiv:2512.02482 [pdf, ps, other]: Title: G-SHARP: Gaussian Surgical Hardware Accelerated Real-time Pipeline

Authors: Vishwesh Nath, Javier G. Tejero, Ruilong Li, Filippo Filicori, Mahdi Azizian, Sean D. Huver

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[442] arXiv:2512.02473 [pdf, ps, other]: Title: WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling

Authors: Yuta Oshima, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[443] arXiv:2512.02469 [pdf, ps, other]: Title: TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution

Authors: Fengli Ran, Xiao Pu, Bo Liu, Xiuli Bi, Bin Xiao

Comments: Accepted in AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[444] arXiv:2512.02458 [pdf, ps, other]: Title: Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration

Authors: Zhongyi Cai, Yi Du, Chen Wang, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445] arXiv:2512.02457 [pdf, ps, other]: Title: Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

Authors: Jianzong Wu, Hao Lian, Dachao Hao, Ye Tian, Qingyu Shi, Biaolong Chen, Hao Jiang, Yunhai Tong

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446] arXiv:2512.02456 [pdf, ps, other]: Title: See, Think, Learn: A Self-Taught Multimodal Reasoner

Authors: Sourabh Sharma, Sonam Gupta, Sadbhawna

Comments: Winter Conference on Applications of Computer Vision 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[447] arXiv:2512.02453 [pdf, ps, other]: Title: ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation

Authors: Kerui Chen, Jianrong Zhang, Ming Li, Zhonglong Zheng, Hehe Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[448] arXiv:2512.02450 [pdf, ps, other]: Title: HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild

Authors: Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas

Comments: NeurIPS 2025 (Datasets and Benchmarks Track) Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[449] arXiv:2512.02448 [pdf, ps, other]: Title: nuScenes Revisited: Progress and Challenges in Autonomous Driving

Authors: Whye Kit Fong, Venice Erin Liong, Kok Seang Tan, Holger Caesar

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[450] arXiv:2512.02447 [pdf, ps, other]: Title: Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors

Authors: Fan Luo, Zeyu Gao, Xinhao Luo, Kai Zhao, Yanfeng Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451] arXiv:2512.02441 [pdf, ps, other]: Title: Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation

Authors: Junghwan Park, Woojin Cho, Junhyuk Heo, Darongsae Kwon, Kookjin Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[452] arXiv:2512.02438 [pdf, ps, other]: Title: Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources

Authors: Phuc Pham, Nhu Pham, Ngoc Quoc Ly

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[453] arXiv:2512.02437 [pdf, ps, other]: Title: LightHCG: a Lightweight yet powerful HSIC Disentanglement based Causal Glaucoma Detection Model framework

Authors: Daeyoung Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[454] arXiv:2512.02425 [pdf, ps, other]: Title: WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning

Authors: Woongyeong Yeo, Kangsan Kim, Jaehong Yoon, Sung Ju Hwang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[455] arXiv:2512.02423 [pdf, ps, other]: Title: GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning

Authors: Haolong Yan, Yeqing Shen, Xin Huang, Jia Wang, Kaijun Tan, Zhixuan Liang, Hongxin Li, Zheng Ge, Osamu Yoshie, Si Li, Xiangyu Zhang, Daxin Jiang

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456] arXiv:2512.02421 [pdf, ps, other]: Title: Generalizing Vision-Language Models with Dedicated Prompt Guidance

Authors: Xinyao Li, Yinjie Min, Hongbo Chen, Zhekai Du, Fengling Li, Jingjing Li

Comments: Accepted to AAAI26

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[457] arXiv:2512.02413 [pdf, ps, other]: Title: MitUNet: Enhancing Floor Plan Recognition using a Hybrid Mix-Transformer and U-Net Architecture

Authors: Dmitriy Parashchuk, Alexey Kapshitskiy, Yuriy Karyakin

Comments: 9 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458] arXiv:2512.02405 [pdf, ps, other]: Title: WISE: Weighted Iterative Society-of-Experts for Robust Multimodal Multi-Agent Debate

Authors: Anoop Cherian, River Doyle, Eyal Ben-Dov, Suhas Lohit, Kuan-Chuan Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[459] arXiv:2512.02400 [pdf, ps, other]: Title: Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation

Authors: Wentao Xiang, Haokang Zhang, Tianhang Yang, Zedong Chu, Ruihang Chu, Shichao Xie, Yujian Yuan, Jian Sun, Zhining Gu, Junjie Wang, Xiaolong Wu, Mu Xu, Yujiu Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460] arXiv:2512.02395 [pdf, ps, other]: Title: Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch

Authors: Yifan Zhang, Liang Hu, Haofeng Sun, Peiyu Wang, Yichen Wei, Shukang Yin, Jiangbo Pei, Wei Shen, Peng Xia, Yi Peng, Tianyidan Xie, Eric Li, Yang Liu, Xuchen Song, Yahui Zhou

Comments: 21 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461] arXiv:2512.02394 [pdf, ps, other]: Title: Reproducing and Extending RaDelft 4D Radar with Camera-Assisted Labels

Authors: Kejia Hu, Mohammed Alsakabi, John M. Dolan, Ozan K. Tonguz

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462] arXiv:2512.02392 [pdf, ps, other]: Title: From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking

Authors: Yuqing Shao, Yuchen Yang, Rui Yu, Weilong Li, Xu Guo, Huaicheng Yan, Wei Wang, Xiao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463] arXiv:2512.02375 [pdf, ps, other]: Title: On-the-fly Feedback SfM: Online Explore-and-Exploit UAV Photogrammetry with Incremental Mesh Quality-Aware Indicator and Predictive Path Planning

Authors: Liyuan Lou, Wanyun Li, Wentian Gan, Yifei Yu, Tengfei Wang, Xin Wang, Zongqian Zhan

Comments: This work was submitted to IEEE GRSM Journal for consideration. Copyright would be transferred once it gets accepted

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464] arXiv:2512.02369 [pdf, ps, other]: Title: SAGE: Style-Adaptive Generalization for Privacy-Constrained Semantic Segmentation Across Domains

Authors: Qingmei Li, Yang Zhang, Peifeng Zhang, Haohuan Fu, Juepeng Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[465] arXiv:2512.02368 [pdf, ps, other]: Title: Multi-Domain Enhanced Map-Free Trajectory Prediction with Selective Attention

Authors: Wenyi Xiong, Jian Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466] arXiv:2512.02364 [pdf, ps, other]: Title: Tackling Tuberculosis: A Comparative Dive into Machine Learning for Tuberculosis Detection

Authors: Daanish Hindustani, Sanober Hindustani, Preston Nguyen

Journal-ref: Vol. 6, No. 1 (2024), Minnesota Undergraduate Research & Academic Journal (MURAJ)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[467] arXiv:2512.02361 [pdf, ps, other]: Title: VACoT: Rethinking Visual Data Augmentation with VLMs

Authors: Zhengzhuo Xu, Chong Sun, SiNan Du, Chen Li, Jing Lyu, Chun Yuan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[468] arXiv:2512.02359 [pdf, ps, other]: Title: WSCF-MVCC: Weakly-supervised Calibration-free Multi-view Crowd Counting

Authors: Bin Li, Daijie Chen, Qi Zhang

Comments: PRCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469] arXiv:2512.02351 [pdf, ps, other]: Title: Understanding and Harnessing Sparsity in Unified Multimodal Models

Authors: Shwai He, Chaorui Deng, Ang Li, Shen Yan

Comments: 13 pages, 13 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470] arXiv:2512.02344 [pdf, ps, other]: Title: A multi-weight self-matching visual explanation for CNNs on SAR images

Authors: Siyuan Sun, Yongping Zhang, Hongcheng Zeng, Yamin Wang, Wei Yang, Wanting Yang, Jie Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[471] arXiv:2512.02341 [pdf, ps, other]: Title: TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction

Authors: Fengyi Zhang, Tianjun Zhang, Kasra Khosoussi, Zheng Zhang, Zi Huang, Yadan Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)

