Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 664

Wed, 3 Dec 2025 (continued, showing 50 of 141 entries)

[665] arXiv:2512.02743 [pdf, ps, other]: Title: Reasoning-Aware Multimodal Fusion for Hateful Video Detection

Authors: Shuonan Yang, Tailin Chen, Jiangbei Yue, Guangliang Cheng, Jianbo Jiao, Zeyu Fu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[666] arXiv:2512.02737 [pdf, ps, other]: Title: Beyond Paired Data: Self-Supervised UAV Geo-Localization from Reference Imagery Alone

Authors: Tristan Amadei, Enric Meinhardt-Llopis, Benedicte Bascle, Corentin Abgrall, Gabriele Facciolo

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[667] arXiv:2512.02727 [pdf, ps, other]: Title: DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions

Authors: Yifan Zhou, Takehiko Ohkawa, Guwenxiao Zhou, Kanoko Goto, Takumi Hirose, Yusuke Sekikawa, Nakamasa Inoue

Comments: Accepted to WACV 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[668] arXiv:2512.02715 [pdf, ps, other]: Title: GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding

Authors: Peirong Zhang, Yidan Zhang, Luxiao Xu, Jinliang Lin, Zonghao Guo, Fengxiang Wang, Xue Yang, Kaiwen Wei, Lei Wang

Comments: 11 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669] arXiv:2512.02702 [pdf, ps, other]: Title: Tissue-mask supported inter-subject whole-body image registration in the UK Biobank -- A method benchmarking study

Authors: Yasemin Utkueri, Elin Lundström, Håkan Ahlström, Johan Öfverstedt, Joel Kullberg

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[670] arXiv:2512.02700 [pdf, ps, other]: Title: VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm

Authors: Zhenkai Wu, Xiaowen Ma, Zhenliang Ni, Dengming Zhang, Han Shu, Xin Jiang, Xinghao Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[671] arXiv:2512.02697 [pdf, ps, other]: Title: GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization

Authors: Zixuan Song, Jing Zhang, Di Wang, Zidie Zhou, Wenbin Liu, Haonan Guo, En Wang, Bo Du

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[672] arXiv:2512.02696 [pdf, ps, other]: Title: ALDI-ray: Adapting the ALDI Framework for Security X-ray Object Detection

Authors: Omid Reza Heidari, Yang Wang, Xinxin Zuo

Comments: Submitted to ICASSP 2026 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[673] arXiv:2512.02686 [pdf, ps, other]: Title: ClimaOoD: Improving Anomaly Segmentation via Physically Realistic Synthetic Data

Authors: Yuxing Liu, Yong Liu

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674] arXiv:2512.02685 [pdf, ps, other]: Title: Unsupervised Structural Scene Decomposition via Foreground-Aware Slot Attention with Pseudo-Mask Guidance

Authors: Huankun Sheng, Ming Li, Yixiang Wei, Yeying Fan, Yu-Hui Wen, Tieliang Gong, Yong-Jin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675] arXiv:2512.02681 [pdf, ps, other]: Title: PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution

Authors: Zhongbao Yang, Jiangxin Dong, Yazhou Yao, Jinhui Tang, Jinshan Pan

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676] arXiv:2512.02668 [pdf, ps, other]: Title: UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking

Authors: Qionglin Ren, Dawei Zhang, Chunxu Tian, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677] arXiv:2512.02664 [pdf, ps, other]: Title: PolarGuide-GSDR: 3D Gaussian Splatting Driven by Polarization Priors and Deferred Reflection for Real-World Reflective Scenes

Authors: Derui Shan, Qian Qiao, Hao Lu, Tao Du, Peng Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678] arXiv:2512.02660 [pdf, ps, other]: Title: Spatially-Grounded Document Retrieval via Patch-to-Region Relevance Propagation

Authors: Agathoklis Georgiou

Comments: 13 pages, 1 figure, 2 tables. Open-source implementation available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[679] arXiv:2512.02650 [pdf, ps, other]: Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation

Authors: Junwon Lee, Juhan Nam, Jiyoung Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[680] arXiv:2512.02648 [pdf, ps, other]: Title: PoreTrack3D: A Benchmark for Dynamic 3D Gaussian Splatting in Pore-Scale Facial Trajectory Tracking

Authors: Dong Li, Jiahao Xiong, Yingda Huang, Le Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681] arXiv:2512.02643 [pdf, ps, other]: Title: Leveraging Large-Scale Pretrained Spatial-Spectral Priors for General Zero-Shot Pansharpening

Authors: Yongchuan Cui, Peng Liu, Yi Zeng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682] arXiv:2512.02624 [pdf, ps, other]: Title: PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding

Authors: Zheng Huang, Xukai Liu, Tianyu Hu, Kai Zhang, Ye Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[683] arXiv:2512.02622 [pdf, ps, other]: Title: RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence

Authors: Xuming He, Zehao Fan, Hengjia Li, Fan Zhuo, Hankun Xu, Senlin Cheng, Di Weng, Haifeng Liu, Can Ye, Boxi Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684] arXiv:2512.02621 [pdf, ps, other]: Title: Content-Aware Texturing for Gaussian Splatting

Authors: Panagiotis Papantonakis, Georgios Kopanas, Fredo Durand, George Drettakis

Comments: Project Page: this https URL

Journal-ref: Eurographics Symposium on Rendering (Symposium Track), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[685] arXiv:2512.02576 [pdf, ps, other]: Title: Co-speech Gesture Video Generation via Motion-Based Graph Retrieval

Authors: Yafei Song, Peng Zhang, Bang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686] arXiv:2512.02566 [pdf, ps, other]: Title: From Panel to Pixel: Zoom-In Vision-Language Pretraining from Biomedical Scientific Literature

Authors: Kun Yuan, Min Woo Sun, Zhen Chen, Alejandro Lozano, Xiangteng He, Shi Li, Nassir Navab, Xiaoxiao Sun, Nicolas Padoy, Serena Yeung-Levy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[687] arXiv:2512.02554 [pdf, ps, other]: Title: OmniPerson: Unified Identity-Preserving Pedestrian Generation

Authors: Changxiao Ma, Chao Yuan, Xincheng Shi, Yuzhuo Ma, Yongfei Zhang, Longkun Zhou, Yujia Zhang, Shangze Li, Yifan Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[688] arXiv:2512.02541 [pdf, ps, other]: Title: AVGGT: Rethinking Global Attention for Accelerating VGGT

Authors: Xianbing Sun, Zhikai Zhu, Zhengyu Lou, Bo Yang, Jinyang Tang, Liqing Zhang, He Wang, Jianfu Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[689] arXiv:2512.02536 [pdf, ps, other]: Title: WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens

Authors: Jian Yang, Dacheng Yin, Xiaoxuan He, Yong Li, Fengyun Rao, Jing Lyu, Wei Zhai, Yang Cao, Zheng-Jun Zha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[690] arXiv:2512.02520 [pdf, ps, other]: Title: On the Problem of Consistent Anomalies in Zero-Shot Anomaly Detection

Authors: Tai Le-Gia

Comments: PhD Dissertation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[691] arXiv:2512.02517 [pdf, ps, other]: Title: SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts

Authors: Jiaqi Liu, Ronghao Fu, Lang Sun, Haoran Liu, Xiao Yang, Weipeng Zhang, Xu Na, Zhuoran Duan, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[692] arXiv:2512.02512 [pdf, ps, other]: Title: Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling

Authors: Aditya Chaudhary, Prachet Dev Singh, Ankit Jha

Comments: Accepted as a Tiny Paper at the 13th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2025), IIT Mandi, India. 3 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693] arXiv:2512.02505 [pdf, ps, other]: Title: GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding

Authors: Jiaqi Liu, Ronghao Fu, Haoran Liu, Lang Sun, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694] arXiv:2512.02498 [pdf, ps, other]: Title: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model

Authors: Yumeng Li, Guang Yang, Hao Liu, Bowen Wang, Colin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[695] arXiv:2512.02497 [pdf, ps, other]: Title: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation

Authors: Wenjing Yu, Shuo Jiang, Yifei Chen, Shuo Chang, Yuanhan Wang, Beining Wu, Jie Dong, Mingxuan Liu, Shenghao Zhu, Feiwei Qin, Changmiao Wang, Qiyuan Tian

Comments: 45 pages, 18 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696] arXiv:2512.02496 [pdf, ps, other]: Title: Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration

Authors: Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki

Comments: 16 pages, 9 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[697] arXiv:2512.02492 [pdf, ps, other]: Title: YingVideo-MV: Music-Driven Multi-Stage Video Generation

Authors: Jiahui Chen, Weida Wang, Runhua Shi, Huan Yang, Chaofan Ding, Zihao Chen

Comments: 18 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698] arXiv:2512.02487 [pdf, ps, other]: Title: Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding

Authors: Yerim Jeon, Miso Lee, WonJun Moon, Jae-Pil Heo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[699] arXiv:2512.02485 [pdf, ps, other]: Title: UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making

Authors: Qianhan Feng, Zhongzhen Huang, Yakun Zhu, Xiaofan Zhang, Qi Dou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[700] arXiv:2512.02482 [pdf, ps, other]: Title: G-SHARP: Gaussian Surgical Hardware Accelerated Real-time Pipeline

Authors: Vishwesh Nath, Javier G. Tejero, Ruilong Li, Filippo Filicori, Mahdi Azizian, Sean D. Huver

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[701] arXiv:2512.02473 [pdf, ps, other]: Title: WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling

Authors: Yuta Oshima, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[702] arXiv:2512.02469 [pdf, ps, other]: Title: TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution

Authors: Fengli Ran, Xiao Pu, Bo Liu, Xiuli Bi, Bin Xiao

Comments: Accepted in AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703] arXiv:2512.02458 [pdf, ps, other]: Title: Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration

Authors: Zhongyi Cai, Yi Du, Chen Wang, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[704] arXiv:2512.02457 [pdf, ps, other]: Title: Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

Authors: Jianzong Wu, Hao Lian, Dachao Hao, Ye Tian, Qingyu Shi, Biaolong Chen, Hao Jiang, Yunhai Tong

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[705] arXiv:2512.02456 [pdf, ps, other]: Title: See, Think, Learn: A Self-Taught Multimodal Reasoner

Authors: Sourabh Sharma, Sonam Gupta, Sadbhawna

Comments: Winter Conference on Applications of Computer Vision 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[706] arXiv:2512.02453 [pdf, ps, other]: Title: ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation

Authors: Kerui Chen, Jianrong Zhang, Ming Li, Zhonglong Zheng, Hehe Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[707] arXiv:2512.02450 [pdf, ps, other]: Title: HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild

Authors: Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas

Comments: NeurIPS 2025 (Datasets and Benchmarks Track). Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[708] arXiv:2512.02448 [pdf, ps, other]: Title: nuScenes Revisited: Progress and Challenges in Autonomous Driving

Authors: Whye Kit Fong, Venice Erin Liong, Kok Seang Tan, Holger Caesar

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[709] arXiv:2512.02447 [pdf, ps, other]: Title: Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors

Authors: Fan Luo, Zeyu Gao, Xinhao Luo, Kai Zhao, Yanfeng Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[710] arXiv:2512.02441 [pdf, ps, other]: Title: Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation

Authors: Junghwan Park, Woojin Cho, Junhyuk Heo, Darongsae Kwon, Kookjin Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[711] arXiv:2512.02438 [pdf, ps, other]: Title: Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources

Authors: Phuc Pham, Nhu Pham, Ngoc Quoc Ly

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[712] arXiv:2512.02437 [pdf, ps, other]: Title: LightHCG: a Lightweight yet powerful HSIC Disentanglement based Causal Glaucoma Detection Model framework

Authors: Daeyoung Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[713] arXiv:2512.02425 [pdf, ps, other]: Title: WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning

Authors: Woongyeong Yeo, Kangsan Kim, Jaehong Yoon, Sung Ju Hwang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[714] arXiv:2512.02423 [pdf, ps, other]: Title: GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning

Authors: Haolong Yan, Yeqing Shen, Xin Huang, Jia Wang, Kaijun Tan, Zhixuan Liang, Hongxin Li, Zheng Ge, Osamu Yoshie, Si Li, Xiangyu Zhang, Daxin Jiang

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
