Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 854

[ total of 759 entries: 1-100 | ... | 360-459 | 460-559 | 560-659 | 660-759 ]

Wed, 3 Dec 2025 (continued, showing last 100 of 141 entries)

[660] arXiv:2512.02790 [pdf, ps, other]: Title: UnicEdit-10M: A Dataset and Benchmark Breaking the Scale-Quality Barrier via Unified Verification for Reasoning-Enriched Edits

Authors: Keming Ye, Zhipeng Huang, Canmiao Fu, Qingyang Liu, Jiani Cai, Zheqi Lv, Chen Li, Jing Lyu, Zhou Zhao, Shengyu Zhang

Comments: 31 pages, 15 figures, 12 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661] arXiv:2512.02789 [pdf, ps, other]: Title: TrackNetV5: Residual-Driven Spatio-Temporal Refinement and Motion Direction Decoupling for Fast Object Tracking

Authors: Tang Haonan, Chen Yanjun, Jiang Lezhi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[662] arXiv:2512.02781 [pdf, ps, other]: Title: LumiX: Structured and Coherent Text-to-Intrinsic Generation

Authors: Xu Han, Biao Zhang, Xiangjun Tang, Xianzhi Li, Peter Wonka

Comments: The code will be available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[663] arXiv:2512.02780 [pdf, ps, other]: Title: Rethinking Surgical Smoke: A Smoke-Type-Aware Laparoscopic Video Desmoking Method and Dataset

Authors: Qifan Liang, Junlin Li, Zhen Han, Xihao Wang, Zhongyuan Wang, Bin Mei

Comments: 12 pages, 15 figures. Accepted to AAAI-26 (Main Technical Track)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664] arXiv:2512.02751 [pdf, ps, other]: Title: AttMetNet: Attention-Enhanced Deep Neural Network for Methane Plume Detection in Sentinel-2 Satellite Imagery

Authors: Rakib Ahsan, MD Sadik Hossain Shanto, Md Sultanul Arifin, Tanzima Hashem

Comments: 15 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665] arXiv:2512.02743 [pdf, ps, other]: Title: Reasoning-Aware Multimodal Fusion for Hateful Video Detection

Authors: Shuonan Yang, Tailin Chen, Jiangbei Yue, Guangliang Cheng, Jianbo Jiao, Zeyu Fu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[666] arXiv:2512.02737 [pdf, ps, other]: Title: Beyond Paired Data: Self-Supervised UAV Geo-Localization from Reference Imagery Alone

Authors: Tristan Amadei, Enric Meinhardt-Llopis, Benedicte Bascle, Corentin Abgrall, Gabriele Facciolo

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[667] arXiv:2512.02727 [pdf, ps, other]: Title: DF-Mamba: Deformable State Space Modeling for 3D Hand Pose Estimation in Interactions

Authors: Yifan Zhou, Takehiko Ohkawa, Guwenxiao Zhou, Kanoko Goto, Takumi Hirose, Yusuke Sekikawa, Nakamasa Inoue

Comments: Accepted to WACV 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[668] arXiv:2512.02715 [pdf, ps, other]: Title: GeoViS: Geospatially Rewarded Visual Search for Remote Sensing Visual Grounding

Authors: Peirong Zhang, Yidan Zhang, Luxiao Xu, Jinliang Lin, Zonghao Guo, Fengxiang Wang, Xue Yang, Kaiwen Wei, Lei Wang

Comments: 11 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669] arXiv:2512.02702 [pdf, ps, other]: Title: Tissue-mask supported inter-subject whole-body image registration in the UK Biobank -- A method benchmarking study

Authors: Yasemin Utkueri, Elin Lundström, Håkan Ahlström, Johan Öfverstedt, Joel Kullberg

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[670] arXiv:2512.02700 [pdf, ps, other]: Title: VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm

Authors: Zhenkai Wu, Xiaowen Ma, Zhenliang Ni, Dengming Zhang, Han Shu, Xin Jiang, Xinghao Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[671] arXiv:2512.02697 [pdf, ps, other]: Title: GeoBridge: A Semantic-Anchored Multi-View Foundation Model Bridging Images and Text for Geo-Localization

Authors: Zixuan Song, Jing Zhang, Di Wang, Zidie Zhou, Wenbin Liu, Haonan Guo, En Wang, Bo Du

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[672] arXiv:2512.02696 [pdf, ps, other]: Title: ALDI-ray: Adapting the ALDI Framework for Security X-ray Object Detection

Authors: Omid Reza Heidari, Yang Wang, Xinxin Zuo

Comments: Submitted to ICASSP 2026 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[673] arXiv:2512.02686 [pdf, ps, other]: Title: ClimaOoD: Improving Anomaly Segmentation via Physically Realistic Synthetic Data

Authors: Yuxing Liu, Yong Liu

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674] arXiv:2512.02685 [pdf, ps, other]: Title: Unsupervised Structural Scene Decomposition via Foreground-Aware Slot Attention with Pseudo-Mask Guidance

Authors: Huankun Sheng, Ming Li, Yixiang Wei, Yeying Fan, Yu-Hui Wen, Tieliang Gong, Yong-Jin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675] arXiv:2512.02681 [pdf, ps, other]: Title: PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution

Authors: Zhongbao Yang, Jiangxin Dong, Yazhou Yao, Jinhui Tang, Jinshan Pan

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676] arXiv:2512.02668 [pdf, ps, other]: Title: UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking

Authors: Qionglin Ren, Dawei Zhang, Chunxu Tian, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677] arXiv:2512.02664 [pdf, ps, other]: Title: PolarGuide-GSDR: 3D Gaussian Splatting Driven by Polarization Priors and Deferred Reflection for Real-World Reflective Scenes

Authors: Derui Shan, Qian Qiao, Hao Lu, Tao Du, Peng Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678] arXiv:2512.02660 [pdf, ps, other]: Title: Spatially-Grounded Document Retrieval via Patch-to-Region Relevance Propagation

Authors: Agathoklis Georgiou

Comments: 13 pages, 1 figure, 2 tables. Open-source implementation available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[679] arXiv:2512.02650 [pdf, ps, other]: Title: Hear What Matters! Text-conditioned Selective Video-to-Audio Generation

Authors: Junwon Lee, Juhan Nam, Jiyoung Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[680] arXiv:2512.02648 [pdf, ps, other]: Title: PoreTrack3D: A Benchmark for Dynamic 3D Gaussian Splatting in Pore-Scale Facial Trajectory Tracking

Authors: Dong Li, Jiahao Xiong, Yingda Huang, Le Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681] arXiv:2512.02643 [pdf, ps, other]: Title: Leveraging Large-Scale Pretrained Spatial-Spectral Priors for General Zero-Shot Pansharpening

Authors: Yongchuan Cui, Peng Liu, Yi Zeng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682] arXiv:2512.02624 [pdf, ps, other]: Title: PPTBench: Towards Holistic Evaluation of Large Language Models for PowerPoint Layout and Design Understanding

Authors: Zheng Huang, Xukai Liu, Tianyu Hu, Kai Zhang, Ye Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[683] arXiv:2512.02622 [pdf, ps, other]: Title: RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence

Authors: Xuming He, Zehao Fan, Hengjia Li, Fan Zhuo, Hankun Xu, Senlin Cheng, Di Weng, Haifeng Liu, Can Ye, Boxi Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684] arXiv:2512.02621 [pdf, ps, other]: Title: Content-Aware Texturing for Gaussian Splatting

Authors: Panagiotis Papantonakis, Georgios Kopanas, Fredo Durand, George Drettakis

Comments: Project Page: this https URL

Journal-ref: Eurographics Symposium on Rendering (Symposium Track), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[685] arXiv:2512.02576 [pdf, ps, other]: Title: Co-speech Gesture Video Generation via Motion-Based Graph Retrieval

Authors: Yafei Song, Peng Zhang, Bang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686] arXiv:2512.02566 [pdf, ps, other]: Title: From Panel to Pixel: Zoom-In Vision-Language Pretraining from Biomedical Scientific Literature

Authors: Kun Yuan, Min Woo Sun, Zhen Chen, Alejandro Lozano, Xiangteng He, Shi Li, Nassir Navab, Xiaoxiao Sun, Nicolas Padoy, Serena Yeung-Levy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[687] arXiv:2512.02554 [pdf, ps, other]: Title: OmniPerson: Unified Identity-Preserving Pedestrian Generation

Authors: Changxiao Ma, Chao Yuan, Xincheng Shi, Yuzhuo Ma, Yongfei Zhang, Longkun Zhou, Yujia Zhang, Shangze Li, Yifan Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[688] arXiv:2512.02541 [pdf, ps, other]: Title: AVGGT: Rethinking Global Attention for Accelerating VGGT

Authors: Xianbing Sun, Zhikai Zhu, Zhengyu Lou, Bo Yang, Jinyang Tang, Liqing Zhang, He Wang, Jianfu Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[689] arXiv:2512.02536 [pdf, ps, other]: Title: WeMMU: Enhanced Bridging of Vision-Language Models and Diffusion Models via Noisy Query Tokens

Authors: Jian Yang, Dacheng Yin, Xiaoxuan He, Yong Li, Fengyun Rao, Jing Lyu, Wei Zhai, Yang Cao, Zheng-Jun Zha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[690] arXiv:2512.02520 [pdf, ps, other]: Title: On the Problem of Consistent Anomalies in Zero-Shot Anomaly Detection

Authors: Tai Le-Gia

Comments: PhD Dissertation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[691] arXiv:2512.02517 [pdf, ps, other]: Title: SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts

Authors: Jiaqi Liu, Ronghao Fu, Lang Sun, Haoran Liu, Xiao Yang, Weipeng Zhang, Xu Na, Zhuoran Duan, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[692] arXiv:2512.02512 [pdf, ps, other]: Title: Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling

Authors: Aditya Chaudhary, Prachet Dev Singh, Ankit Jha

Comments: Accepted as a Tiny Paper at the 13th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2025), IIT Mandi, India. 3 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693] arXiv:2512.02505 [pdf, ps, other]: Title: GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding

Authors: Jiaqi Liu, Ronghao Fu, Haoran Liu, Lang Sun, Bo Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694] arXiv:2512.02498 [pdf, ps, other]: Title: dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model

Authors: Yumeng Li, Guang Yang, Hao Liu, Bowen Wang, Colin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[695] arXiv:2512.02497 [pdf, ps, other]: Title: A Large Scale Benchmark for Test Time Adaptation Methods in Medical Image Segmentation

Authors: Wenjing Yu, Shuo Jiang, Yifei Chen, Shuo Chang, Yuanhan Wang, Beining Wu, Jie Dong, Mingxuan Liu, Shenghao Zhu, Feiwei Qin, Changmiao Wang, Qiyuan Tian

Comments: 45 pages, 18 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696] arXiv:2512.02496 [pdf, ps, other]: Title: Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration

Authors: Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki

Comments: 16 pages, 9 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[697] arXiv:2512.02492 [pdf, ps, other]: Title: YingVideo-MV: Music-Driven Multi-Stage Video Generation

Authors: Jiahui Chen, Weida Wang, Runhua Shi, Huan Yang, Chaofan Ding, Zihao Chen

Comments: 18 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698] arXiv:2512.02487 [pdf, ps, other]: Title: Masking Matters: Unlocking the Spatial Reasoning Capabilities of LLMs for 3D Scene-Language Understanding

Authors: Yerim Jeon, Miso Lee, WonJun Moon, Jae-Pil Heo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[699] arXiv:2512.02485 [pdf, ps, other]: Title: UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making

Authors: Qianhan Feng, Zhongzhen Huang, Yakun Zhu, Xiaofan Zhang, Qi Dou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[700] arXiv:2512.02482 [pdf, ps, other]: Title: G-SHARP: Gaussian Surgical Hardware Accelerated Real-time Pipeline

Authors: Vishwesh Nath, Javier G. Tejero, Ruilong Li, Filippo Filicori, Mahdi Azizian, Sean D. Huver

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[701] arXiv:2512.02473 [pdf, ps, other]: Title: WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling

Authors: Yuta Oshima, Yusuke Iwasawa, Masahiro Suzuki, Yutaka Matsuo, Hiroki Furuta

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[702] arXiv:2512.02469 [pdf, ps, other]: Title: TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution

Authors: Fengli Ran, Xiao Pu, Bo Liu, Xiuli Bi, Bin Xiao

Comments: Accepted in AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703] arXiv:2512.02458 [pdf, ps, other]: Title: Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration

Authors: Zhongyi Cai, Yi Du, Chen Wang, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[704] arXiv:2512.02457 [pdf, ps, other]: Title: Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

Authors: Jianzong Wu, Hao Lian, Dachao Hao, Ye Tian, Qingyu Shi, Biaolong Chen, Hao Jiang, Yunhai Tong

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[705] arXiv:2512.02456 [pdf, ps, other]: Title: See, Think, Learn: A Self-Taught Multimodal Reasoner

Authors: Sourabh Sharma, Sonam Gupta, Sadbhawna

Comments: Winter Conference on Applications of Computer Vision 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[706] arXiv:2512.02453 [pdf, ps, other]: Title: ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation

Authors: Kerui Chen, Jianrong Zhang, Ming Li, Zhonglong Zheng, Hehe Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[707] arXiv:2512.02450 [pdf, ps, other]: Title: HouseLayout3D: A Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild

Authors: Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas

Comments: NeurIPS 2025 (Datasets and Benchmarks Track) Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[708] arXiv:2512.02448 [pdf, ps, other]: Title: nuScenes Revisited: Progress and Challenges in Autonomous Driving

Authors: Whye Kit Fong, Venice Erin Liong, Kok Seang Tan, Holger Caesar

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[709] arXiv:2512.02447 [pdf, ps, other]: Title: Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors

Authors: Fan Luo, Zeyu Gao, Xinhao Luo, Kai Zhao, Yanfeng Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[710] arXiv:2512.02441 [pdf, ps, other]: Title: Basis-Oriented Low-rank Transfer for Few-Shot and Test-Time Adaptation

Authors: Junghwan Park, Woojin Cho, Junhyuk Heo, Darongsae Kwon, Kookjin Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[711] arXiv:2512.02438 [pdf, ps, other]: Title: Boosting Medical Vision-Language Pretraining via Momentum Self-Distillation under Limited Computing Resources

Authors: Phuc Pham, Nhu Pham, Ngoc Quoc Ly

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[712] arXiv:2512.02437 [pdf, ps, other]: Title: LightHCG: a Lightweight yet powerful HSIC Disentanglement based Causal Glaucoma Detection Model framework

Authors: Daeyoung Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[713] arXiv:2512.02425 [pdf, ps, other]: Title: WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning

Authors: Woongyeong Yeo, Kangsan Kim, Jaehong Yoon, Sung Ju Hwang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[714] arXiv:2512.02423 [pdf, ps, other]: Title: GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning

Authors: Haolong Yan, Yeqing Shen, Xin Huang, Jia Wang, Kaijun Tan, Zhixuan Liang, Hongxin Li, Zheng Ge, Osamu Yoshie, Si Li, Xiangyu Zhang, Daxin Jiang

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[715] arXiv:2512.02421 [pdf, ps, other]: Title: Generalizing Vision-Language Models with Dedicated Prompt Guidance

Authors: Xinyao Li, Yinjie Min, Hongbo Chen, Zhekai Du, Fengling Li, Jingjing Li

Comments: Accepted to AAAI26

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[716] arXiv:2512.02413 [pdf, ps, other]: Title: MitUNet: Enhancing Floor Plan Recognition using a Hybrid Mix-Transformer and U-Net Architecture

Authors: Dmitriy Parashchuk, Alexey Kapshitskiy, Yuriy Karyakin

Comments: 9 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[717] arXiv:2512.02405 [pdf, ps, other]: Title: WISE: Weighted Iterative Society-of-Experts for Robust Multimodal Multi-Agent Debate

Authors: Anoop Cherian, River Doyle, Eyal Ben-Dov, Suhas Lohit, Kuan-Chuan Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[718] arXiv:2512.02400 [pdf, ps, other]: Title: Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation

Authors: Wentao Xiang, Haokang Zhang, Tianhang Yang, Zedong Chu, Ruihang Chu, Shichao Xie, Yujian Yuan, Jian Sun, Zhining Gu, Junjie Wang, Xiaolong Wu, Mu Xu, Yujiu Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[719] arXiv:2512.02395 [pdf, ps, other]: Title: Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch

Authors: Yifan Zhang, Liang Hu, Haofeng Sun, Peiyu Wang, Yichen Wei, Shukang Yin, Jiangbo Pei, Wei Shen, Peng Xia, Yi Peng, Tianyidan Xie, Eric Li, Yang Liu, Xuchen Song, Yahui Zhou

Comments: 21 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[720] arXiv:2512.02394 [pdf, ps, other]: Title: Reproducing and Extending RaDelft 4D Radar with Camera-Assisted Labels

Authors: Kejia Hu, Mohammed Alsakabi, John M. Dolan, Ozan K. Tonguz

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[721] arXiv:2512.02392 [pdf, ps, other]: Title: From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking

Authors: Yuqing Shao, Yuchen Yang, Rui Yu, Weilong Li, Xu Guo, Huaicheng Yan, Wei Wang, Xiao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[722] arXiv:2512.02375 [pdf, ps, other]: Title: On-the-fly Feedback SfM: Online Explore-and-Exploit UAV Photogrammetry with Incremental Mesh Quality-Aware Indicator and Predictive Path Planning

Authors: Liyuan Lou, Wanyun Li, Wentian Gan, Yifei Yu, Tengfei Wang, Xin Wang, Zongqian Zhan

Comments: This work was submitted to IEEE GRSM Journal for consideration. Copyright will be transferred once it is accepted

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[723] arXiv:2512.02369 [pdf, ps, other]: Title: SAGE: Style-Adaptive Generalization for Privacy-Constrained Semantic Segmentation Across Domains

Authors: Qingmei Li, Yang Zhang, Peifeng Zhang, Haohuan Fu, Juepeng Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[724] arXiv:2512.02368 [pdf, ps, other]: Title: Multi-Domain Enhanced Map-Free Trajectory Prediction with Selective Attention

Authors: Wenyi Xiong, Jian Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[725] arXiv:2512.02364 [pdf, ps, other]: Title: Tackling Tuberculosis: A Comparative Dive into Machine Learning for Tuberculosis Detection

Authors: Daanish Hindustani, Sanober Hindustani, Preston Nguyen

Journal-ref: Vol. 6, No. 1 (2024), Minnesota Undergraduate Research & Academic Journal (MURAJ)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[726] arXiv:2512.02361 [pdf, ps, other]: Title: VACoT: Rethinking Visual Data Augmentation with VLMs

Authors: Zhengzhuo Xu, Chong Sun, SiNan Du, Chen Li, Jing Lyu, Chun Yuan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[727] arXiv:2512.02359 [pdf, ps, other]: Title: WSCF-MVCC: Weakly-supervised Calibration-free Multi-view Crowd Counting

Authors: Bin Li, Daijie Chen, Qi Zhang

Comments: PRCV 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[728] arXiv:2512.02351 [pdf, ps, other]: Title: Understanding and Harnessing Sparsity in Unified Multimodal Models

Authors: Shwai He, Chaorui Deng, Ang Li, Shen Yan

Comments: 13 pages, 13 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[729] arXiv:2512.02344 [pdf, ps, other]: Title: A multi-weight self-matching visual explanation for cnns on sar images

Authors: Siyuan Sun, Yongping Zhang, Hongcheng Zeng, Yamin Wang, Wei Yang, Wanting Yang, Jie Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[730] arXiv:2512.02341 [pdf, ps, other]: Title: TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction

Authors: Fengyi Zhang, Tianjun Zhang, Kasra Khosoussi, Zheng Zhang, Zi Huang, Yadan Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[731] arXiv:2512.02339 [pdf, ps, other]: Title: Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision

Authors: Chenshuang Zhang, Kang Zhang, Joon Son Chung, In So Kweon, Junmo Kim, Chengzhi Mao

Comments: Accepted at NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[732] arXiv:2512.02290 [pdf, ps, other]: Title: Enhancing Cross Domain SAR Oil Spill Segmentation via Morphological Region Perturbation and Synthetic Label-to-SAR Generation

Authors: Andre Juarez, Luis Salsavilca, Frida Coaquira, Celso Gonzales

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[733] arXiv:2512.02273 [pdf, ps, other]: Title: Progressive Image Restoration via Text-Conditioned Video Generation

Authors: Peng Kang, Xijun Wang, Yu Yuan

Comments: First two authors contributed equally to this work. IEEE ICNC Accepted

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[734] arXiv:2512.02268 [pdf, ps, other]: Title: Spatiotemporal Pyramid Flow Matching for Climate Emulation

Authors: Jeremy Andrew Irvin, Jiaqi Han, Zikui Wang, Abdulaziz Alharbi, Yufei Zhao, Nomin-Erdene Bayarsaikhan, Daniele Visioni, Andrew Y. Ng, Duncan Watson-Parris

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[735] arXiv:2512.02258 [pdf, ps, other]: Title: Exploring the Potentials of Spiking Neural Networks for Image Deraining

Authors: Shuang Chen, Tomas Krajnik, Farshad Arvin, Amir Atapour-Abarghouei

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[736] arXiv:2512.02231 [pdf, ps, other]: Title: See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models

Authors: Le Thien Phuc Nguyen, Zhuoran Yu, Samuel Low Yu Hang, Subin An, Jeongik Lee, Yohan Ban, SeungEun Chung, Thanh-Huy Nguyen, JuWan Maeng, Soochahn Lee, Yong Jae Lee

Comments: preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[737] arXiv:2512.02224 [pdf, ps, other]: Title: Towards Unified Video Quality Assessment

Authors: Chen Feng, Tianhao Peng, Fan Zhang, David Bull

Comments: 8 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[738] arXiv:2512.02198 [pdf, ps, other]: Title: Multifractal Recalibration of Neural Networks for Medical Imaging Segmentation

Authors: Miguel L. Martins, Miguel T. Coimbra, Francesco Renna

Comments: 30 pages, 9 figures, journal paper

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[739] arXiv:2512.02188 [pdf, ps, other]: Title: RobustSurg: Tackling domain generalisation for out-of-distribution surgical scene segmentation

Authors: Mansoor Ali, Maksim Richards, Gilberto Ochoa-Ruiz, Sharib Ali

Comments: Submitted to Medical Image Analysis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[740] arXiv:2512.02172 [pdf, ps, other]: Title: SplatSuRe: Selective Super-Resolution for Multi-view Consistent 3D Gaussian Splatting

Authors: Pranav Asthana, Alex Hanson, Allen Tu, Tom Goldstein, Matthias Zwicker, Amitabh Varshney

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[741] arXiv:2512.02162 [pdf, ps, other]: Title: Mapping of Lesion Images to Somatic Mutations

Authors: Rahul Mehta

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[742] arXiv:2512.02161 [pdf, ps, other]: Title: FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges

Authors: Kevin David Hayes, Micah Goldblum, Vikash Sehwag, Gowthami Somepalli, Ashwinee Panda, Tom Goldstein

Comments: Accepted to NeurIPS 2025 Datasets and Benchmarks Track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[743] arXiv:2512.02152 [pdf, ps, other]: Title: Context-Enriched Contrastive Loss: Enhancing Presentation of Inherent Sample Connections in Contrastive Learning Framework

Authors: Haojin Deng, Yimin Yang

Comments: 13 pages, 7 figures. Published in IEEE Transactions on Multimedia. Code available at: this https URL

Journal-ref: IEEE Transactions on Multimedia, Vol. 27, pp. 429-441, December 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[744] arXiv:2512.02055 [pdf, ps, other]: Title: Leveraging AI multimodal geospatial foundation models for improved near-real-time flood mapping at a global scale

Authors: Mirela G. Tulbure, Julio Caineta, Mark Broich, Mollie D. Gaines, Philippe Rufin, Leon-Friedrich Thomas, Hamed Alemohammad, Jan Hemmerling, Patrick Hostert

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[745] arXiv:2512.03028 (cross-list from cs.GR) [pdf, ps, other]: Title: SMP: Reusable Score-Matching Motion Priors for Physics-Based Character Control

Authors: Yuxuan Mu, Ziyu Zhang, Yi Shi, Minami Matsumoto, Kotaro Imamura, Guy Tevet, Chuan Guo, Michael Taylor, Chang Shu, Pengcheng Xi, Xue Bin Peng

Comments: 14 pages, 9 figures

Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[746] arXiv:2512.02920 (cross-list from cs.LG) [pdf, ps, other]: Title: Learning Multimodal Embeddings for Traffic Accident Prediction and Causal Estimation

Authors: Ziniu Zhang, Minxuan Duan, Haris N. Koutsopoulos, Hongyang R. Zhang

Comments: 17 pages. To appear in KDD'26 Datasets

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)
[747] arXiv:2512.02787 (cross-list from cs.RO) [pdf, ps, other]: Title: Diagnose, Correct, and Learn from Manipulation Failures via Visual Symbols

Authors: Xianchao Zeng, Xinyu Zhou, Youcheng Li, Jiayou Shi, Tianle Li, Liangming Chen, Lei Ren, Yong-Lu Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[748] arXiv:2512.02719 (cross-list from cs.CL) [pdf, ps, other]: Title: Emergent Bayesian Behaviour and Optimal Cue Combination in LLMs

Authors: Julian Ma, Jun Wang, Zafeirios Fountas

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[749] arXiv:2512.02651 (cross-list from cs.HC) [pdf, ps, other]: Title: Real-Time Multimodal Data Collection Using Smartwatches and Its Visualization in Education

Authors: Alvaro Becerra, Pablo Villegas, Ruth Cobos

Comments: Accepted in Technological Ecosystems for Enhancing Multiculturality (TEEM) 2025

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[750] arXiv:2512.02636 (cross-list from cs.LG) [pdf, ps, other]: Title: Joint Distillation for Fast Likelihood Evaluation and Sampling in Flow-based Models

Authors: Xinyue Ai, Yutong He, Albert Gu, Ruslan Salakhutdinov, J Zico Kolter, Nicholas Matthew Boffi, Max Simchowitz

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[751] arXiv:2512.02609 (cross-list from cs.RO) [pdf, ps, other]: Title: SAM2Grasp: Resolve Multi-modal Grasping via Prompt-conditioned Temporal Action Prediction

Authors: Shengkai Wu, Jinrong Yang, Wenqiu Luo, Linfeng Gao, Chaohui Shang, Meiyu Zhi, Mingshan Sun, Fangping Yang, Liangliang Ren, Yong Zhao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[752] arXiv:2512.02340 (cross-list from cs.AI) [pdf, ps, other]: Title: Reasoning Path and Latent State Analysis for Multi-view Visual Spatial Reasoning: A Cognitive Science Perspective

Authors: Qiyao Xue, Weichen Liu, Shiqi Wang, Haoming Wang, Yuyang Wu, Wei Gao

Comments: 23 pages, 37 figures

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[753] arXiv:2512.02306 (cross-list from cs.AI) [pdf, ps, other]: Title: OmniGuard: Unified Omni-Modal Guardrails with Deliberate Reasoning

Authors: Boyu Zhu, Xiaofei Wen, Wenjie Jacky Mo, Tinghui Zhu, Yanan Xie, Peng Qi, Muhao Chen

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[754] arXiv:2512.02293 (cross-list from cs.RO) [pdf, ps, other]: Title: VIGS-SLAM: Visual Inertial Gaussian Splatting SLAM

Authors: Zihan Zhu, Wei Zhang, Norbert Haala, Marc Pollefeys, Daniel Barath

Comments: Project page: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[755] arXiv:2512.02280 (cross-list from cs.AI) [pdf, ps, other]: Title: Bridging the Gap: Toward Cognitive Autonomy in Artificial Intelligence

Authors: Noorbakhsh Amiri Golilarz, Sindhuja Penchala, Shahram Rahimi

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[756] arXiv:2512.02243 (cross-list from cs.CR) [pdf, ps, other]: Title: PhishSnap: Image-Based Phishing Detection Using Perceptual Hashing

Authors: Md Abdul Ahad Minhaz, Zannatul Zahan Meem, Md. Shohrab Hossain

Comments: IEEE Standard Formatting, 3 pages, 3 figures

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[757] arXiv:2512.02143 (cross-list from cs.GR) [pdf, ps, other]: Title: CoatFusion: Controllable Material Coating in Images

Authors: Sagie Levy, Elad Aharoni, Matan Levy, Ariel Shamir, Dani Lischinski

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[758] arXiv:2512.02088 (cross-list from eess.IV) [pdf, ps, other]: Title: Comparing Baseline and Day-1 Diffusion MRI Using Multimodal Deep Embeddings for Stroke Outcome Prediction

Authors: Sina Raeisadigh, Myles Joshua Toledo Tan, Henning Müller, Abderrahmane Hedjoudje

Comments: 5 pages, 5 figures, 2 tables

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[759] arXiv:2512.02062 (cross-list from cs.CR) [pdf, ps, other]: Title: Superpixel Attack: Enhancing Black-box Adversarial Attack with Image-driven Division Areas

Authors: Issa Oe, Keiichiro Yamamura, Hiroki Ishikura, Ryo Hamahira, Katsuki Fujisawa

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
