Computer Vision and Pattern Recognition

Authors and titles for recent submissions


[ total of 749 entries: 1-749 ]

Wed, 10 Dec 2025

[1] arXiv:2512.08931 [pdf, ps, other]: Title: Astra: General Interactive World Model with Autoregressive Denoising

Authors: Yixuan Zhu, Jiaqi Feng, Wenzhao Zheng, Yuan Gao, Xin Tao, Pengfei Wan, Jie Zhou, Jiwen Lu

Comments: Code is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[2] arXiv:2512.08930 [pdf, ps, other]: Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment

Authors: Youming Deng, Songyou Peng, Junyi Zhang, Kathryn Heal, Tiancheng Sun, John Flynn, Steve Marschner, Lucy Chai

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[3] arXiv:2512.08924 [pdf, ps, other]: Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

Authors: Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, Ignacio Rocco, Liliane Momeni, Junyu Xie, Shuyang Sun, Rahul Sukthankar, Joëlle K Barral, Raia Hadsell, Zoubin Ghahramani, Andrew Zisserman, Junlin Zhang, Mehdi SM Sajjadi

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2512.08922 [pdf, ps, other]: Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration

Authors: Jin Hyeon Kim, Paul Hyunbin Cho, Claire Kim, Jaewon Min, Jaeeun Lee, Jihye Park, Yeji Choi, Seungryong Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5] arXiv:2512.08912 [pdf, ps, other]: Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception

Authors: Simon de Moreau, Andrei Bursuc, Hafid El-Idrissi, Fabien Moutarde

Comments: Preprint. 12 pages, 9 figures. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[6] arXiv:2512.08905 [pdf, ps, other]: Title: Self-Evolving 3D Scene Generation from a Single Image

Authors: Kaizhi Zheng, Yue Fan, Jing Gu, Zishuo Xu, Xuehai He, Xin Eric Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2512.08897 [pdf, ps, other]: Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation

Authors: Zeyang Liu, Le Wang, Sanping Zhou, Yuxuan Wu, Xiaolong Sun, Gang Hua, Haoxiang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2512.08889 [pdf, ps, other]: Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers

Authors: Damiano Marsili, Georgia Gkioxari

Comments: Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[9] arXiv:2512.08888 [pdf, ps, other]: Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation

Authors: Manduhu Manduhu, Alexander Dow, Gerard Dooly, James Riordan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[10] arXiv:2512.08881 [pdf, ps, other]: Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing

Authors: Aysim Toker, Andreea-Maria Oncescu, Roy Miles, Ismail Elezi, Jiankang Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2512.08873 [pdf, ps, other]: Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning

Authors: Jing Jie Tan, Anissa Mokraoui, Ban-Hoe Kwan, Danny Wee-Kiat Ng, Yan-Chai Hum

Comments: 6 pages

Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[12] arXiv:2512.08860 [pdf, ps, other]: Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference

Authors: Amit Bendkhale

Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2512.08854 [pdf, ps, other]: Title: Generation is Required for Data-Efficient Perception

Authors: Jack Brady, Bernhard Schölkopf, Thomas Kipf, Simon Buchholz, Wieland Brendel

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[14] arXiv:2512.08829 [pdf, ps, other]: Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Authors: Hongyuan Tao, Bencheng Liao, Shaoyu Chen, Haoran Yin, Qian Zhang, Wenyu Liu, Xinggang Wang

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2512.08820 [pdf, ps, other]: Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning

Authors: Yi Zhang, Chun-Wun Cheng, Junyi He, Ke Yu, Yushun Tang, Carola-Bibiane Schönlieb, Zhihai He, Angelica I. Aviles-Rivero

Comments: Accepted in IEEE Transactions on Multimedia (TMM)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16] arXiv:2512.08789 [pdf, ps, other]: Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance

Authors: Chaewon Kim, Seoyeon Lee, Jonghyuk Park

Comments: 10 pages, 7 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[17] arXiv:2512.08785 [pdf, ps, other]: Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models

Authors: Yiming Hao, Mutian Xu, Chongjie Ye, Jie Qin, Shunlin Lu, Yipeng Qin, Xiaoguang Han

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2512.08774 [pdf, ps, other]: Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps

Authors: Seoyeon Lee, Gwangyeol Yu, Chaewon Kim, Jonghyuk Park

Comments: 10 pages, 9 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[19] arXiv:2512.08765 [pdf, ps, other]: Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Authors: Ruihang Chu, Yefei He, Zhekai Chen, Shiwei Zhang, Xiaogang Xu, Bin Xia, Dingdong Wang, Hongwei Yi, Xihui Liu, Hengshuang Zhao, Yu Liu, Yingya Zhang, Yujiu Yang

Comments: NeurIPS 2025. Code and data available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2512.08751 [pdf, ps, other]: Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices

Authors: Kuniko Paxton, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[21] arXiv:2512.08747 [pdf, ps, other]: Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation

Authors: Artúr I. Károly, Péter Galambos

Comments: 20 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2512.08738 [pdf, ps, other]: Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture

Authors: Samuel Ebimobowei Johnny, Blessed Guda, Emmanuel Enejo Aaron, Assane Gueye

Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[23] arXiv:2512.08733 [pdf, ps, other]: Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting

Authors: Kuniko Paxton, Zeinab Dehghani, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[24] arXiv:2512.08730 [pdf, ps, other]: Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images

Authors: Kaiyu Li, Shengqi Zhang, Yupeng Deng, Zhi Wang, Deyu Meng, Xiangyong Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2512.08700 [pdf, ps, other]: Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth

Authors: Kyumin Hwang, Wonhyeok Choi, Kiljoon Han, Wonjoon Choi, Minwoo Choi, Yongcheon Na, Minwoo Park, Sunghoon Im

Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2512.08697 [pdf, ps, other]: Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance

Authors: Athena Psalta, Vasileios Tsironis, Konstantinos Karantzalos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2512.08673 [pdf, ps, other]: Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds

Authors: Shaofeng Zhang, Xuanqi Chen, Xiangdong Zhang, Sitong Wu, Junchi Yan

Comments: 16 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2512.08648 [pdf, ps, other]: Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank

Authors: Shaofeng Zhang, Xuanqi Chen, Ning Liao, Haoxiang Zhao, Xiaoxing Wang, Haoru Tan, Sitong Wu, Xiaosong Jia, Qi Fan, Junchi Yan

Comments: 19 pages, 19 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2512.08647 [pdf, ps, other]: Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition

Authors: Keito Inoshita

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2512.08645 [pdf, ps, other]: Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation

Authors: Young Kyung Kim, Oded Schlesinger, Yuzhou Zhao, J. Matias Di Martino, Guillermo Sapiro

Comments: 19 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2512.08639 [pdf, ps, other]: Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

Authors: Huilin Xu, Zhuoyang Liu, Yixiang Luomei, Feng Xu

Comments: Under Review, 12 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[32] arXiv:2512.08627 [pdf, ps, other]: Title: Trajectory Densification and Depth from Perspective-based Blur

Authors: Tianchen Qiu, Qirun Zhang, Jiajian He, Zhengyue Zhuge, Jiahui Xu, Yueting Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2512.08625 [pdf, ps, other]: Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics

Authors: Jisang Yoo, Gyeongjin Kang, Hyun-kyu Ko, Hyeonwoo Yu, Eunbyung Park

Comments: 8 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2512.08606 [pdf, ps, other]: Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning

Authors: Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Zhimeng Huang, Yuhua Li

Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI2026, poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[35] arXiv:2512.08589 [pdf, ps, other]: Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images

Authors: Swarn Singh Warshaneyan, Maksims Ivanovs, Blaž Cugmas, Inese Bērziņa, Laura Goldberga, Mindaugas Tamosiunas, Roberts Kadiķis

Comments: 08 pages, 10 figures, 04 tables, 20 references. Date of Conference: 13-14 June 2025 Date Added to IEEE Xplore: 10 July 2025 Electronic ISBN: 979-8-3315-0969-9 Print on Demand(PoD) ISBN: 979-8-3315-0970-5 DOI: 10.1109/AICCONF64766.2025.11064260 Conference Location: Prague, Czech Republic Online Access: this https URL

Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2512.08577 [pdf, ps, other]: Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery

Authors: Yuna Kato, Shohei Mori, Hideo Saito, Yoshifumi Takatsume, Hiroki Kajita, Mariko Isogawa

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[37] arXiv:2512.08572 [pdf, ps, other]: Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis

Authors: Olle Edgren Schüllerqvist, Jens Baumann, Joakim Lindblad, Love Nordling, Artur Mezheyeuski, Patrick Micke, Nataša Sladoje

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2512.08569 [pdf, ps, other]: Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts

Authors: Seunghwan Lee, Inyoung Jung, Hojoon Lee, Eunil Park, Sungeun Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2512.08564 [pdf, ps, other]: Title: Modular Neural Image Signal Processing

Authors: Mahmoud Afifi, Zhongling Wang, Ran Zhang, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2512.08560 [pdf, ps, other]: Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain

Authors: Navve Wasserman, Matias Cosarinsky, Yuval Golbari, Aude Oliva, Antonio Torralba, Tamar Rott Shaham, Michal Irani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2512.08557 [pdf, ps, other]: Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds

Authors: Alexander Dow, Manduhu Manduhu, Matheus Santos, Ben Bartlett, Gerard Dooly, James Riordan

Comments: 22 Pages, 26 Figures, This work has been submitted to the IEEE Sensors Journal for possible publication

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2512.08547 [pdf, ps, other]: Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion

Authors: Yifei Chen, Kaiyu Song, Yan Pan, Jianxing Yu, Jian Yin, Hanjiang Lai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2512.08542 [pdf, ps, other]: Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation

Authors: Zhigang Jia, Duan Wang, Hengkai Wang, Yajun Xie, Meixiang Zhao, Xiaoyu Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
[44] arXiv:2512.08537 [pdf, ps, other]: Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation

Authors: Zhen Zou, Xiaoxiao Ma, Jie Huang, Zichao Yu, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2512.08535 [pdf, ps, other]: Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement

Authors: Xinyue Liang, Zhinyuan Ma, Lingchen Sun, Yanjun Guo, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2512.08534 [pdf, ps, other]: Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation

Authors: Zhangli Hu, Ye Chen, Jiajun Yao, Bingbing Ni

Comments: 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2512.08529 [pdf, ps, other]: Title: MVP: Multiple View Prediction Improves GUI Grounding

Authors: Yunzhu Zhang, Zeyu Pan, Zhengwen Zeng, Shuheng Shen, Changhua Meng, Linchao Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2512.08524 [pdf, ps, other]: Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization

Authors: Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas, Muhammad Rafsan Kabir, Robin Krambroeckers, Sifat Momen, Nabeel Mohammed, Shafin Rahman

Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[49] arXiv:2512.08511 [pdf, ps, other]: Title: Thinking with Images via Self-Calling Agent

Authors: Wenxi Yang, Yuzhong Zhao, Fang Wan, Qixiang Ye

Comments: Code would be released at this https URL soon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2512.08506 [pdf, ps, other]: Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds

Authors: Jialu Sui, Rui Liu, Hongsheng Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2512.08505 [pdf, ps, other]: Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models

Authors: Vasco Ramos, Regev Cohen, Idan Szpektor, Joao Magalhaes

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52] arXiv:2512.08503 [pdf, ps, other]: Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models

Authors: Jiaming Zhang, Che Wang, Yang Cao, Longtao Huang, Wei Yang Bryan Lim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[53] arXiv:2512.08498 [pdf, ps, other]: Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs

Authors: Yijia Guo, Tong Hu, Zhiwei Li, Liwen Hu, Keming Qian, Xitong Lin, Shengbo Chen, Tiejun Huang, Lei Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54] arXiv:2512.08486 [pdf, ps, other]: Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions

Authors: Ada Gorgun, Fawaz Sammani, Nikos Deligiannis, Bernt Schiele, Jonas Fischer

Comments: Code is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55] arXiv:2512.08478 [pdf, ps, other]: Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Authors: Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng, Xiao Sun, Zhihang Zhong

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[56] arXiv:2512.08477 [pdf, ps, other]: Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention

Authors: Huiguo He, Pengyu Yan, Ziqi Yi, Weizhi Zhong, Zheng Liu, Yejun Tang, Huan Yang, Kun Gai, Guanbin Li, Lianwen Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[57] arXiv:2512.08467 [pdf, ps, other]: Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery

Authors: Chamath Ranasinghe, Uthayasanker Thayasivam

Comments: 8 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58] arXiv:2512.08445 [pdf, ps, other]: Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts

Authors: Madhav Gupta, Vishak Prasad C, Ganesh Ramakrishnan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[59] arXiv:2512.08441 [pdf, ps, other]: Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras

Authors: Luca Cogo, Marco Buzzelli, Simone Bianco, Javier Vazquez-Corral, Raimondo Schettini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2512.08439 [pdf, ps, other]: Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training

Authors: Qing Xu, Kun Yuan, Yuxiang Luo, Yuhao Zhai, Wenting Duan, Nassir Navab, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61] arXiv:2512.08430 [pdf, ps, other]: Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking

Authors: Nico Leuze, Maximilian Hoh, Samed Doğan, Nicolas R.-Peña, Alfred Schoettl

Comments: Accepted to WACV 2026. Preprint version

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[62] arXiv:2512.08410 [pdf, ps, other]: Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval

Authors: Tao Chen, Shaobo Ju, Qiong Wu, Chenxin Fang, Kun Zhang, Jun Peng, Hui Li, Yiyi Zhou, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2512.08406 [pdf, ps, other]: Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos

Authors: Mingqi Gao, Yunqi Miao, Jungong Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2512.08400 [pdf, ps, other]: Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries

Authors: Samitha Nuwan Thilakarathna, Ercan Avsar, Martin Mathias Nielsen, Malte Pedersen

Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65] arXiv:2512.08397 [pdf, ps, other]: Title: Detection of Digital Facial Retouching utilizing Face Beauty Information

Authors: Philipp Srock, Juan E. Tapia, Christoph Busch

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2512.08378 [pdf, ps, other]: Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions

Authors: Jing Tao, You Li, Banglei Guan, Yang Shang, Qifeng Yu

Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67] arXiv:2512.08374 [pdf, ps, other]: Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss

Authors: Bozhou Li, Xinda Xue, Sihan Yang, Yang Shi, Xinlong Chen, Yushuo Guan, Yuanxing Zhang, Wentao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68] arXiv:2512.08362 [pdf, ps, other]: Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation

Authors: Ju-Young Kim, Ji-Hong Park, Gun-Woo Kim

Comments: Accepted for main track at MobiSec 2024 (not published in the proceedings)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69] arXiv:2512.08358 [pdf, ps, other]: Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

Authors: Jiahao Lu, Weitao Xiong, Jiacheng Deng, Peng Li, Tianyu Huang, Zhiyang Dou, Cheng Lin, Sai-Kit Yeung, Yuan Liu

Comments: Accepted by NeurIPS 2025. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2512.08337 [pdf, ps, other]: Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation

Authors: Jianwei Wang, Qing Wang, Menglan Ruan, Rongjun Ge, Chunfeng Yang, Yang Chen, Chunming Xie

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2512.08334 [pdf, ps, other]: Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting

Authors: Chang Liu, Hongliang Yuan, Lianghao Zhang, Sichao Wang, Jianwei Guo, Shi-Sheng Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2512.08331 [pdf, ps, other]: Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening

Authors: Xianghong Xiao, Zeyu Xia, Zhou Fei, Jinliang Xiao, Haorui Chen, Liangjian Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2512.08330 [pdf, ps, other]: Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models

Authors: Pengbo Li, Yiding Sun, Haozhe Cheng

Comments: Accepted by IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2512.08329 [pdf, ps, other]: Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models

Authors: Michael R. Martin, Garrick Chan, Kwan-Liu Ma

Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[75] arXiv:2512.08327 [pdf, ps, other]: Title: Low Rank Support Quaternion Matrix Machine

Authors: Wang Chen, Ziyan Luo, Shuangyue Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[76] arXiv:2512.08325 [pdf, ps, other]: Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification

Authors: Xuedeng Liu, Jiabao Guo, Zheng Zhang, Fei Wang, Zhi Liu, Dan Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2512.08323 [pdf, ps, other]: Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge

Authors: Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik, Oussama Smaoui, Firas Bouzguenda, Sergi Pujades, Niels van Nistelrooij, Shankeeth Vinayahalingam, Kaibo Shi, Hairong Jin, Youyi Zheng, Tibor Kubík, Oldřich Kodym, Petr Šilling, Kateřina Trávníčková, Tomáš Mojžiš, Jan Matula, Jeffry Hartanto, Xiaoying Zhu, Kim-Ngan Nguyen, Tudor Dascalu, Huikai Wu, Weijie Liu, Shaojie Zhuang, Guangshun Wei, Yuanfeng Zhou

Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2512.08317 [pdf, ps, other]: Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation

Authors: Xuhui Li, Zhengquan Luo, Zihui Cui, Zhiqiang Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[79] arXiv:2512.08309 [pdf, ps, other]: Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation

Authors: Alexander Goslin

Comments: Project website: this https URL. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
[80] arXiv:2512.08294 [pdf, ps, other]: Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation

Authors: Yexin Liu, Manyuan Zhang, Yueze Wang, Hongyu Li, Dian Zheng, Weiming Zhang, Changsheng Lu, Xunliang Cai, Yan Feng, Peng Pei, Harry Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81] arXiv:2512.08282 [pdf, ps, other]: Title: PAVAS: Physics-Aware Video-to-Audio Synthesis

Authors: Oh Hyun-Bin, Yuhta Takida, Toshimitsu Uesaka, Tae-Hyun Oh, Yuki Mitsufuji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[82] arXiv:2512.08269 [pdf, ps, other]: Title: EgoX: Egocentric Video Generation from a Single Exocentric Video

Authors: Taewoong Kang, Kinam Kim, Dohyeon Kim, Minho Park, Junha Hyung, Jaegul Choo

Comments: 21 pages. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2512.08262 [pdf, ps, other]: Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera

Authors: Hafeez Husain Cholakkal, Stefano Arrigoni, Francesco Braghin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[84] arXiv:2512.08254 [pdf, ps, other]: Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors

Authors: Yun Liu, Tao Li, Cosmin Ancuti, Wenqi Ren, Weisi Lin

Comments: 10 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85] arXiv:2512.08253 [pdf, ps, other]: Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation

Authors: YiLin Zhou, Lili Wei, Zheming Xu, Ziyi Chen, Congyan Lang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86] arXiv:2512.08247 [pdf, ps, other]: Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection

Authors: Haowen Zheng, Hu Zhu, Lu Deng, Weihao Gu, Yang Yang, Yanyan Liang

Comments: AAAI-26

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[87] arXiv:2512.08243 [pdf, ps, other]: Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI

Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)

Comments: 26 Pages, 10 Figures, 4 Tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[88] arXiv:2512.08240 [pdf, ps, other]: Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models

Authors: Jusheng Zhang, Xiaoyang Guo, Kaitong Cai, Qinhan Lv, Yijia Fan, Wenhao Chai, Jian Wang, Keze Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[89] arXiv:2512.08237 [pdf, ps, other]: Title: FastBEV++: Fast by Algorithm, Deployable by Design

Authors: Yuanpeng Chen, Hui Song, Wei Tao, ShanHui Mo, Shuang Zhang, Xiao Hua, TianKun Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90] arXiv:2512.08229 [pdf, ps, other]: Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems

Authors: Tony Salloom, Dandi Zhou, Xinhai Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[91] arXiv:2512.08228 [pdf, ps, other]: Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models

Authors: Jusheng Zhang, Kaitong Cai, Xiaoyang Guo, Sidi Liu, Qinhan Lv, Ruiqi Chen, Jing Yang, Yijia Fan, Xiaofei Sun, Jian Wang, Ziliang Chen, Liang Lin, Keze Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[92] arXiv:2512.08227 [pdf, ps, other]: Title: New VVC profiles targeting Feature Coding for Machines

Authors: Md Eimran Hossain Eimon, Ashan Perera, Juan Merlos, Velibor Adzic, Hari Kalva

Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2512.08223 [pdf, ps, other]: Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection

Authors: Ching-Hung Cheng, Hsiu-Fu Wu, Bing-Chen Wu, Khanh-Phong Bui, Van-Tin Luu, Ching-Chun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2512.08221 [pdf, ps, other]: Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding

Authors: Ziwei Yao, Qiyang Wan, Ruiping Wang, Xilin Chen

Comments: 16 pages, 12 figures, 7 tables. Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2512.08215 [pdf, ps, other]: Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement

Authors: Chia-Hern Lai, I-Hsuan Lo, Yen-Ku Yeh, Thanh-Nguyen Truong, Ching-Chun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2512.08198 [pdf, ps, other]: Title: Animal Re-Identification on Microcontrollers

Authors: Yubo Chen, Di Zhao, Yun Sing Koh, Talia Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2512.08180 [pdf, ps, other]: Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input

Authors: Xiaojing Wei, Ting Zhang, Wei He, Jingdong Wang, Hua Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2512.08163 [pdf, ps, other]: Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators

Authors: Yuki Kubota, Taiki Fukiage

Comments: 22 pages, 12 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2512.08161 [pdf, ps, other]: Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing

Authors: Lirong Zheng, Yanshan Li, Rui Yu, Kaihao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2512.08135 [pdf, ps, other]: Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning

Authors: Zeyuan Chen, Xiang Zhang, Haiyang Xu, Jianwen Xie, Zhuowen Tu

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101] arXiv:2512.08075 [pdf, ps, other]: Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models

Authors: Christian Massao Konishi, Helio Pedrini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2512.08048 [pdf, ps, other]: Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning

Authors: Chandler Timm C. Doloriel

Comments: ongoing work

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2512.08042 [pdf, ps, other]: Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking

Authors: Chandler Timm C. Doloriel, Habib Ullah, Kristian Hovde Liland, Fadi Al Machot, Ngai-Man Cheung

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2512.08040 [pdf, ps, other]: Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment

Authors: Youngjoon Jang, Liliane Momeni, Zifan Jiang, Joon Son Chung, Gül Varol, Andrew Zisserman

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2512.08038 [pdf, ps, other]: Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification

Authors: Elifnur Sunger, Tales Imbiriba, Peter Campbell, Deniz Erdogmus, Stratis Ioannidis, Jennifer Dy

Comments: 20 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2512.08016 [pdf, ps, other]: Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models

Authors: Jiyoon Pyo, Yuankun Jiao, Dongwon Jung, Zekun Li, Leeje Jang, Sofia Kirsanova, Jina Kim, Yijun Lin, Qin Liu, Junyi Xie, Hadi Askari, Nan Xu, Muhao Chen, Yao-Yi Chiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107] arXiv:2512.07984 [pdf, ps, other]: Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection

Authors: Ryan Banks, Camila Lindoni Azevedo, Hongying Tang, Yunpeng Li

Comments: 13 pages, 7 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108] arXiv:2512.07951 [pdf, ps, other]: Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Authors: Zekai Luo, Zongze Du, Zhouhang Zhu, Hao Zhong, Muzhi Zhu, Wen Wang, Yuling Xi, Chenchen Jing, Hao Chen, Chunhua Shen

Comments: Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2512.07925 [pdf, ps, other]: Title: Near-real time fires detection using satellite imagery in Sudan conflict

Authors: Kuldip Singh Atwal, Dieter Pfoser, Daniel Rothbart

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110] arXiv:2512.07838 [pdf, ps, other]: Title: Detection of Cyberbullying in GIF using AI

Authors: Pal Dave, Xiaohong Yuan, Madhuri Siddula, Kaushik Roy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[111] arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]: Title: Multi-domain performance analysis with scores tailored to user preferences

Authors: Sébastien Piérard, Adrien Deliège, Marc Van Droogenbroeck

Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[112] arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]: Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm

Authors: Haoyu Zhao, Weizhong Ding, Yuhao Yang, Zheng Tian, Linyi Yang, Kun Shao, Jun Wang

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[113] arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]: Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks

Authors: Indrajit Kar, Kalathur Chenchu Kishore Kumar

Comments: 22 pages, 2 tables, 9 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[114] arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]: Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions

Authors: Jianan Li, Xiao Chen, Tao Huang, Tien-Tsin Wong

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]: Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata

Authors: Ali Sakour

Comments: 13 pages, 5 figures. Code available at: this https URL

Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[116] arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]: Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion

Authors: Guangyuan Zou, Junlun Li, Feng Liu, Xuejing Zheng, Jianjian Xie, Guoyi Chen

Comments: Submitted to GEOPHYSICS

Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
[117] arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]: Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation

Authors: Srijan Dokania, Dharini Raghavan

Comments: Published and Presented at 3rd Workshop on Human-Centric Multilateral Teleoperation in ICRA 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[118] arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]: Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

Authors: Aneesh Rangnekar, Harini Veeraraghavan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[119] arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]: Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model

Authors: Wenjiang Xu, Cindy Wang, Rui Fang, Mingkang Zhang, Lusong Li, Jing Xu, Jiayuan Gu, Zecui Zeng, Rui Chen

Comments: Website at this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]: Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features

Authors: Haoxin Zhang, Shuaixin Li, Xiaozhou Zhu, Hongbo Chen, Wen Yao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]: Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models

Authors: Zheng Ding, Weirui Ye

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]: Title: FlowSteer: Conditioning Flow Field for Consistent Image Restoration

Authors: Tharindu Wickremasinghe, Chenyang Qi, Harshana Weligampola, Zhengzhong Tu, Stanley H. Chan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[123] arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]: Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition

Authors: Matthias Beckmann, Robert Beinert, Jonas Bresch

Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[124] arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]: Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space

Authors: Tianxingjian Ding, Yuanhao Zou, Chen Chen, Mubarak Shah, Yu Tian

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]: Title: DIJIT: A Robotic Head for an Active Observer

Authors: Mostafa Kamali Tabrizi, Mingshi Chi, Bir Bikram Dey, Yu Qing Yuan, Markus D. Solbach, Yiqian Liu, Michael Jenkin, John K. Tsotsos

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]: Title: CIP-Net: Continual Interpretable Prototype-based Network

Authors: Federico Di Valerio, Michela Proietti, Alessio Ragno, Roberto Capobianco

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]: Title: VLD: Visual Language Goal Distance for Reinforcement Learning Navigation

Authors: Lazar Milikic, Manthan Patel, Jonas Frey

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]: Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization

Authors: Alan Papalia, Nikolas Sanderson, Haoyu Han, Heng Yang, Hanumant Singh, Michael Everett

Comments: 8 pages, submitted for review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]: Title: GSPN-2: Efficient Parallel Sequence Modeling

Authors: Hongjun Wang, Yitong Jiang, Collin McCarthy, David Wehr, Hanrong Ye, Xinhao Li, Ka Chun Cheung, Wonmin Byeon, Jinwei Gu, Ke Chen, Kai Han, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Sifei Liu

Comments: NeurIPS 2025

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]: Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model

Authors: Huizheng Wang, Hongbin Wang, Shaojun Wei, Yang Hu, Shouyi Yin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]: Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin Algorithm

Authors: Moritz Blumenthal, Tina Holliber, Jonathan I. Tamir, Martin Uecker

Comments: Submitted to Magnetic Resonance in Medicine

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)

Tue, 9 Dec 2025

[132] arXiv:2512.07834 [pdf, ps, other]: Title: Voxify3D: Pixel Art Meets Volumetric Rendering

Authors: Yi-Chuan Huang, Jiewen Chan, Hao-Jen Chien, Yu-Lun Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2512.07833 [pdf, ps, other]: Title: Relational Visual Similarity

Authors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng Li

Comments: Project page, data, and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[134] arXiv:2512.07831 [pdf, ps, other]: Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Authors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya Jia

Comments: Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2512.07829 [pdf, ps, other]: Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation

Authors: Yuan Gao, Chen Chen, Tianrong Chen, Jiatao Gu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136] arXiv:2512.07826 [pdf, ps, other]: Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing

Authors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei Xie

Comments: 38 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2512.07821 [pdf, ps, other]: Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling

Authors: Shaoheng Fang, Hanwen Jiang, Yunpeng Bai, Niloy J. Mitra, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138] arXiv:2512.07807 [pdf, ps, other]: Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes

Authors: Shai Krakovsky, Gal Fiebelman, Sagie Benaim, Hadar Averbuch-Elor

Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[139] arXiv:2512.07806 [pdf, ps, other]: Title: Multi-view Pyramid Transformer: Look Coarser to See Broader

Authors: Gyeongjin Kang, Seungkwon Yang, Seungtae Nam, Younggeun Lee, Jungwoo Kim, Eunbyung Park

Comments: Project page: see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2512.07802 [pdf, ps, other]: Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Authors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian Xie

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2512.07778 [pdf, ps, other]: Title: Distribution Matching Variational AutoEncoder

Authors: Sen Ye, Jianning Pei, Mengde Xu, Shuyang Gu, Chunyu Wang, Liwei Wang, Han Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2512.07776 [pdf, ps, other]: Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring

Authors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de Melo

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143] arXiv:2512.07760 [pdf, ps, other]: Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Menglin Wang, Xiaojin Gong, Jiachen Li, Genlin Ji

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2512.07756 [pdf, ps, other]: Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction

Authors: Mayank Anand, Ujair Alam, Surya Prakash, Priya Shukla, Gora Chand Nandi, Domenec Puig

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145] arXiv:2512.07747 [pdf, ps, other]: Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation

Authors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2512.07745 [pdf, ps, other]: Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Authors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2512.07738 [pdf, ps, other]: Title: HLTCOE Evaluation Team at TREC 2025: VQA Track

Authors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van Durme

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2512.07733 [pdf, ps, other]: Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery

Authors: Meng Cao, Xingyu Li, Xue Liu, Ian Reid, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2512.07730 [pdf, ps, other]: Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Authors: Sangha Park, Seungryong Yoo, Jisoo Mok, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150] arXiv:2512.07729 [pdf, ps, other]: Title: Improving action classification with brain-inspired deep networks

Authors: Aidas Aglinskas, Stefano Anzellotti

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151] arXiv:2512.07720 [pdf, ps, other]: Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation

Authors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng Lin

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2512.07712 [pdf, ps, other]: Title: UnCageNet: Tracking and Pose Estimation of Caged Animal

Authors: Sayak Dutta, Harish Katti, Shashikant Verma, Shanmuganathan Raman

Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2512.07703 [pdf, ps, other]: Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation

Authors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios Christodoulidis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[154] arXiv:2512.07702 [pdf, ps, other]: Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment

Authors: Sangha Park, Eunji Kim, Yeongtak Oh, Jooyoung Choi, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155] arXiv:2512.07698 [pdf, ps, other]: Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only

Authors: Arslan Artykov, Corentin Sautier, Vincent Lepetit

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156] arXiv:2512.07674 [pdf, ps, other]: Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations

Authors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge Cardoso

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157] arXiv:2512.07668 [pdf, ps, other]: Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset

Authors: Ronan John, Aditya Kesari, Vincenzo DiMatteo, Kristin Dana

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2512.07661 [pdf, ps, other]: Title: Optimization-Guided Diffusion for Interactive Scene Generation

Authors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2512.07652 [pdf, ps, other]: Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research

Authors: Hamad Almazrouei, Mariam Al Nasseri, Maha Alzaabi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160] arXiv:2512.07651 [pdf, ps, other]: Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method

Authors: Yuanye Liu, Hanxiao Zhang, Nannan Shi, Yuxin Shi, Arif Mahmood, Murtaza Taj, Xiahai Zhuang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2512.07628 [pdf, ps, other]: Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation

Authors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2512.07606 [pdf, ps, other]: Title: Decomposition Sampling for Efficient Region Annotations in Active Learning

Authors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina Breininger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2512.07599 [pdf, ps, other]: Title: Online Segment Any 3D Thing as Instance Tracking

Authors: Hanshi Wang, Zijian Cai, Jin Gao, Yiwei Zhang, Weiming Hu, Ke Wang, Zhipeng Zhang

Comments: NeurIPS 2025, Code is at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2512.07596 [pdf, ps, other]: Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery

Authors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long Bai

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[165] arXiv:2512.07590 [pdf, ps, other]: Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation

Authors: Kaili Qi, Zhongyi Huang, Wenli Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2512.07584 [pdf, ps, other]: Title: LongCat-Image Technical Report

Authors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2512.07580 [pdf, ps, other]: Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs

Authors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2512.07568 [pdf, ps, other]: Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation

Authors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[169] arXiv:2512.07564 [pdf, ps, other]: Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models

Authors: Kassoum Sanogo, Renzo Ardiccioni

Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[170] arXiv:2512.07527 [pdf, ps, other]: Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

Authors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[171] arXiv:2512.07514 [pdf, ps, other]: Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes

Authors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2512.07504 [pdf, ps, other]: Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points

Authors: Ryota Okumura, Kaede Shiohara, Toshihiko Yamasaki

Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173] arXiv:2512.07503 [pdf, ps, other]: Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation

Authors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2512.07500 [pdf, ps, other]: Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer

Authors: Penghui Liu, Jiangshan Wang, Yutong Shen, Shanhui Mo, Chenyang Qi, Yue Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2512.07498 [pdf, ps, other]: Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior

Authors: Chih-Chung Hsu, Shao-Ning Chen, Chia-Ming Lee, Yi-Fang Wang, Yi-Shiuan Chou

Comments: 16 pages (including appendix)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2512.07480 [pdf, ps, other]: Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance

Authors: Naifu Xue, Zhaoyang Jia, Jiahao Li, Bin Li, Zihan Zheng, Yuan Zhang, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2512.07469 [pdf, ps, other]: Title: Unified Video Editing with Temporal Reasoner

Authors: Xiangpeng Yang, Ji Xie, Yiyuan Yang, Yan Huang, Min Xu, Qiang Wu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2512.07426 [pdf, ps, other]: Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing

Authors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa Yousif

Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[179] arXiv:2512.07415 [pdf, ps, other]: Title: Data-driven Exploration of Mobility Interaction Patterns

Authors: Gabriele Galatolo, Mirco Nanni

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180] arXiv:2512.07410 [pdf, ps, other]: Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs

Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya Wang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2512.07394 [pdf, ps, other]: Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video

Authors: Zhifan Zhu, Siddhant Bansal, Shashank Tripathi, Dima Damen

Comments: webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2512.07391 [pdf, ps, other]: Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring

Authors: Đorđe Nedeljković

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2512.07385 [pdf, ps, other]: Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline

Authors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2512.07383 [pdf, ps, other]: Title: LogicCBMs: Logic-Enhanced Concept-Based Learning

Authors: Deepika SN Vemuri, Gautham Bellamkonda, Aditya Pola, Vineeth N Balasubramanian

Comments: 18 pages, 19 figures, WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2512.07381 [pdf, ps, other]: Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects

Authors: Shuohan Tao, Boyao Zhou, Hanzhang Tu, Yuwang Wang, Yebin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186] arXiv:2512.07379 [pdf, ps, other]: Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency

Authors: Mahila Moghadami, Mohammad Ali Keyvanrad, Melika Sabaghian

Comments: 22 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2512.07360 [pdf, ps, other]: Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation

Authors: Qiming Huang, Hao Ai, Jianbo Jiao

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[188] arXiv:2512.07351 [pdf, ps, other]: Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection

Authors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami Azam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[189] arXiv:2512.07348 [pdf, ps, other]: Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition

Authors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei Zhang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2512.07345 [pdf, ps, other]: Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting

Authors: Shilong Jin, Haoran Duan, Litao Hua, Wentao Huang, Yuan Zhou

Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2512.07338 [pdf, ps, other]: Title: Generalized Referring Expression Segmentation on Aerial Photos

Authors: Luís Marnoto, Alexandre Bernardino, Bruno Martins

Comments: Submitted to IEEE J-STARS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2512.07331 [pdf, ps, other]: Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers

Authors: Kanishk Awadhiya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2512.07328 [pdf, ps, other]: Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation

Authors: Ziyang Mai, Yu-Wing Tai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[194] arXiv:2512.07305 [pdf, ps, other]: Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset

Authors: Tobias Abraham Haider

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2512.07302 [pdf, ps, other]: Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts

Authors: Mingning Guo, Mengwei Wu, Shaoxian Li, Haifeng Li, Chao Tao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[196] arXiv:2512.07276 [pdf, ps, other]: Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery

Authors: Mai Tsujimoto, Junjue Wang, Weihao Xuan, Naoto Yokoya

Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2512.07275 [pdf, ps, other]: Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation

Authors: Siyu Wang, Hua Wang, Huiyu Li, Fan Zhang

Comments: The paper has been accepted by BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198] arXiv:2512.07273 [pdf, ps, other]: Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation

Authors: Zhi Rao, Yucheng Zhou, Benjia Zhou, Yiqing Huang, Sergio Escalera, Jun Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2512.07269 [pdf, ps, other]: Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data

Authors: Mike Diessner, Yannick Tarant

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[200] arXiv:2512.07253 [pdf, ps, other]: Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement

Authors: Handing Xu, Zhenguo Nie, Tairan Peng, Huimin Pan, Xin-Jun Liu

Comments: 18 pages, 8 figures, and 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201] arXiv:2512.07251 [pdf, ps, other]: Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement

Authors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2512.07247 [pdf, ps, other]: Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Authors: Ziming Hong, Tianyu Huang, Runnan Chen, Shanshan Ye, Mingming Gong, Bo Han, Tongliang Liu

Comments: 40 pages, 34 figures, 18 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[203] arXiv:2512.07245 [pdf, ps, other]: Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features

Authors: Toshinori Yamauchi, Hiroshi Kera, Kazuhiko Kawamoto

Comments: 11+6 pages, 8 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2512.07241 [pdf, ps, other]: Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture

Authors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul Islam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2512.07237 [pdf, ps, other]: Title: Unified Camera Positional Encoding for Controlled Video Generation

Authors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei Cai

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2512.07234 [pdf, ps, other]: Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models

Authors: Biao Chen, Lin Zuo, Mengmeng Jing, Kunbin He, Yuchen Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[207] arXiv:2512.07230 [pdf, ps, other]: Title: STRinGS: Selective Text Refinement in Gaussian Splatting

Authors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand Tapaswi

Comments: Accepted to WACV 2026. Project Page, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2512.07229 [pdf, ps, other]: Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery

Authors: Fang Zhou, Zhiqiang Chen, Martin Pavlovski, Yizhong Zhang

Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209] arXiv:2512.07228 [pdf, ps, other]: Title: Towards Robust Protective Perturbation against DeepFake Face Swapping

Authors: Hengyang Yao, Lin Li, Ke Sun, Jianing Qiu, Huiping Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[210] arXiv:2512.07215 [pdf, ps, other]: Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation

Authors: Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[211] arXiv:2512.07211 [pdf, ps, other]: Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds

Authors: Frederik Hagelskjær, Dimitrios Arapis, Steffen Madsen, Thorbjørn Mosekjær Iversen

Comments: 8 pages, 8 figures, 5 tables, ICCR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2512.07206 [pdf, ps, other]: Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT

Authors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213] arXiv:2512.07203 [pdf, ps, other]: Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning

Authors: Xuhui Zheng, Kang An, Ziliang Wang, Yuhang Wang, Faqiang Qian, Yichao Wu

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2512.07201 [pdf, ps, other]: Title: Understanding Diffusion Models via Code Execution

Authors: Cheng Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[215] arXiv:2512.07198 [pdf, ps, other]: Title: Generating Storytelling Images with Rich Chains-of-Reasoning

Authors: Xiujie Song, Qi Jia, Shota Watanabe, Xiaoyi Pang, Ruijie Chen, Mengyue Wu, Kenny Q. Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[216] arXiv:2512.07197 [pdf, ps, other]: Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting

Authors: Seokhyun Youn, Soohyun Lee, Geonho Kim, Weeyoung Kwon, Sung-Ho Bae, Jihyong Oh

Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2512.07192 [pdf, ps, other]: Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression

Authors: Niu Yi, Xu Tianyi, Ma Mingming, Wang Xinkun

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2512.07191 [pdf, ps, other]: Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction

Authors: Wenqi Zhao, Jiacheng Sang, Fenghua Cheng, Yonglu Shu, Dong Li, Xiaofeng Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2512.07190 [pdf, ps, other]: Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification

Authors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2512.07186 [pdf, ps, other]: Title: START: Spatial and Textual Learning for Chart Understanding

Authors: Zhuoming Liu, Xiaofeng Gao, Feiyang Niu, Qiaozi Gao, Liu Liu, Robinson Piramuthu

Comments: WACV 2026 Camera Ready

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[221] arXiv:2512.07171 [pdf, ps, other]: Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration

Authors: Shravan Venkatraman, Rakesh Raj Madavan, Pavan Kumar S, Muthu Subash Kavitha

Comments: 21 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2512.07170 [pdf, ps, other]: Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Authors: Jiayang Li, Chengjie Jiang, Junjun Jiang, Pengwei Liang, Jiayi Ma, Liqiang Nie

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223] arXiv:2512.07166 [pdf, ps, other]: Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing

Authors: Siyuan Xu, Yibing Liu, Peilin Chen, Yung-Hui Li, Shiqi Wang, Sam Kwong

Comments: 9 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224] arXiv:2512.07165 [pdf, ps, other]: Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation

Authors: Muyu Xu, Fangneng Zhan, Xiaoqin Zhang, Ling Shao, Shijian Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2512.07155 [pdf, ps, other]: Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics

Authors: Dahyeon Kye, Jeahun Sung, Mingyu Jeon, Jihyong Oh

Comments: Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2512.07141 [pdf, ps, other]: Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models

Authors: Fenghua Weng, Chaochao Lu, Xia Hu, Wenqi Shao, Wenjie Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[227] arXiv:2512.07136 [pdf, ps, other]: Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning

Authors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang Xing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228] arXiv:2512.07135 [pdf, ps, other]: Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning

Authors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229] arXiv:2512.07128 [pdf, ps, other]: Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP

Authors: Chau Truong, Hieu Ta Quang, Dung D. Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230] arXiv:2512.07126 [pdf, ps, other]: Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On

Authors: Shengjie Lu, Zhibin Wan, Jiejie Liu, Quan Zhang, Mingjie Sun

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2512.07110 [pdf, ps, other]: Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection

Authors: Liangwei Jiang, Jinluo Xie, Yecheng Huang, Hua Zhang, Hongyu Yang, Di Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2512.07107 [pdf, ps, other]: Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision

Authors: Jaeyoon Lee, Hojoon Jung, Sungtae Hwang, Jihyong Oh, Jongwon Choi

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2512.07078 [pdf, ps, other]: Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection

Authors: Bo Gao, Jingcheng Tong, Xingsheng Chen, Han Yu, Zichen Li

Comments: 16 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234] arXiv:2512.07076 [pdf, ps, other]: Title: Context-measure: Contextualizing Metric for Camouflage

Authors: Chen-Yang Wang, Gepeng Ji, Song Shao, Ming-Ming Cheng, Deng-Ping Fan

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235] arXiv:2512.07065 [pdf, ps, other]: Title: Persistent Homology-Guided Frequency Filtering for Image Compression

Authors: Anil Chintapalli, Peter Tenholder, Henry Chen, Arjun Rao

Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2512.07062 [pdf, ps, other]: Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction

Authors: Changliang Xia, Chengyou Jia, Minnan Luo, Zhuohang Dang, Xin Shen, Bowen Ping

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[237] arXiv:2512.07052 [pdf, ps, other]: Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting

Authors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo Tartaglione

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2512.07051 [pdf, ps, other]: Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation

Authors: Adnan Munir, Shujaat Khan

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[239] arXiv:2512.07037 [pdf, ps, other]: Title: Evaluating and Preserving High-level Fidelity in Super-Resolution

Authors: Josep M. Rocafort, Shaolin Su, Alexandra Gomez-Villa, Javier Vazquez-Corral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240] arXiv:2512.07034 [pdf, ps, other]: Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues

Authors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit Yeung

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241] arXiv:2512.06981 [pdf, ps, other]: Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation

Authors: Yuemin Wang, Ian Stavness

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[242] arXiv:2512.06949 [pdf, ps, other]: Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology

Authors: Shravan Venkatraman, Muthu Subash Kavitha, Joe Dhanith P R, V Manikandarajan, Jia Wu

Comments: 19 pages, 5 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2512.06921 [pdf, ps, other]: Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification

Authors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen Lei

Comments: Accepted by IEEE ICIA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[244] arXiv:2512.06905 [pdf, ps, other]: Title: Scaling Zero-Shot Reference-to-Video Generation

Authors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen He

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2512.06888 [pdf, ps, other]: Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation

Authors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246] arXiv:2512.06886 [pdf, ps, other]: Title: Balanced Learning for Domain Adaptive Semantic Segmentation

Authors: Wangkai Li, Rui Sun, Bohao Liao, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by International Conference on Machine Learning (ICML 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247] arXiv:2512.06885 [pdf, ps, other]: Title: JoPano: Unified Panorama Generation via Joint Modeling

Authors: Wancheng Feng, Chen An, Zhenliang He, Meina Kan, Shiguang Shan, Lukun Wang

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[248] arXiv:2512.06882 [pdf, ps, other]: Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion

Authors: Yu Zhu, Naoya Chiba, Koichi Hashimoto

Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2512.06877 [pdf, ps, other]: Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification

Authors: Mohammed Q. Alkhatib, Ali Jamali, Swalpa Kumar Roy

Comments: Accepted and presented in ICSPIS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250] arXiv:2512.06870 [pdf, ps, other]: Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective

Authors: Wangkai Li, Rui Sun, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251] arXiv:2512.06866 [pdf, ps, other]: Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior

Authors: Yulin Li, Haokun Gui, Ziyang Fan, Junjie Wang, Bin Kang, Bin Chen, Zhuotao Tian

Comments: Accepted by NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[252] arXiv:2512.06865 [pdf, ps, other]: Title: Spatial Retrieval Augmented Autonomous Driving

Authors: Xiaosong Jia, Chenhe Zhang, Yule Jiang, Songbur Wong, Zhiyuan Zhang, Chen Chen, Shaofeng Zhang, Xuanhe Zhou, Xue Yang, Junchi Yan, Yu-Gang Jiang

Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2512.06864 [pdf, ps, other]: Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training

Authors: Kaixuan Lu, Mehmet Onurcan Kaya, Dim P. Papadopoulos

Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254] arXiv:2512.06862 [pdf, ps, other]: Title: Omni-Referring Image Segmentation

Authors: Qiancheng Zheng, Yunhang Shen, Gen Luo, Baiyang Song, Xing Sun, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255] arXiv:2512.06849 [pdf, ps, other]: Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT

Authors: Matan Atad, Alexander W. Marka, Lisa Steinhelfer, Anna Curto-Vilalta, Yannik Leonhardt, Sarah C. Foreman, Anna-Sophia Walburga Dietrich, Robert Graf, Alexandra S. Gersing, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke, Hendrik Möller

Comments: In submission

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256] arXiv:2512.06845 [pdf, ps, other]: Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection

Authors: Satoshi Hashimoto, Hitoshi Nishimura, Yanan Wang, Mori Kurokawa

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257] arXiv:2512.06840 [pdf, ps, other]: Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles

Authors: Satoshi Hashimoto, Tatsuya Konishi, Tomoya Kaichi, Kazunori Matsumoto, Mori Kurokawa

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258] arXiv:2512.06838 [pdf, ps, other]: Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries

Authors: Jiahao Wang, Zhongwei Jiang, Wenchao Sun, Jiaru Zhong, Haibao Yu, Yuner Zhang, Chenyang Lu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang Wang

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259] arXiv:2512.06818 [pdf, ps, other]: Title: MeshSplatting: Differentiable Rendering with Opaque Meshes

Authors: Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. Lin, Marc Van Droogenbroeck, Andrea Tagliasacchi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260] arXiv:2512.06811 [pdf, ps, other]: Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models

Authors: Xiang Lin, Weixin Li, Shu Guo, Lihong Wang, Di Huang

Comments: Accepted by AAAI 2026 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[261] arXiv:2512.06810 [pdf, ps, other]: Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning

Authors: Yueqian Wang, Songxiang Liu, Disong Wang, Nuo Xu, Guanglu Wan, Huishuai Zhang, Dongyan Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[262] arXiv:2512.06802 [pdf, ps, other]: Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation

Authors: Yutong Wang, Haiyu Zhang, Tianfan Xue, Yu Qiao, Yaohui Wang, Chang Xu, Xinyuan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263] arXiv:2512.06793 [pdf, ps, other]: Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching

Authors: Jiaxin Liu, Gangwei Xu, Xianqi Wang, Chengliang Zhang, Xin Yang

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264] arXiv:2512.06783 [pdf, ps, other]: Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos

Authors: Tobias Leuthold, Michele Xiloyannis, Yves Zimmermann

Comments: 16 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265] arXiv:2512.06774 [pdf, ps, other]: Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting

Authors: Longjie Zhao, Ziming Hong, Zhenyang Ren, Runnan Chen, Mingming Gong, Tongliang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[266] arXiv:2512.06769 [pdf, ps, other]: Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding

Authors: Hang Yin, Xiaomin He, PeiWen Yuan, Yiwei Li, Jiayi Shi, Wenxiao Fan, Shaoxiong Feng, Kan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[267] arXiv:2512.06763 [pdf, ps, other]: Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms

Authors: Chengyang Yan, Mitch Bryson, Donald G. Dansereau

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268] arXiv:2512.06759 [pdf, ps, other]: Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors

Authors: Wenbo Lyu, Yingjun Du, Jinglin Zhao, Xianton Zhen, Ling Shao

Comments: 12 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269] arXiv:2512.06750 [pdf, ps, other]: Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement

Authors: Weiqi Li, Xuanyu Zhang, Bin Chen, Jingfen Xie, Yan Wang, Kexin Zhang, Junlin Li, Li Zhang, Jian Zhang, Shijie Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270] arXiv:2512.06746 [pdf, ps, other]: Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection

Authors: Ruoxin Chen, Jiahui Gao, Kaiqing Lin, Keyue Zhang, Yandan Zhao, Isabel Guan, Taiping Yao, Shouhong Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[271] arXiv:2512.06738 [pdf, ps, other]: Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation

Authors: M Yashwanth, Sampath Koti, Arunabh Singh, Shyam Marjit, Anirban Chakraborty

Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272] arXiv:2512.06736 [pdf, ps, other]: Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data

Authors: Jiaxing Fan, Jiaojiao Liu, Wenkong Wang, Yang Zhang, Xin Ma, Jichen Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2512.06726 [pdf, ps, other]: Title: The Role of Entropy in Visual Grounding: Analysis and Optimization

Authors: Shuo Li, Jiajun Sun, Zhihao Zhang, Xiaoran Fan, Senjie Jin, Hui Li, Yuming Yang, Junjie Ye, Lixing Shen, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[274] arXiv:2512.06689 [pdf, ps, other]: Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation

Authors: Jisoo Park, Seonghak Lee, Guisik Kim, Taewoo Kim, Junseok Kwon

Comments: Accepted to ASRU 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[275] arXiv:2512.06684 [pdf, ps, other]: Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy

Authors: Yumeng He, Zanwei Zhou, Yekun Zheng, Chen Liang, Yunbo Wang, Xiaokang Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276] arXiv:2512.06674 [pdf, ps, other]: Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models

Authors: Songping Wang, Rufan Qian, Yueming Lyu, Qinglong Liu, Linzhuang Zou, Jie Qin, Songhua Liu, Caifeng Shan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277] arXiv:2512.06673 [pdf, ps, other]: Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning

Authors: Shida Gao, Feng Xue, Xiangfeng Wang, Anlong Ming, Teng Long, Yihua Shao, Haozhe Wang, Zhaowen Lin, Wei Wang, Nicu Sebe

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278] arXiv:2512.06663 [pdf, ps, other]: Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks

Authors: Yu Qi, Yumeng Zhang, Chenting Gong, Xiao Tan, Weiming Zhang, Wei Zhang, Jingdong Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279] arXiv:2512.06662 [pdf, ps, other]: Title: Personalized Image Descriptions from Attention Sequences

Authors: Ruoyu Xue, Hieu Le, Jingyi Xu, Sounak Mondal, Abe Leite, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

Comments: 10 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280] arXiv:2512.06657 [pdf, ps, other]: Title: TextMamba: Scene Text Detector with Mamba

Authors: Qiyan Zhao, Yue Yan, Da-Han Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[281] arXiv:2512.06642 [pdf, ps, other]: Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution

Authors: Achmad Ardani Prasha, Clavino Ourizqi Rachmadi, Muhamad Fauzan Ibnu Syahlan, Naufal Rahfi Anugerah, Nanda Garin Raditya, Putri Amelia, Sabrina Laila Mutiara, Hilman Syachr Ramadhan

Comments: 21 pages, 7 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[282] arXiv:2512.06613 [pdf, ps, other]: Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach

Authors: Yueying Ke

Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283] arXiv:2512.06612 [pdf, ps, other]: Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics

Authors: Kazuya Nishimura, Haruka Hirose, Ryoma Bise, Kaito Shiku, Yasuhiro Kojima

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284] arXiv:2512.06598 [pdf, ps, other]: Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain

Authors: Muhammad Adil, Patrick J. Clemins, Andrew W. Schroth, Panagiotis D. Oikonomou, Donna M. Rizzo, Peter D. F. Isles, Xiaohan Zhang, Kareem I. Hannoun, Scott Turnbull, Noah B. Beckage, Asim Zia, Safwan Wshah

Comments: 23 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285] arXiv:2512.06581 [pdf, ps, other]: Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding

Authors: Yuhao Su, Anwesa Choudhuri, Zhongpai Gao, Benjamin Planche, Van Nguyen Nguyen, Meng Zheng, Yuhan Shen, Arun Innanje, Terrence Chen, Ehsan Elhamifar, Ziyan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286] arXiv:2512.06575 [pdf, ps, other]: Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules

Authors: Fariza Dahes

Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287] arXiv:2512.06565 [pdf, ps, other]: Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation

Authors: Xiujin Liu

Comments: 1 figure, 2 tables, 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288] arXiv:2512.06562 [pdf, ps, other]: Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities

Authors: Dung Thuy Nguyen, Quang Nguyen, Preston K. Robinette, Eli Jiang, Taylor T. Johnson, Kevin Leach

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[289] arXiv:2512.06560 [pdf, ps, other]: Title: Bridging spatial awareness and global context in medical image segmentation

Authors: Dalia Alzu'bi, A. Ben Hamza

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290] arXiv:2512.06531 [pdf, ps, other]: Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images

Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[291] arXiv:2512.06530 [pdf, ps, other]: Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization

Authors: Mohammed Wattad, Tamir Shor, Alex Bronstein

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292] arXiv:2512.06521 [pdf, ps, other]: Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images

Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)

Comments: 31 pages + appendix

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293] arXiv:2512.06504 [pdf, ps, other]: Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion

Authors: Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana Zahorodnia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[294] arXiv:2512.06485 [pdf, ps, other]: Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction

Authors: Kush Revankar, Shreyas Deshpande, Araham Sayeed, Ansh Tandale, Sarika Bobde

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295] arXiv:2512.06447 [pdf, ps, other]: Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities

Authors: Jiuyi Chen, Mingkui Tan, Haifeng Lu, Qiuna Xu, Zhihua Wang, Runhao Zeng, Xiping Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296] arXiv:2512.06438 [pdf, ps, other]: Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars

Authors: Ramazan Fazylov, Sergey Zagoruyko, Aleksandr Parkin, Stamatis Lefkimmiatis, Ivan Laptev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297] arXiv:2512.06434 [pdf, ps, other]: Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening

Authors: Lucas R. Mareque, Ricardo L. Armentano, Leandro J. Cymberknop

Comments: 8 pages, 2 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298] arXiv:2512.06426 [pdf, ps, other]: Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition

Authors: Nzakiese Mbongo, Kailash A. Hambarde, Hugo Proença

Comments: 12 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299] arXiv:2512.06424 [pdf, ps, other]: Title: DragMesh: Interactive 3D Generation Made Easy

Authors: Tianshan Zhang, Zeyu Zhang, Hao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300] arXiv:2512.06422 [pdf, ps, other]: Title: A Perception CNN for Facial Expression Recognition

Authors: Chunwei Tian, Jingyuan Xie, Lingjun Li, Wangmeng Zuo, Yanning Zhang, David Zhang

Comments: in IEEE Transactions on Image Processing (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301] arXiv:2512.06421 [pdf, ps, other]: Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Authors: Gengze Zhou, Chongjian Ge, Hao Tan, Feng Liu, Yicong Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[302] arXiv:2512.06400 [pdf, ps, other]: Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement

Authors: Jing Tao, Yonghong Zong, Banglei Guan, Pengju Sun, Taihang Lei, Yang Shang, Qifeng Yu

Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303] arXiv:2512.06379 [pdf, ps, other]: Title: OCFER-Net: Recognizing Facial Expression in Online Learning System

Authors: Yi Huo, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304] arXiv:2512.06377 [pdf, ps, other]: Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System

Authors: Yi Huo, Yun Ge

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305] arXiv:2512.06376 [pdf, ps, other]: Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework

Authors: Xinhao Xiang, Abhijeet Rastogi, Jiawei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306] arXiv:2512.06373 [pdf, ps, other]: Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

Authors: Yuji Wang, Wenlong Liu, Jingxuan Niu, Haoji Zhang, Yansong Tang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2512.06368 [pdf, ps, other]: Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos

Authors: Weitao Xiong, Zhiyuan Yuan, Jiahao Lu, Chengfeng Zhao, Peng Li, Yuan Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308] arXiv:2512.06363 [pdf, ps, other]: Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection

Authors: Jiabao Guo, Yadian Wang, Hui Ma, Yuhao Fu, Ju Jia, Hui Liu, Shengeng Tang, Lechao Cheng, Yunfeng Diao, Ajian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309] arXiv:2512.06358 [pdf, ps, other]: Title: Rectifying Latent Space for Generative Single-Image Reflection Removal

Authors: Mingjia Li, Jin Hu, Hainuo Wang, Qiming Hu, Jiarui Wang, Xiaojie Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310] arXiv:2512.06353 [pdf, ps, other]: Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search

Authors: Kaicheng Yang, Kaisen Yang, Baiting Wu, Xun Zhang, Qianrui Yang, Haotong Qin, He Zhang, Yulun Zhang

Comments: Code and Supplementary Material could be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311] arXiv:2512.06345 [pdf, ps, other]: Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes

Authors: Xiangshuai Song, Jun-Jie Huang, Tianrui Liu, Ke Liang, Chang Tang

Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2512.06344 [pdf, ps, other]: Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate

Authors: Kaile Wang, Lijun He, Haisheng Fu, Haixia Bi, Fan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2512.06332 [pdf, ps, other]: Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks

Authors: Jeffrey Gu, Minkyu Jeon, Ambri Ma, Serena Yeung-Levy, Ellen D. Zhong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314] arXiv:2512.06330 [pdf, ps, other]: Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening

Authors: Haoyu Zhang, Junhan Luo, Yugang Cao, Siran Peng, Jie Huang, Liang-Jian Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315] arXiv:2512.06328 [pdf, ps, other]: Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models

Authors: Jiahao Li, Yusheng Luo, Yunzhong Lou, Xiangdong Zhou

Comments: Accepted as an Oral presentation at AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316] arXiv:2512.06306 [pdf, ps, other]: Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

Authors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xiaoming Chen, Weidong Cai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317] arXiv:2512.06290 [pdf, ps, other]: Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification

Authors: Yiheng Huang, Shuang She, Zewei Wei, Jianmin Lin, Ming Yang, Wenyin Liu

Comments: 17 pages, 5 figures

Journal-ref: ICDAR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318] arXiv:2512.06282 [pdf, ps, other]: Title: A Sleep Monitoring System Based on Audio, Video and Depth Information

Authors: Lyn Chao-ling Chen, Kuan-Wen Chen, Yi-Ping Hung

Comments: Accepted in the Computer Vision, Graphics and Image Processing (CVGIP 2013)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[319] arXiv:2512.06281 [pdf, ps, other]: Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models

Authors: Hengzhuang Li, Xinsong Zhang, Qiming Peng, Bin Luo, Han Hu, Dengyang Jiang, Han-Jia Ye, Teng Zhang, Hai Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320] arXiv:2512.06276 [pdf, ps, other]: Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension

Authors: Tianyi Gao, Hao Li, Han Fang, Xin Wei, Xiaodong Dong, Hongbo Sun, Ye Yuan, Zhongjiang He, Jinglin Xu, Jingmin Xin, Hao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[321] arXiv:2512.06275 [pdf, ps, other]: Title: FacePhys: State of the Heart Learning

Authors: Kegang Wang, Jiankai Tang, Yuntao Wang, Xin Liu, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Daniel McDuff

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2512.06269 [pdf, ps, other]: Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting

Authors: Quan Tran, Tuan Dang

Comments: 10 pages

Journal-ref: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323] arXiv:2512.06258 [pdf, ps, other]: Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs

Authors: Chaoyang Wang, Yangfan He, Yiyang Zhou, Yixuan Wang, Jiaqi Liu, Peng Xia, Zhengzhong Tu, Mohit Bansal, Huaxiu Yao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324] arXiv:2512.06255 [pdf, ps, other]: Title: Language-driven Fine-grained Retrieval

Authors: Shijie Wang, Xin Yu, Yadan Luo, Zijian Wang, Pengfei Zhang, Zi Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325] arXiv:2512.06251 [pdf, ps, other]: Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks

Authors: Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming Zhang

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2512.06232 [pdf, ps, other]: Title: Opinion: Learning Intuitive Physics May Require More than Visual Data

Authors: Ellen Su, Solim Legris, Todd M. Gureckis, Mengye Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[327] arXiv:2512.06230 [pdf, ps, other]: Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking

Authors: Pranav Balakrishnan, Sidisha Barik, Sean M. O'Rourke, Benjamin M. Marlin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328] arXiv:2512.06221 [pdf, ps, other]: Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study

Authors: Alena Makarova

Comments: 15 pages, 13 figures. Reproducibility study

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329] arXiv:2512.06206 [pdf, ps, other]: Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning

Authors: Akis Linardos, Sarthak Pati, Ujjwal Baid, Brandon Edwards, Patrick Foley, Kevin Ta, Verena Chung, Micah Sheller, Muhammad Irfan Khan, Mojtaba Jafaritadi, Elina Kontio, Suleiman Khan, Leon Mächler, Ivan Ezhov, Suprosanna Shit, Johannes C. Paetzold, Gustav Grimberg, Manuel A. Nickel, David Naccache, Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni, Daewoon Kim, Leonard L. Klausmann, Prashant Shah, Bjoern Menze, Dimitrios Makris, Spyridon Bakas

Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL

Journal-ref: Machine Learning for Biomedical Imaging 3 (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330] arXiv:2512.06190 [pdf, ps, other]: Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying

Authors: Shichen Li, Ahmadreza Eslaminia, Chenhui Shao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[331] arXiv:2512.06185 [pdf, ps, other]: Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling

Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)

Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332] arXiv:2512.06179 [pdf, ps, other]: Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction

Authors: Shilin Hu, Jingyi Xu, Sagnik Das, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333] arXiv:2512.06174 [pdf, ps, other]: Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction

Authors: Shilin Hu, Jingyi Xu, Akshat Dave, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334] arXiv:2512.06171 [pdf, ps, other]: Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection

Authors: Jessica Plassmann, Nicolas Schuler, Michael Schuth, Georg von Freymann

Comments: 11 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335] arXiv:2512.06158 [pdf, ps, other]: Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation

Authors: Su Sun, Cheng Zhao, Himangi Mittal, Gaurav Mittal, Rohith Kukkala, Yingjie Victor Chen, Mei Chen

Comments: 15 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336] arXiv:2512.06105 [pdf, ps, other]: Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation

Authors: Junwen Zheng, Xinran Xu, Li Rong Wang, Chang Cai, Lucinda Siyun Tan, Dingyuan Wang, Hong Liang Tey, Xiuyi Fan

Comments: AAAI-26-AIA

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[337] arXiv:2512.06103 [pdf, ps, other]: Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection

Authors: Raghavendra Ramachandra, Sushma Venkatesh

Comments: Accepted in IEEE T-BIOM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338] arXiv:2512.06096 [pdf, ps, other]: Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving

Authors: Karthik Mohan, Sonam Singh, Amit Arvind Kale

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339] arXiv:2512.06080 [pdf, ps, other]: Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light

Authors: Tzofi Klinghoffer, Siddharth Somasundaram, Xiaoyu Xiang, Yuchen Fan, Christian Richardt, Akshat Dave, Ramesh Raskar, Rakesh Ranjan

Comments: SIGGRAPH Asia 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340] arXiv:2512.06065 [pdf, ps, other]: Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

Authors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi Menapace

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[341] arXiv:2512.06058 [pdf, ps, other]: Title: Representation Learning for Point Cloud Understanding

Authors: Siming Yan

Comments: 181 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[342] arXiv:2512.06032 [pdf, ps, other]: Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation

Authors: Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[343] arXiv:2512.06024 [pdf, ps, other]: Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing

Authors: Jiabin Liu, Zihao Zhou, Jialei Yan, Anxin Guo, Alvise Benetazzo, Hui Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[344] arXiv:2512.06020 [pdf, ps, other]: Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation

Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Ligong Han, Ying Ba, Dimitris N. Metaxas

Comments: Project Page: https://prefgen.github.io/

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[345] arXiv:2512.06014 [pdf, ps, other]: Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets

Authors: Jiho Shin, Dominic Marshall, Matthieu Komorowski

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346] arXiv:2512.06013 [pdf, ps, other]: Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT

Authors: Wenhao Li, Chengwei Ma, Weixin Mao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[347] arXiv:2512.06012 [pdf, ps, other]: Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing

Authors: Emmanuel Akeweje, Conall Kirk, Chi-Wai Chan, Denis Dowling, Mimi Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[348] arXiv:2512.06010 [pdf, other]: Title: Fast and Flexible Robustness Certificates for Semantic Segmentation

Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349] arXiv:2512.06006 [pdf, ps, other]: Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization

Authors: Xuefei (Julie) Wang, Kai A. Horstmann, Ethan Lin, Jonathan Chen, Alexander R. Farhang, Sophia Stiles, Atharva Sehgal, Jonathan Light, David Van Valen, Yisong Yue, Jennifer J. Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[350] arXiv:2512.06003 [pdf, ps, other]: Title: PrunedCaps: A Case For Primary Capsules Discrimination

Authors: Ramin Sharifi, Pouya Shiri, Amirali Baniasadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351] arXiv:2512.05996 [pdf, ps, other]: Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting

Authors: Yi Liu, Jingyu Song, Vedanth Kallakuri, Katherine A. Skinner

Comments: 18 pages, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[352] arXiv:2512.05993 [pdf, ps, other]: Title: Domain-Specific Foundation Model Improves AI-Based Analysis of Neuropathology

Authors: Ruchika Verma, Shrishtee Kandoi, Robina Afzal, Shengjia Chen, Jannes Jegminat, Michael W. Karlovich, Melissa Umphlett, Timothy E. Richardson, Kevin Clare, Quazi Hossain, Jorge Samanamud, Phyllis L. Faust, Elan D. Louis, Ann C. McKee, Thor D. Stein, Jonathan D. Cherry, Jesse Mez, Anya C. McGoldrick, Dalilah D. Quintana Mora, Melissa J. Nirenberg, Ruth H. Walker, Yolfrankcis Mendez, Susan Morgello, Dennis W. Dickson, Melissa E. Murray, Carlos Cordon-Cardo, Nadejda M. Tsankova, Jamie M. Walker, Diana K. Dangoor, Stephanie McQuillan, Emma L. Thorn, Claudia De Sanctis, Shuying Li, Thomas J. Fuchs, Kurt Farrell, John F. Crary, Gabriele Campanella

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[353] arXiv:2512.05991 [pdf, ps, other]: Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head

Authors: Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[354] arXiv:2512.05988 [pdf, ps, other]: Title: VG3T: Visual Geometry Grounded Gaussian Transformer

Authors: Junho Kim, Seongwon Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[355] arXiv:2512.05987 [pdf, ps, other]: Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning

Authors: Chenyue Yu, Jianyu Yu

Comments: Accepted by ICCPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[356] arXiv:2512.05969 [pdf, ps, other]: Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices

Authors: Hokin Deng

Comments: See results (this https URL) and code (this https URL)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[357] arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]: Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs

Authors: Sujoy Nath, Arkaprabha Basu, Sharanya Dasgupta, Swagatam Das

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[358] arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]: Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Sharipov Hotam Beknazarovich, Farzona S. Ataeva, Qurbonaliev Alisher, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[359] arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]: Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[360] arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]: Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models

Authors: Shimin Zhang, Xianwei Chen, Yufan Shen, Ziyuan Ye, Jibin Wu

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[361] arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]: Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces

Authors: Nikita Gabdullin

Comments: 9 pages, 5 figures, 1 table, 4 equations

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[362] arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]: Title: Human Geometry Distribution for 3D Animation Generation

Authors: Xiangjun Tang, Biao Zhang, Peter Wonka

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[363] arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]: Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models

Authors: Chenwei Shi, Xueyu Luan

Comments: 23 pages, 8 figures, 3 tables

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[364] arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]: Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Authors: Haidong Kang, Jun Du, Lihong Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[365] arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]: Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood

Authors: Gilhyun Nam, Taewon Kim, Joonhyun Jeong, Eunho Yang

Comments: Accepted to WACV 2026

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[366] arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]: Title: A Geometric Unification of Concept Learning with Concept Cones

Authors: Alexandre Rocchi--Henry, Thomas Fel, Gianni Franchi

Comments: 22 pages

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[367] arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]: Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising

Authors: Tharindu Wickremasinghe, Marco F. Duarte

Comments: Asilomar Conference on Signals, Systems, and Computers 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[368] arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]: Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics

Authors: Tianyi Ren, Daniel Low, Pittra Jaengprajak, Juampablo Heras Rivera, Jacob Ruzevick, Mehmet Kurt

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[369] arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]: Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers

Authors: Jonghyun Park, Jong Chul Ye

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[370] arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]: Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search

Authors: Tanay Arora, Christof Teuscher

Comments: This work is planned for submission to the IEEE for possible publication

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[371] arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]: Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning

Authors: Nithin Sivakumaran, Justin Chih-Yao Chen, David Wan, Yue Zhang, Jaehong Yoon, Elias Stengel-Eskin, Mohit Bansal

Comments: Code: this https URL

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[372] arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]: Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving

Authors: Zebin Xing, Yupeng Zheng, Qichao Zhang, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin Zhao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[373] arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]: Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis

Authors: Sakib Mostafa, Lei Xing, Md. Tauhidul Islam

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[374] arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]: Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients

Authors: Krishna Arun, Moinak Bhattacharya, Paras Goel

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[375] arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]: Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Authors: Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining Guo

Comments: Project page: this https URL

Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[376] arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]: Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge

Authors: Ilia Larchenko, Gleb Zarin, Akash Karnatak

Comments: 2025 NeurIPS Behavior Challenge 1st place solution

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[377] arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]: Title: Dynamic Visual SLAM using a General 3D Prior

Authors: Xingguang Zhong, Liren Jin, Marija Popović, Jens Behley, Cyrill Stachniss

Comments: 8 pages

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[378] arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]: Title: AquaFusionNet: Lightweight Vision-Sensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices

Authors: Sepyan Purnama Kristanto, Lutfi Hakim, Hermansyah

Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[379] arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]: Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association

Authors: Zhihua Fang, Shumei Tao, Junxu Wang, Liang He

Comments: FAME 2026 Technical Report

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[380] arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]: Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics

Authors: Nikhil Verma, Joonas Linnosmaa, Espinosa-Leal Leonardo, Napat Vajragupta

Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[381] arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]: Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data

Authors: Lin Yang, Xiang Li, Xin Ma, Xinxin Zhao

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[382] arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

Authors: Panagiota Kiourti, Anu Singh, Preeti Duraipandian, Weichao Zhou, Wenchao Li

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[383] arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]: Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning

Authors: Camellia Zakaria, Aryan Sadeghi, Weaam Jaafar, Junshi Xu, Alex Mariakakis, Marianne Hatzopoulou

Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[384] arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]: Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network

Authors: Xiao Li

Comments: in Chinese language

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[385] arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]: Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Authors: Ruicheng Zhang, Mingyang Zhang, Jun Zhou, Zhangrui Guo, Xiaofan Liu, Zunnan Xu, Zhizhou Zhong, Puxin Yan, Haocheng Luo, Xiu Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[386] arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]: Title: Vector Quantization using Gaussian Variational Autoencoder

Authors: Tongda Xu, Wendi Zheng, Jiajun He, Jose Miguel Hernandez-Lobato, Yan Wang, Ya-Qin Zhang, Jie Tang

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[387] arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]: Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Authors: Xiaojun Jia, Jie Liao, Qi Guo, Teng Ma, Simeng Qin, Ranjie Duan, Tianlin Li, Yihao Huang, Zhitao Zeng, Dongxian Wu, Yiming Li, Wenqi Ren, Xiaochun Cao, Yang Liu

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[388] arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]: Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers

Authors: Hochul Hwang, Soowan Yang, Jahir Sadik Monon, Nicholas A Giudice, Sunghoon Ivan Lee, Joydeep Biswas, Donghyun Kim

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[389] arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]: Title: Semantic Temporal Single-photon LiDAR

Authors: Fang Li, Tonglin Mu, Shuling Li, Junran Guo, Keyuan Li, Jianing Li, Ziyang Luo, Xiaodong Fan, Ye Chen, Yunfeng Liu, Hong Cai, Lip Ket Chin, Jinbei Zhang, Shihai Sun

Comments: 14 pages, 5 figures. Comments are welcome

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[390] arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]: Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation

Authors: Azeez Idris, Abdurahman Ali Mohammed, Samuel Fanijo

Comments: NeurIPS Black in AI workshop - 2022

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025

[391] arXiv:2512.05965 [pdf, ps, other]: Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor

Authors: Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, Xunliang Cai, Linjiang Huang, Hongsheng Li, Si Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392] arXiv:2512.05960 [pdf, ps, other]: Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement

Authors: Munsif Ali, Najmul Hassan, Lucia Ventura, Davide Di Bari, Simonepietro Canese

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393] arXiv:2512.05941 [pdf, ps, other]: Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding

Authors: Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong Liu

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394] arXiv:2512.05937 [pdf, ps, other]: Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception

Authors: Anne Sielemann, Valentin Barner, Stefan Wolf, Masoud Roschani, Jens Ziehn, Juergen Beyerer

Comments: 8 pages, 2 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395] arXiv:2512.05936 [pdf, ps, other]: Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition

Authors: Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens Ziehn

Comments: 8 pages, 8 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396] arXiv:2512.05928 [pdf, ps, other]: Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition

Authors: Pedro Vidal, Bernardo Biesseck, Luiz E. L. Coelho, Roger Granada, David Menotti

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397] arXiv:2512.05927 [pdf, ps, other]: Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Authors: Zhiting Mei, Tenny Yin, Micah Baker, Ola Shorinwa, Anirudha Majumdar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398] arXiv:2512.05922 [pdf, ps, other]: Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation

Authors: Khang Le, Anh Mai Vu, Thi Kim Trang Vo, Ha Thach, Ngoc Bui Lam Quang, Thanh-Huy Nguyen, Minh H. N. Le, Zhu Han, Chandra Mohan, Hien Van Nguyen

Comments: Note: Khang Le and Anh Mai Vu contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399] arXiv:2512.05920 [pdf, ps, other]: Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction

Authors: Jiawen Yang, Yihui Cao, Xuanyu Tian, Yuyao Zhang, Hongjiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400] arXiv:2512.05905 [pdf, ps, other]: Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Authors: Wenhao Yan, Sheng Ye, Zhuoyi Yang, Jiayan Teng, ZhenHui Dong, Kairui Wen, Xiaotao Gu, Yong-Jin Liu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401] arXiv:2512.05866 [pdf, ps, other]: Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator

Authors: Md. Mahbub Hasan Akash, Aria Tasnim Mridula, Sheekar Banerjee, Ishtiak Al Mamoon

Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402] arXiv:2512.05859 [pdf, ps, other]: Title: Edit-aware RAW Reconstruction

Authors: Abhijith Punnappurath, Luxi Zhao, Ke Zhao, Hue Nguyen, Radek Grzeszczuk, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2512.05853 [pdf, ps, other]: Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack

Authors: Shiji Zhao, Shukun Xiong, Yao Huang, Yan Jin, Zhenyu Wu, Jiyang Guan, Ranjie Duan, Jialing Tao, Hui Xue, Xingxing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404] arXiv:2512.05830 [pdf, ps, other]: Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning

Authors: Muhammet Cagri Yeke, Samil Sirin, Kivilcim Yuksel, Abdurrahman Gumus

Comments: 22 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405] arXiv:2512.05814 [pdf, ps, other]: Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection

Authors: Fubao Zhu, Zhanyuan Jia, Zhiguo Wang, Huan Huang, Danyang Sun, Chuang Han, Yanting Li, Jiaofen Nan, Chen Zhao, Weihua Zhou

Comments: The code is already available on GitHub: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2512.05809 [pdf, ps, other]: Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling

Authors: Saurav Jha, M. Jehanzeb Mirza, Wei Lin, Shiqi Yang, Sarath Chandar

Comments: Extended abstract at World Modeling Workshop 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407] arXiv:2512.05802 [pdf, ps, other]: Title: Bring Your Dreams to Life: Continual Text-to-Video Customization

Authors: Jiahua Dong, Xudong Wang, Wenqi Liang, Zongyan Han, Meng Cao, Duzhen Zhang, Hanbin Zhao, Zhi Han, Salman Khan, Fahad Shahbaz Khan

Comments: Accepted to AAAI2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408] arXiv:2512.05783 [pdf, ps, other]: Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth

Authors: Maryam Yousefi, Soodeh Bakhshandeh

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409] arXiv:2512.05774 [pdf, ps, other]: Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding

Authors: Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, Juan Carlos Niebles

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410] arXiv:2512.05762 [pdf, ps, other]: Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators

Authors: Ruochen Chen, Thuy Tran, Shaifali Parashar

Comments: Accepted for WACV

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411] arXiv:2512.05759 [pdf, ps, other]: Title: Label-Efficient Point Cloud Segmentation with Active Learning

Authors: Johannes Meyer, Jasper Hoffmann, Felix Schulz, Dominik Merkle, Daniel Buescher, Alexander Reiterer, Joschka Boedecker, Wolfram Burgard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412] arXiv:2512.05754 [pdf, ps, other]: Title: USV: Unified Sparsification for Accelerating Video Diffusion Models

Authors: Xinjian Wu, Hongmei Wang, Yuan Zhou, Qinglin Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413] arXiv:2512.05746 [pdf, ps, other]: Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models

Authors: Shizhuo Mao, Hongtao Zou, Qihu Xie, Song Chen, Yi Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414] arXiv:2512.05740 [pdf, ps, other]: Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision

Authors: Lennart Maack, Julia-Kristin Graß, Lisa-Marie Toscha, Nathaniel Melling, Alexander Schlaefer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415] arXiv:2512.05710 [pdf, ps, other]: Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning

Authors: Jianan Sun, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416] arXiv:2512.05698 [pdf, ps, other]: Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning

Authors: Xusheng Guo, Wanfa Zhang, Shijia Zhao, Qiming Xia, Xiaolong Xie, Mingming Wang, Hai Wu, Chenglu Wen

Comments: The 40th Annual AAAI Conference on Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417] arXiv:2512.05683 [pdf, ps, other]: Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction

Authors: Yong En Kok, Bowen Deng, Alexander Bentley, Andrew J. Parkes, Michael G. Somekh, Amanda J. Wright, Michael P. Pound

Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418] arXiv:2512.05674 [pdf, ps, other]: Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization

Authors: Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2512.05672 [pdf, ps, other]: Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem

Authors: Yeobin Hong, Suhyeon Lee, Hyungjin Chung, Jong Chul Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420] arXiv:2512.05669 [pdf, ps, other]: Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features

Authors: Talha Enes Koksal, Abdurrahman Gumus

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421] arXiv:2512.05663 [pdf, ps, other]: Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection

Authors: Johannes Meier, Jonathan Michel, Oussema Dhaouadi, Yung-Hsu Yang, Christoph Reich, Zuria Bauer, Stefan Roth, Marc Pollefeys, Jacques Kaiser, Daniel Cremers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422] arXiv:2512.05651 [pdf, ps, other]: Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective

Authors: Nan Zhong, Mian Zou, Yiran Xu, Zhenxing Qian, Xinpeng Zhang, Baoyuan Wu, Kede Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2512.05635 [pdf, ps, other]: Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data

Authors: Georgy Perevozchikov, Nancy Mehta, Egor Ershov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2512.05613 [pdf, ps, other]: Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model

Authors: Pasquale De Marinis, Pieter M. Blok, Uzay Kaymak, Rogier Brussee, Gennaro Vessio, Giovanna Castellano

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2512.05610 [pdf, ps, other]: Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections

Authors: Juho Korkeala, Jesse Muhojoki, Josef Taher, Klaara Salolahti, Matti Hyyppä, Antero Kukko, Juha Hyyppä

Comments: 19 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426] arXiv:2512.05597 [pdf, ps, other]: Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction

Authors: Ruihong Yin, Xuepeng Shi, Oleksandr Bailo, Marco Manfredi, Theo Gevers

Comments: 10 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2512.05593 [pdf, ps, other]: Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer

Authors: Rong Wang, Wei Mao, Changsheng Lu, Hongdong Li

Comments: Accepted to 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428] arXiv:2512.05571 [pdf, ps, other]: Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging

Authors: Xingyu Zhang, Anna Reithmeir, Fryderyk Kögl, Rickmer Braren, Julia A. Schnabel, Daniel M. Lang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429] arXiv:2512.05564 [pdf, ps, other]: Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Authors: Zijun Wang, Panwen Hu, Jing Wang, Terry Jingchen Zhang, Yuhao Cheng, Long Chen, Yiqiang Yan, Zutao Jiang, Hanhui Li, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430] arXiv:2512.05557 [pdf, ps, other]: Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency

Authors: Xingxi Yin, Yicheng Li, Gong Yan, Chenglin Li, Jian Zhao, Cong Huang, Yue Deng, Yin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431] arXiv:2512.05546 [pdf, ps, other]: Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models

Authors: Weijue Bu, Guan Yuan, Guixian Zhang

Comments: 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432] arXiv:2512.05539 [pdf, ps, other]: Title: Ideal Observer for Segmentation of Dead Leaves Images

Authors: Swantje Mahncke, Malte Ott

Comments: 41 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433] arXiv:2512.05529 [pdf, ps, other]: Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors

Authors: Kunyi Yang, Qingyu Wang, Cheng Yuan, Yutong Ban

Comments: The first two authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434] arXiv:2512.05524 [pdf, ps, other]: Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435] arXiv:2512.05515 [pdf, ps, other]: Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis

Authors: Yuhua Wen, Qifei Li, Yingying Zhou, Yingming Gao, Zhengqi Wen, Jianhua Tao, Ya Li

Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436] arXiv:2512.05513 [pdf, ps, other]: Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437] arXiv:2512.05511 [pdf, ps, other]: Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm

Authors: Chuang Yu, Jinmiao Zhao, Yunpeng Liu, Yaokun Li, Xiujun Shu, Yuanhao Feng, Bo Wang, Yimian Dai, Xiangyu Yue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438] arXiv:2512.05494 [pdf, ps, other]: Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation

Authors: Fan Zhang, Zhiwei Gu, Hua Wang

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439] arXiv:2512.05492 [pdf, ps, other]: Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field

Authors: Qi Zhu, Jingyi Zhang, Naishan Zheng, Wei Yu, Jinghao Zhang, Deyi Ji, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440] arXiv:2512.05482 [pdf, ps, other]: Title: Concept-based Explainable Data Mining with VLM for 3D Detection

Authors: Mai Tsujimoto

Comments: 28 pages including appendix. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441] arXiv:2512.05481 [pdf, ps, other]: Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion

Authors: Jialin Li, Yiwei Ren, Kai Pan, Dong Wei, Pujin Cheng, Xian Wu, Xiaoying Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442] arXiv:2512.05478 [pdf, ps, other]: Title: EmoStyle: Emotion-Driven Image Stylization

Authors: Jingyuan Yang, Zihuan Bai, Hui Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443] arXiv:2512.05468 [pdf, ps, other]: Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system

Authors: Takara Taniguchi, Yudai Ueda, Atsuya Muramatsu, Kohki Hashimoto, Ryo Yagi, Hideya Ochiai, Chaodit Aswakul

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444] arXiv:2512.05446 [pdf, ps, other]: Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression

Authors: Cheng-Yuan Ho, He-Bi Yang, Jui-Chiu Chiang, Yu-Lun Liu, Wen-Hsiao Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445] arXiv:2512.05422 [pdf, ps, other]: Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction

Authors: Jiangtong Tan, Lin Liu, Jie Huanng, Xiaopeng Zhang, Qi Tian, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446] arXiv:2512.05418 [pdf, ps, other]: Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447] arXiv:2512.05415 [pdf, ps, other]: Title: Moving object detection from multi-depth images with an attention-enhanced CNN

Authors: Masato Shibukawa, Fumi Yoshida, Toshifumi Yanagisawa, Takashi Ito, Hirohisa Kurosaki, Makoto Yoshikawa, Kohki Kamiya, Ji-an Jiang, Wesley Fraser, JJ Kavelaars, Susan Benecchi, Anne Verbiscer, Akira Hatakeyama, Hosei O, Naoya Ozaki

Comments: 14 pages, 22 figures, submitted to PASJ

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448] arXiv:2512.05412 [pdf, ps, other]: Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449] arXiv:2512.05410 [pdf, ps, other]: Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450] arXiv:2512.05398 [pdf, ps, other]: Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos

Authors: Zhuoyuan Wu, Xurui Yang, Jiahui Huang, Yue Wang, Jun Gao

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451] arXiv:2512.05394 [pdf, ps, other]: Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability

Authors: Shizhan Liu, Xinran Deng, Zhuoyi Yang, Jiayan Teng, Xiaotao Gu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452] arXiv:2512.05391 [pdf, ps, other]: Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models

Authors: Qingqiao Hu, Weimin Lyu, Meilong Xu, Kehan Qi, Xiaoling Hu, Saumya Gupta, Jiawei Zhou, Chao Chen

Comments: 20 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453] arXiv:2512.05385 [pdf, ps, other]: Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration

Authors: Yingjie Xia, Tao Liu, Jinglei Shi, Qingsong Xie, Heng Guo, Jian Yang, Xi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454] arXiv:2512.05362 [pdf, ps, other]: Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation

Authors: Sanchit Kaul, Joseph Luna, Shray Arora

Comments: All code related to this paper can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455] arXiv:2512.05359 [pdf, ps, other]: Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking

Authors: Zekai Shao, Yufan Hu, Jingyuan Liu, Bin Fan, Hongmin Liu

Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456] arXiv:2512.05354 [pdf, ps, other]: Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training

Authors: Yang Zheng, Hao Tan, Kai Zhang, Peng Wang, Leonidas Guibas, Gordon Wetzstein, Wang Yifan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457] arXiv:2512.05343 [pdf, ps, other]: Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

Authors: Elisabetta Fedele, Francis Engelmann, Ian Huang, Or Litany, Marc Pollefeys, Leonidas Guibas

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458] arXiv:2512.05277 [pdf, ps, other]: Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Authors: Kevin Cannons, Saeed Ranjbar Alvar, Mohammad Asiful Hossain, Ahmad Rezaei, Mohsen Gholami, Alireza Heidarikhazaei, Zhou Weimin, Yong Zhang, Mohammad Akbari

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459] arXiv:2512.05272 [pdf, ps, other]: Title: Inferring Compositional 4D Scenes without Ever Seeing One

Authors: Ahmet Berke Gokmen, Ajad Chhatkuli, Luc Van Gool, Danda Pani Paudel

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460] arXiv:2512.05268 [pdf, ps, other]: Title: CARD: Correlation Aware Restoration with Diffusion

Authors: Niki Nezakati, Arnab Ghosh, Amit Roy-Chowdhury, Vishwanath Saragadam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461] arXiv:2512.05259 [pdf, ps, other]: Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization

Authors: Georgios Chatzichristodoulou, Niki Efthymiou, Panagiotis Filntisis, Georgios Pavlakos, Petros Maragos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462] arXiv:2512.05240 [pdf, ps, other]: Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction

Authors: Dmitrii Torbunov, Onur Okuducu, Yi Huang, Odera Dim, Rebecca Coles, Yonggang Cui, Yihui Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463] arXiv:2512.05209 [pdf, ps, other]: Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering

Authors: Vsevolod Plohotnuk, Artyom Panshin, Nikola Banić, Simone Bianco, Michael Freeman, Egor Ershov

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464] arXiv:2512.05198 [pdf, ps, other]: Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models

Authors: Rowan Bradbury, Dazhi Zhong

Comments: 16 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465] arXiv:2512.05172 [pdf, ps, other]: Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning

Authors: Wentao Wang, Chunyang Liu, Kehua Sheng, Bo Zhang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466] arXiv:2512.05152 [pdf, ps, other]: Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models

Authors: Kun Wang, Donglin Di, Tonghua Su, Lei Fan

Comments: 6 pages, 5 figures, published in 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467] arXiv:2512.05150 [pdf, ps, other]: Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Authors: Zhenglin Cheng, Peng Sun, Jianguo Li, Tao Lin

Comments: arxiv v0

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468] arXiv:2512.05145 [pdf, ps, other]: Title: Self-Improving VLM Judges Without Human Annotations

Authors: Inna Wanyin Lin, Yushi Hu, Shuyue Stella Li, Scott Geng, Pang Wei Koh, Luke Zettlemoyer, Tim Althoff, Marjan Ghazvininejad

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469] arXiv:2512.05140 [pdf, other]: Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation

Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)

Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470] arXiv:2512.05139 [pdf, ps, other]: Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models

Authors: Yang Xiang, Jingwen Zhong, Yige Yan, Petros Koutrakis, Eric Garshick, Meredith Franklin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471] arXiv:2512.05137 [pdf, ps, other]: Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images

Authors: Yunfei Zhang, Yizhuo He, Yuanxun Shao, Zhengtao Yao, Haoyan Xu, Junhao Dong, Zhen Yao, Zhikang Dong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472] arXiv:2512.05136 [pdf, ps, other]: Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes

Authors: Yujie Xiao, Gongzhen Tang, Deyun Zhang, Jun Li, Guangkun Nie, Haoyu Wang, Shun Huang, Tong Liu, Qinghao Zhao, Kangyin Chen, Shenda Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473] arXiv:2512.05134 [pdf, ps, other]: Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models

Authors: Zihao Wu

Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474] arXiv:2512.05132 [pdf, ps, other]: Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training

Authors: Wenshuo Wang, Fan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475] arXiv:2512.05131 [pdf, ps, other]: Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance

Authors: Tianling Xu, Shengzhe Gan, Leslie Gu, Yuelei Li, Fangneng Zhan, Hanspeter Pfister

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476] arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]: Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Authors: David Anugraha, Patrick Amadeus Irawan, Anshul Singh, En-Shiun Annie Lee, Genta Indra Winata

Comments: Preprint

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477] arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]: Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models

Authors: Haowen Liu, Shaoxiong Yao, Haonan Chen, Jiawei Gao, Jiayuan Mao, Jia-Bin Huang, Yilun Du

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478] arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]: Title: Physically-Based Simulation of Automotive LiDAR

Authors: L. Dudzik, M. Roschani, A. Sielemann, K. Trampert, J. Ziehn, J. Beyerer, C. Neumann

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479] arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]: Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma

Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)

Comments: 4 pages, 2 figures

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480] arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]: Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation

Authors: Fabian Konstantinidis, Moritz Sackmann, Ulrich Hofmann, Christoph Stiller

Comments: This work has been submitted to the IEEE for possible publication

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481] arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]: Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling

Authors: Shuai Dong, Siyuan Wang, Xingyu Liu, Zhongyu Wei

Comments: 11 pages, 6 figures. Code available at this https URL

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482] arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]: Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality

Authors: Benoit Marteau, Shaun Q. Y. Tan, Jieru Li, Andrew Hornback, Yishan Zhong, Shaunna Wang, Christian Lowson, Jason Woloff, Joshua M. Pahys, Steven W. Hwang, Coleman Hilton, May D. Wang

Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483] arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]: Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety

Authors: Ahmad Yehia, Jiseop Byeon, Tianyi Wang, Huihai Wang, Yiming Xu, Junfeng Jiao, Christian Claudel

Comments: 8 pages, 3 figures, 1 table

Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484] arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]: Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model

Authors: Kaidi Wang, Yi He, Wenhao Guan, Weijie Wu, Hongwu Ding, Xiong Zhang, Di Wu, Meng Meng, Jian Luan, Lin Li, Qingyang Hong

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025

[485] arXiv:2512.05115 [pdf, ps, other]: Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Authors: Tianqi Liu, Zhaoxi Chen, Zihao Huang, Shaocong Xu, Saining Zhang, Chongjie Ye, Bohan Li, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei Liu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486] arXiv:2512.05113 [pdf, ps, other]: Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting

Authors: Hao-Jen Chien, Yi-Chuan Huang, Chung-Ho Wu, Wei-Lun Chao, Yu-Lun Liu

Comments: WACV 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487] arXiv:2512.05112 [pdf, ps, other]: Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

Authors: Dongzhi Jiang, Renrui Zhang, Haodong Li, Zhuofan Zong, Ziyu Guo, Jun He, Claire Guo, Junyan Ye, Rongyao Fang, Weijia Li, Rui Liu, Hongsheng Li

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488] arXiv:2512.05111 [pdf, ps, other]: Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Authors: Shengyuan Ding, Xinyu Fang, Ziyu Liu, Yuhang Zang, Yuhang Cao, Xiangyu Zhao, Haodong Duan, Xiaoyi Dong, Jianze Liang, Bin Wang, Conghui He, Dahua Lin, Jiaqi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489] arXiv:2512.05110 [pdf, ps, other]: Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art

Authors: Rundong Luo, Noah Snavely, Wei-Chiu Ma

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490] arXiv:2512.05106 [pdf, ps, other]: Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Authors: Yu Zeng, Charles Ochoa, Mingyuan Zhou, Vishal M. Patel, Vitor Guizilini, Rowan McAllister

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491] arXiv:2512.05104 [pdf, ps, other]: Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation

Authors: Jiaqi Ma, Shengkai Hu, Jun Wan, Jiaxing Huang, Lefei Zhang, Salman Khan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492] arXiv:2512.05098 [pdf, ps, other]: Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards

Authors: Yuan Gao, Jin Song

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493] arXiv:2512.05091 [pdf, ps, other]: Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark

Authors: Haobo Yuan, Yueyi Sun, Yanwei Li, Tao Zhang, Xueqing Deng, Henghui Ding, Lu Qi, Anran Wang, Xiangtai Li, Ming-Hsuan Yang

Comments: Technical Report; Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494] arXiv:2512.05081 [pdf, ps, other]: Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression

Authors: Jung Yi, Wooseok Jang, Paul Hyunbin Cho, Jisu Nam, Heeji Yoon, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495] arXiv:2512.05079 [pdf, ps, other]: Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints

Authors: Minghan Zhu, Zhiyi Wang, Qihang Sun, Maani Ghaffari, Michael Posa

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496] arXiv:2512.05076 [pdf, ps, other]: Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation

Authors: Yiming Wang, Qihang Zhang, Shengqu Cai, Tong Wu, Jan Ackermann, Zhengfei Kuang, Yang Zheng, Frano Rajič, Siyu Tang, Gordon Wetzstein

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497] arXiv:2512.05060 [pdf, ps, other]: Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer

Authors: Xianfeng Wu, Yajing Bai, Minghan Li, Xianzu Wu, Xueqi Zhao, Zhongyuan Lai, Wenyu Liu, Xinggang Wang

Comments: Code: this https URL, Webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498] arXiv:2512.05044 [pdf, ps, other]: Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Authors: Yanran Zhang, Ziyi Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

Comments: 18 Pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499] arXiv:2512.05039 [pdf, ps, other]: Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding

Authors: Abhigyan Bhattacharya, Hiranmoy Roy

Comments: Submitted for review CVPR-2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500] arXiv:2512.05025 [pdf, ps, other]: Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation

Authors: Nicolas Houdré, Diego Marcos, Hugo Riffaud de Turckheim, Dino Ienco, Laurent Wendling, Camille Kurtz, Sylvain Lobry

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501] arXiv:2512.05021 [pdf, ps, other]: Title: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition

Authors: Pham Thach Thanh Truc, Dang Hoai Nam, Huynh Tong Dang Khoa, Vo Nguyen Le Duy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502] arXiv:2512.05016 [pdf, ps, other]: Title: Generative Neural Video Compression via Video Diffusion Prior

Authors: Qi Mao, Hao Cheng, Tinghan Yang, Libiao Jin, Siwei Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503] arXiv:2512.05006 [pdf, ps, other]: Title: Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects

Authors: Xianghui Fan, Zhaoyu Chen, Mengyang Pan, Anping Deng, Hang Yang

Comments: conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504] arXiv:2512.05000 [pdf, ps, other]: Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers

Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505] arXiv:2512.04996 [pdf, ps, other]: Title: A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs

Authors: Qiong Chang, Weimin Wang, Junpei Zhong, Jun Miyazaki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506] arXiv:2512.04981 [pdf, ps, other]: Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

Authors: NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507] arXiv:2512.04970 [pdf, ps, other]: Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks

Authors: Leonid Pogorelyuk, Niels Bracher, Aaron Verkleeren, Lars Kühmichel, Stefan T. Radev

Comments: UniReps Workshop 2025, 12 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508] arXiv:2512.04969 [pdf, ps, other]: Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection

Authors: NaHyeon Park, Kunhee Kim, Junsuk Choe, Hyunjung Shim

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[509] arXiv:2512.04967 [pdf, ps, other]: Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis

Authors: Jasmaine Khale, Ravi Prakash Srivastava

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[510] arXiv:2512.04963 [pdf, ps, other]: Title: GeoPE: A Unified Geometric Positional Embedding for Structured Tensors

Authors: Yupu Yao, Bowen Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[511] arXiv:2512.04952 [pdf, ps, other]: Title: FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization

Authors: Yicheng Liu, Shiduo Zhang, Zibin Dong, Baijun Ye, Tianyuan Yuan, Xiaopeng Yu, Linqi Yin, Chenhao Lu, Junhao Shi, Luca Jiang-Tao Yu, Liangtao Zheng, Tao Jiang, Jingjing Gong, Xipeng Qiu, Hang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[512] arXiv:2512.04943 [pdf, ps, other]: Title: Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition

Authors: Novanto Yudistira

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513] arXiv:2512.04939 [pdf, ps, other]: Title: LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging

Authors: Zhijian Shu, Cheng Lin, Tao Xie, Wei Yin, Ben Li, Zhiyuan Pu, Weize Li, Yao Yao, Xun Cao, Xiaoyang Guo, Xiao-Xiao Long

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514] arXiv:2512.04927 [pdf, ps, other]: Title: Virtually Unrolling the Herculaneum Papyri by Diffeomorphic Spiral Fitting

Authors: Paul Henderson

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515] arXiv:2512.04926 [pdf, ps, other]: Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Authors: Yueming Pan, Ruoyu Feng, Qi Dai, Yuqi Wang, Wenfeng Lin, Mingyu Guo, Chong Luo, Nanning Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516] arXiv:2512.04904 [pdf, ps, other]: Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching

Authors: Guanbo Huang, Jingjia Mao, Fanding Huang, Fengkai Liu, Xiangyang Luo, Yaoyuan Liang, Jiasheng Lu, Xiaoe Wang, Pei Liu, Ruiliu Fu, Shao-Lun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517] arXiv:2512.04890 [pdf, ps, other]: Title: Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI

Authors: Ramya Muthukrishnan, Borjan Gagoski, Aryn Lee, P. Ellen Grant, Elfar Adalsteinsson, Polina Golland, Benjamin Billot

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518] arXiv:2512.04888 [pdf, ps, other]: Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework

Authors: Priyanto Hidayatullah, Nurjannah Syakrani, Yudi Widhiyasana, Muhammad Rizqi Sholahuddin, Refdinal Tubagus, Zahri Al Adzani Hidayat, Hanri Fajar Ramadhan, Dafa Alfarizki Pratama, Farhan Muhammad Yasin

Comments: This manuscript was first submitted to the Engineering (Elsevier Journal). The preprint version was posted to arXiv afterwards to facilitate open access and community feedback

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519] arXiv:2512.04883 [pdf, ps, other]: Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms

Authors: Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520] arXiv:2512.04875 [pdf, ps, other]: Title: SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection

Authors: Qing Xu, Yanqian Wang, Xiangjian Hea, Yue Li, Yixuan Zhang, Rong Qu, Wenting Duan, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521] arXiv:2512.04862 [pdf, ps, other]: Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Authors: Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini, Jan Ulrich Bartels, Katherine J. Kuchenbecker, Michael J. Black

Comments: * Equal contribution. Minor figure corrections compared to the ICCV 2025 version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522] arXiv:2512.04857 [pdf, ps, other]: Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens

Authors: Ziran Qin, Youru Lv, Mingbao Lin, Zeren Zhang, Chanfan Gan, Tieyuan Chen, Weiyao Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523] arXiv:2512.04837 [pdf, ps, other]: Title: A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World

Authors: Jikang Cheng, Renye Yan, Zhiyuan Yan, Yaozhong Gan, Xueyi Zhang, Zhongyuan Wang, Wei Peng, Ling Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524] arXiv:2512.04832 [pdf, ps, other]: Title: Tokenizing Buildings: A Transformer for Layout Synthesis

Authors: Manuel Ladron de Guevara, Jinmo Rhee, Ardavan Bidgoli, Vaidas Razgaitis, Michael Bergin

Comments: 8 pages, 1 page References, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[525] arXiv:2512.04830 [pdf, ps, other]: Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis

Authors: Shijie Chen, Peixi Peng

Comments: Novel View Synthesis, Driving Scene, Free Trajectory, Image Generation

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526] arXiv:2512.04821 [pdf, ps, other]: Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation

Authors: Huynh Trinh Ngoc, Hoang Anh Nguyen Kim, Toan Nguyen Hai, Long Tran Quoc

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527] arXiv:2512.04815 [pdf, ps, other]: Title: RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS

Authors: Chuanyu Fu, Guanying Chen, Yuqi Zhang, Kunbin Yao, Yuan Xiong, Chuan Huang, Shuguang Cui, Yasuyuki Matsushita, Xiaochun Cao

Comments: arXiv admin note: substantial text overlap with arXiv:2506.02751

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528] arXiv:2512.04810 [pdf, ps, other]: Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Authors: Xin He, Longhui Wei, Jianbo Ouyang, Minghui Liao, Lingxi Xie, Qi Tian

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529] arXiv:2512.04786 [pdf, ps, other]: Title: LaFiTe: A Generative Latent Field for 3D Native Texturing

Authors: Chia-Hao Chen, Zi-Xin Zou, Yan-Pei Cao, Ze Yuan, Guan Luo, Xiaojuan Qi, Ding Liang, Song-Hai Zhang, Yuan-Chen Guo

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530] arXiv:2512.04784 [pdf, ps, other]: Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Authors: Bowen Ping, Chengyou Jia, Minnan Luo, Changliang Xia, Xin Shen, Zhuohang Dang, Hangwei Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531] arXiv:2512.04761 [pdf, ps, other]: Title: Order Matters: 3D Shape Generation from Sequential VR Sketches

Authors: Yizi Chen, Sidi Wu, Tianyi Xiao, Nina Wiedemann, Loic Landrieu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532] arXiv:2512.04734 [pdf, ps, other]: Title: MT-Depth: Multi-task Instance feature analysis for the Depth Completion

Authors: Abdul Haseeb Nizamani, Dandi Zhou, Xinhai Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533] arXiv:2512.04733 [pdf, ps, other]: Title: E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving

Authors: Yihong Tang, Haicheng Liao, Tong Nie, Junlin He, Ao Qu, Kehua Chen, Wei Ma, Zhenning Li, Lijun Sun, Chengzhong Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[534] arXiv:2512.04728 [pdf, ps, other]: Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild

Authors: Yigui Feng, Qinglin Wang, Haotian Mo, Yang Liu, Ke Liu, Gencheng Liu, Xinhai Chen, Siqi Shen, Songzhu Mei, Jie Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[535] arXiv:2512.04699 [pdf, ps, other]: Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution

Authors: Xinning Chai, Zhengxue Cheng, Yuhong Zhang, Hengsheng Zhang, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song

Comments: Accepted by TCSVT, 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536] arXiv:2512.04686 [pdf, ps, other]: Title: Towards Cross-View Point Correspondence in Vision-Language Models

Authors: Yipu Wang, Yuheng Ji, Yuyang Liu, Enshen Zhou, Ziqiang Yang, Yuxuan Tian, Ziheng Qin, Yue Liu, Huajie Tan, Cheng Chi, Zhiyuan Ma, Daniel Dajun Zeng, Xiaolong Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537] arXiv:2512.04678 [pdf, ps, other]: Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Authors: Yunhong Lu, Yanhong Zeng, Haobo Li, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jiapeng Zhu, Hengyuan Cao, Zhipeng Zhang, Xing Zhu, Yujun Shen, Min Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.04677 [pdf, ps, other]: Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Authors: Yubo Huang, Hailong Guo, Fangtai Wu, Shifeng Zhang, Shijie Huang, Qijun Gan, Lin Liu, Sirui Zhao, Enhong Chen, Jiaming Liu, Steven Hoi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539] arXiv:2512.04660 [pdf, ps, other]: Title: I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models

Authors: Juntong Wang, Jiarui Wang, Huiyu Duan, Jiaxiang Kang, Guangtao Zhai, Xiongkuo Min

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.04643 [pdf, ps, other]: Title: SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding

Authors: Chang-Hsun Wu, Kai-Po Chang, Yu-Yang Sheng, Hung-Kai Chung, Kuei-Chun Wang, Yu-Chiang Frank Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541] arXiv:2512.04619 [pdf, ps, other]: Title: Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence

Authors: Tianyu Yuan, Yuanbo Yang, Lin-Zhuo Chen, Yao Yao, Zhuzhong Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542] arXiv:2512.04599 [pdf, ps, other]: Title: Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot

Authors: Sheng Hang, Chaoxiang He, Hongsheng Hu, Hanqing Hu, Bin Benjamin Zhu, Shi-Feng Sun, Dawu Gu, Shuo Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543] arXiv:2512.04597 [pdf, ps, other]: Title: When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering

Authors: Tao Wu, Chuhao Zhou, Guangyu Zhao, Haozhi Cao, Yewen Pu, Jianfei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[544] arXiv:2512.04585 [pdf, ps, other]: Title: SAM3-I: Segment Anything with Instructions

Authors: Jingjing Li, Yue Feng, Yuchen Guo, Jincai Huang, Yongri Piao, Qi Bi, Miao Zhang, Xiaoqi Zhao, Qiang Chen, Shihao Zou, Wei Ji, Huchuan Lu, Li Cheng

Comments: Preliminary results; work in progress

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.04581 [pdf, ps, other]: Title: Infrared UAV Target Tracking with Dynamic Feature Refinement and Global Contextual Attention Knowledge Distillation

Authors: Houzhang Fang, Chenxing Wu, Kun Bai, Tianqi Chen, Xiaolin Wang, Xiyang Liu, Yi Chang, Luxin Yan

Comments: Accepted by IEEE TMM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.04576 [pdf, ps, other]: Title: TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification

Authors: Zishuo Wan, Qinqin Kang, Yi Huang, Yun Bian, Dawei Ding, Ke Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547] arXiv:2512.04568 [pdf, ps, other]: Title: Prompt2Craft: Generating Functional Craft Assemblies with LLMs

Authors: Vitor Hideyo Isume, Takuya Kiyokawa, Natsuki Yamanobe, Yukiyasu Domae, Weiwei Wan, Kensuke Harada

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.04564 [pdf, ps, other]: Title: Dataset creation for supervised deep learning-based analysis of microscopic images -- review of important considerations and recommendations

Authors: Christof A. Bertram, Viktoria Weiss, Jonas Ammeling, F. Maria Schabel, Taryn A. Donovan, Frauke Wilm, Christian Marzahl, Katharina Breininger, Marc Aubreville

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.04563 [pdf, ps, other]: Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Authors: Zefeng Zhang, Xiangzhao Hao, Hengzhu Tang, Zhenyu Zhang, Jiawei Sheng, Xiaodong Li, Zhenyang Li, Li Gao, Daiting Shi, Dawei Yin, Tingwen Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.04554 [pdf, ps, other]: Title: Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering

Authors: Marco Pintore, Maura Pintor, Dimosthenis Karatzas, Battista Biggio

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551] arXiv:2512.04542 [pdf, ps, other]: Title: Gaussian Entropy Fields: Driving Adaptive Sparsity in 3D Gaussian Optimization

Authors: Hong Kuang, Jianchen Liu

Comments: 28 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[552] arXiv:2512.04540 [pdf, ps, other]: Title: VideoMem: Enhancing Ultra-Long Video Understanding via Adaptive Memory Management

Authors: Hongbo Jin, Qingyuan Wang, Wenhao Zhang, Yang Liu, Sijie Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.04537 [pdf, ps, other]: Title: X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale

Authors: Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.04536 [pdf, ps, other]: Title: Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model

Authors: Bita Baroutian, Atefe Aghaei, Mohsen Ebrahimi Moghaddam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[555] arXiv:2512.04534 [pdf, ps, other]: Title: Refaçade: Editing Object with Given Reference Texture

Authors: Youze Huang (1), Penghui Ruan (2), Bojia Zi (3), Xianbiao Qi (4), Jianan Wang (5), Rong Xiao (4) ((1) University of Electronic Science and Technology of China, (2) The Hong Kong Polytechnic University, (3) The Chinese University of Hong Kong, (4) IntelliFusion Inc., (5) Astribot Inc.)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556] arXiv:2512.04532 [pdf, ps, other]: Title: PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement

Authors: Yu-Wei Zhan, Xin Wang, Hong Chen, Tongtong Feng, Wei Feng, Ren Wang, Guangyao Li, Qing Li, Wenwu Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[557] arXiv:2512.04528 [pdf, ps, other]: Title: Auto3R: Automated 3D Reconstruction and Scanning via Data-driven Uncertainty Quantification

Authors: Chentao Shen, Sizhe Zheng, Bingqian Wu, Yaohua Feng, Yuanchen Fei, Mingyu Mei, Hanwen Jiang, Xiangru Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[558] arXiv:2512.04522 [pdf, ps, other]: Title: Identity Clue Refinement and Enhancement for Visible-Infrared Person Re-Identification

Authors: Guoqing Zhang, Zhun Wang, Hairui Wang, Zhonglin Ye, Yuhui Zheng

Comments: 14 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[559] arXiv:2512.04521 [pdf, ps, other]: Title: WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism

Authors: Ruijing Liu, Cunhua Pan, Jiaming Zeng, Hong Ren, Kezhi Wang, Lei Kong, Jiangzhou Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[560] arXiv:2512.04520 [pdf, ps, other]: Title: Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation

Authors: Chenlin Xu, Lei Zhang, Lituan Wang, Xinyu Pu, Pengfei Ma, Guangwu Qian, Zizhou Wang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561] arXiv:2512.04519 [pdf, ps, other]: Title: VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory

Authors: Yifei Yu, Xiaoshan Wu, Xinting Hu, Tao Hu, Yangtian Sun, Xiaoyang Lyu, Bo Wang, Lin Ma, Yuewen Ma, Zhongrui Wang, Xiaojuan Qi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.04515 [pdf, ps, other]: Title: EgoLCD: Egocentric Video Generation with Long Context Diffusion

Authors: Liuzhou Zhang, Jiarui Ye, Yuanlei Wang, Ming Zhong, Mingju Cao, Wanke Xia, Bowen Zeng, Zeyu Zhang, Hao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563] arXiv:2512.04511 [pdf, ps, other]: Title: DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance

Authors: Yinghui Xing, Xiaoting Su, Shizhou Zhang, Donghao Chu, Di Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[564] arXiv:2512.04504 [pdf, ps, other]: Title: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers

Authors: Min Zhao, Bokai Yan, Xue Yang, Hongzhou Zhu, Jintao Zhang, Shilong Liu, Chongxuan Li, Jun Zhu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565] arXiv:2512.04499 [pdf, ps, other]: Title: Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

Authors: Yuduo Jin, Brandon Haworth

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[566] arXiv:2512.04496 [pdf, ps, other]: Title: Shift-Window Meets Dual Attention: A Multi-Model Architecture for Specular Highlight Removal

Authors: Tianci Huo, Lingfeng Qi, Yuhan Chen, Qihong Xue, Jinyuan Shao, Hai Yu, Jie Li, Zhanhua Zhang, Guofa Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.04487 [pdf, ps, other]: Title: Controllable Long-term Motion Generation with Extended Joint Targets

Authors: Eunjong Lee, Eunhee Kim, Sanghoon Hong, Eunho Jung, Jihoon Kim

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568] arXiv:2512.04485 [pdf, ps, other]: Title: Not All Birds Look The Same: Identity-Preserving Generation For Birds

Authors: Aaron Sun, Oindrila Saha, Subhransu Maji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.04483 [pdf, ps, other]: Title: DeRA: Decoupled Representation Alignment for Video Tokenization

Authors: Pengbo Guo, Junke Wang, Zhen Xing, Chengxu Liu, Daoguo Dong, Xueming Qian, Zuxuan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570] arXiv:2512.04461 [pdf, ps, other]: Title: UniTS: Unified Time Series Generative Model for Remote Sensing

Authors: Yuxiang Zhang, Shunlin Liang, Wenyuan Li, Han Ma, Jianglei Xu, Yichuan Ma, Jiangwei Xie, Wei Li, Mengmeng Zhang, Ran Tao, Xiang-Gen Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.04459 [pdf, ps, other]: Title: dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning

Authors: Yingzi Ma, Yulong Cao, Wenhao Ding, Shuibai Zhang, Yan Wang, Boris Ivanovic, Ming Jiang, Marco Pavone, Chaowei Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.04456 [pdf, ps, other]: Title: GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis

Authors: Changjin Kim, HyeokJun Lee, YoungJoon Yoo

Comments: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[573] arXiv:2512.04451 [pdf, ps, other]: Title: StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios

Authors: Yifei Wang, Zhenkai Li, Tianwen Qian, Huanran Zheng, Zheng Wang, Yuqian Fu, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574] arXiv:2512.04441 [pdf, ps, other]: Title: MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving

Authors: Bin Sun, Yaoguang Cao, Yan Wang, Rui Wang, Jiachen Shang, Xiejie Feng, Jiayi Lu, Jia Shi, Shichun Yang, Xiaoyu Yan, Ziying Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575] arXiv:2512.04426 [pdf, ps, other]: Title: Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation

Authors: Sidan Zhu, Hongteng Xu, Dixin Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576] arXiv:2512.04425 [pdf, ps, other]: Title: Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models

Authors: Manar Alnaasan, Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[577] arXiv:2512.04421 [pdf, ps, other]: Title: UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3D Scenes

Authors: Changhe Liu, Ehsan Javanmardi, Naren Bao, Alex Orsholits, Manabu Tsukada

Comments: 13 pages, 10 figures, submitted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[578] arXiv:2512.04413 [pdf, ps, other]: Title: Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Authors: Xiangyi Gao, Danpei Zhao, Bo Yuan, Wentao Li

Comments: 12 pages, 8 figures, 11 tables

Journal-ref: IEEE Transactions on Geoscience and Remote Sensing 63 (2025) 1-11

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579] arXiv:2512.04397 [pdf, ps, other]: Title: Performance Evaluation of Transfer Learning Based Medical Image Classification Techniques for Disease Detection

Authors: Zeeshan Ahmad, Shudi Bao, Meng Chen

Journal-ref: 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Copenhagen, Denmark, 2025, pp. 1-5

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580] arXiv:2512.04395 [pdf, ps, other]: Title: Fourier-Attentive Representation Learning: A Fourier-Guided Framework for Few-Shot Generalization in Vision-Language Models

Authors: Hieu Dinh Trung Pham, Huy Minh Nhat Nguyen, Cuong Tuan Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581] arXiv:2512.04390 [pdf, ps, other]: Title: FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring

Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

Comments: 20 pages, 15 figures. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582] arXiv:2512.04358 [pdf, ps, other]: Title: MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching

Authors: Ao Xu, Rujin Zhao, Xiong Xu, Boceng Huang, Yujia Jia, Hongfeng Long, Fuxuan Chen, Zilong Cao, Fangyuan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583] arXiv:2512.04356 [pdf, ps, other]: Title: Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment

Authors: Kai-Po Chang, Wei-Yuan Cheng, Chi-Pin Huang, Fu-En Yang, Yu-Chiang Frank Wang

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[584] arXiv:2512.04331 [pdf, ps, other]: Title: Open Set Face Forgery Detection via Dual-Level Evidence Collection

Authors: Zhongyi Cai, Bryce Gernon, Wentao Bao, Yifan Li, Matthew Wright, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585] arXiv:2512.04329 [pdf, ps, other]: Title: A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

Authors: Waleed Khalid, Dmitry Ignatov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[586] arXiv:2512.04323 [pdf, ps, other]: Title: Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks

Authors: Biao Chen, Zhenhua Lei, Yahui Zhang, Tongzhi Niu

Comments: 17 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
[587] arXiv:2512.04315 [pdf, ps, other]: Title: SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting

Authors: Yonghan Lee, Tsung-Wei Huang, Shiv Gehlot, Jaehoon Choi, Guan-Ming Su, Dinesh Manocha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588] arXiv:2512.04314 [pdf, ps, other]: Title: DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision

Authors: Jiashu Liao, Pietro Liò, Marc de Kamps, Duygu Sarikaya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589] arXiv:2512.04313 [pdf, ps, other]: Title: Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding

Authors: Haolin Xiong, Tianwen Fu, Pratusha Bhuvana Prasad, Yunxuan Cai, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, Yajie Zhao

Comments: 16 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590] arXiv:2512.04311 [pdf, ps, other]: Title: Real-time Cricket Sorting By Sex

Authors: Juan Manuel Cantarero Angulo, Matthew Smith

Comments: 13 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[591] arXiv:2512.04309 [pdf, ps, other]: Title: Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction

Authors: Rui Fonseca, Bruno Martins, Gil Rocha

Comments: Submitted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[592] arXiv:2512.04305 [pdf, ps, other]: Title: How (Mis)calibrated is Your Federated CLIP and What To Do About It?

Authors: Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Elisa Ricci, Subhankar Roy

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[593] arXiv:2512.04303 [pdf, ps, other]: Title: Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications

Authors: Gasser Elazab, Maximilian Jansen, Michael Unterreiner, Olaf Hellwich

Comments: Accepted in 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[594] arXiv:2512.04284 [pdf, ps, other]: Title: Learning Single-Image Super-Resolution in the JPEG Compressed Domain

Authors: Sruthi Srinivasan, Elham Shakibapour, Rajy Rawther, Mehdi Saeedi

Comments: 7 pages, 4 figures, 2 tables, SEEDS Workshop, ICIP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[595] arXiv:2512.04283 [pdf, ps, other]: Title: Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint

Authors: Fan Jia, Yuhao Huang, Shih-Hsin Wang, Cristina Garcia-Cardona, Andrea L. Bertozzi, Bao Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[596] arXiv:2512.04282 [pdf, ps, other]: Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer

Authors: Tasmiah Haque, Srinjoy Das

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[597] arXiv:2512.04267 [pdf, ps, other]: Title: UniLight: A Unified Representation for Lighting

Authors: Zitian Zhang, Iliyan Georgiev, Michael Fischer, Yannick Hold-Geoffroy, Jean-François Lalonde, Valentin Deschaintre

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598] arXiv:2512.04248 [pdf, ps, other]: Title: MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models

Authors: Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[599] arXiv:2512.04238 [pdf, ps, other]: Title: 6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models

Authors: Leon Mayer, Piotr Kalinowski, Caroline Ebersbach, Marcel Knopp, Tim Rädsch, Evangelia Christodoulou, Annika Reinke, Fiona R. Kolbinger, Lena Maier-Hein

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600] arXiv:2512.04222 [pdf, ps, other]: Title: ReasonX: MLLM-Guided Intrinsic Image Decomposition

Authors: Alara Dirik, Tuanfeng Wang, Duygu Ceylan, Stefanos Zafeiriou, Anna Frühstück

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[601] arXiv:2512.04221 [pdf, ps, other]: Title: MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis

Authors: Xiangyu Bai, He Liang, Bishoy Galoaa, Utsav Nandi, Shayda Moezzi, Yuhang He, Sarah Ostadabbas

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602] arXiv:2512.04219 [pdf, ps, other]: Title: Generalized Event Partonomy Inference with Structured Hierarchical Predictive Learning

Authors: Zhou Chen, Joe Lin, Sathyanarayanan N. Aakur

Comments: 16 pages, 7 figures, 3 tables. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[603] arXiv:2512.04187 [pdf, ps, other]: Title: OnSight Pathology: A real-time platform-agnostic computational pathology companion for histopathology

Authors: Jinzhen Hu, Kevin Faust, Parsa Babaei Zadeh, Adrienn Bourkas, Shane Eaton, Andrew Young, Anzar Alvi, Dimitrios George Oreopoulos, Ameesha Paliwal, Assem Saleh Alrumeh, Evelyn Rose Kamski-Hennekam, Phedias Diamandis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[604] arXiv:2512.04175 [pdf, ps, other]: Title: Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection

Authors: Alejandro Cobo, Roberto Valle, José Miguel Buenaposada, Luis Baumela

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[605] arXiv:2512.05117 (cross-list from cs.LG) [pdf, ps, other]: Title: The Universal Weight Subspace Hypothesis

Authors: Prakhar Kaushik, Shravan Chaudhari, Ankit Vaidya, Rama Chellappa, Alan Yuille

Comments: 37 pages

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606] arXiv:2512.05116 (cross-list from cs.LG) [pdf, ps, other]: Title: Value Gradient Guidance for Flow Matching Alignment

Authors: Zhen Liu, Tim Z. Xiao, Carles Domingo-Enrich, Weiyang Liu, Dinghuai Zhang

Comments: Accepted at NeurIPS 2025; 26 pages, 20 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[607] arXiv:2512.05114 (cross-list from cs.LG) [pdf, ps, other]: Title: Deep infant brain segmentation from multi-contrast MRI

Authors: Malte Hoffmann, Lilla Zöllei, Adrian V. Dalca

Comments: 8 pages, 8 figures, 1 table, website at this https URL, presented at the 2025 IEEE Asilomar Conference on Signals, Systems, and Computers

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[608] arXiv:2512.05103 (cross-list from cs.LG) [pdf, ps, other]: Title: TV2TV: A Unified Framework for Interleaved Language and Video Generation

Authors: Xiaochuang Han, Youssef Emad, Melissa Hall, John Nguyen, Karthik Padthe, Liam Robbins, Amir Bar, Delong Chen, Michal Drozdzal, Maha Elbayad, Yushi Hu, Shang-Wen Li, Sreya Dutta Roy, Jakob Verbeek, XuDong Wang, Marjan Ghazvininejad, Luke Zettlemoyer, Emily Dinan

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[609] arXiv:2512.05094 (cross-list from cs.RO) [pdf, ps, other]: Title: From Generated Human Videos to Physically Plausible Robot Trajectories

Authors: James Ni, Zekai Wang, Wei Lin, Amir Bar, Yann LeCun, Trevor Darrell, Jitendra Malik, Roei Herzig

Comments: For project website, see this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610] arXiv:2512.04814 (cross-list from cs.SD) [pdf, ps, other]: Title: Shared Multi-modal Embedding Space for Face-Voice Association

Authors: Christopher Simic, Korbinian Riedhammer, Tobias Bocklet

Comments: Ranked 1st in Fame 2026 Challenge, ICASSP

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[611] arXiv:2512.04763 (cross-list from cs.LG) [pdf, ps, other]: Title: MemLoRA: Distilling Expert Adapters for On-Device Memory Systems

Authors: Massimo Bini, Ondrej Bohdal, Umberto Michieli, Zeynep Akata, Mete Ozay, Taha Ceritli

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612] arXiv:2512.04705 (cross-list from cs.CC) [pdf, ps, other]: Title: Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators

Authors: Alaa Zniber, Arne Symons, Ouassim Karrakchou, Marian Verhelst, Mounir Ghogho

Comments: Submitted to IEEE Transactions on Emerging Topics in Computing

Subjects: Computational Complexity (cs.CC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
[613] arXiv:2512.04625 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective

Authors: Bowen Zheng, Ran Cheng

Comments: Accepted to IEEE TNNLS

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[614] arXiv:2512.04556 (cross-list from cs.GR) [pdf, ps, other]: Title: Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

Authors: Zhizhen Wu, Zhe Cao, Yuchi Huo

Comments: 10 pages, 7 figures

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[615] arXiv:2512.04464 (cross-list from cs.LG) [pdf, ps, other]: Title: Feature Engineering vs. Deep Learning for Automated Coin Grading: A Comparative Study on Saint-Gaudens Double Eagles

Authors: Tanmay Dogra, Eric Ngo, Mohammad Alam, Jean-Paul Talavera, Asim Dahal

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[616] arXiv:2512.04385 (cross-list from cs.LG) [pdf, ps, other]: Title: STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Grained Pollution Forecasting

Authors: Nan Zhou, Weijie Hong, Huandong Wang, Jianfeng Zheng, Qiuhua Wang, Yali Song, Xiao-Ping Zhang, Yong Li, Xinlei Chen

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617] arXiv:2512.04264 (cross-list from cs.LG) [pdf, ps, other]: Title: Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

Authors: Long Dang, Thushari Hapuarachchi, Kaiqi Xiong, Jing Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[618] arXiv:2512.04092 (cross-list from physics.soc-ph) [pdf, ps, other]: Title: The changing surface of the world's roads

Authors: Sukanya Randhawa, Guntaj Randhawa, Clemens Langer, Francis Andorful, Benjamin Herfort, Daniel Kwakye, Omer Olchik, Sven Lautenbach, Alexander Zipf

Subjects: Physics and Society (physics.soc-ph); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[619] arXiv:2512.04087 (cross-list from q-bio.NC) [pdf, ps, other]: Title: Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o

Authors: Sui He, Shenbin Qian

Subjects: Neurons and Cognition (q-bio.NC); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Thu, 4 Dec 2025

[620] arXiv:2512.04085 [pdf, ps, other]: Title: Unique Lives, Shared World: Learning from Single-Life Videos

Authors: Tengda Han, Sayna Ebrahimi, Dilara Gokay, Li Yang Ku, Maks Ovsjanikov, Iva Babukova, Daniel Zoran, Viorica Patraucean, Joao Carreira, Andrew Zisserman, Dima Damen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621] arXiv:2512.04084 [pdf, ps, other]: Title: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows

Authors: Qinyu Zhao, Guangting Zheng, Tao Yang, Rui Zhu, Xingjian Leng, Stephen Gould, Liang Zheng

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622] arXiv:2512.04082 [pdf, ps, other]: Title: PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Authors: Jiazhe Wei, Ken Li, Tianyu Lao, Haofan Wang, Liang Wang, Caifeng Shan, Chenyang Si

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[623] arXiv:2512.04069 [pdf, ps, other]: Title: SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

Authors: Siyi Chen, Mikaela Angelina Uy, Chan Hee Song, Faisal Ladhak, Adithyavairavan Murali, Qing Qu, Stan Birchfield, Valts Blukis, Jonathan Tremblay

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[624] arXiv:2512.04048 [pdf, ps, other]: Title: Stable Signer: Hierarchical Sign Language Generative Model

Authors: Sen Fang, Yalin Feng, Hongbin Zhong, Yanxin Zhang, Dimitris N. Metaxas

Comments: 12 pages, 7 figures. More Demo at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY)
[625] arXiv:2512.04040 [pdf, ps, other]: Title: RELIC: Interactive Video World Model with Long-Horizon Memory

Authors: Yicong Hong, Yiqun Mei, Chongjian Ge, Yiran Xu, Yang Zhou, Sai Bi, Yannick Hold-Geoffroy, Mike Roberts, Matthew Fisher, Eli Shechtman, Kalyan Sunkavalli, Feng Liu, Zhengqi Li, Hao Tan

Comments: 22 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[626] arXiv:2512.04039 [pdf, ps, other]: Title: Fast & Efficient Normalizing Flows and Applications of Image Generative Models

Authors: Sandeep Nagar

Comments: PhD Thesis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[627] arXiv:2512.04025 [pdf, ps, other]: Title: PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation

Authors: Xiaolong Li, Youping Gu, Xi Lin, Weijie Wang, Bohan Zhuang

Comments: Tech report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[628] arXiv:2512.04021 [pdf, ps, other]: Title: C3G: Learning Compact 3D Representations with 2K Gaussians

Authors: Honggyu An, Jaewoo Jung, Mungyeom Kim, Sunghwan Hong, Chaehyun Kim, Kazumi Fukuda, Minkyeong Jeon, Jisang Han, Takuya Narihira, Hyuna Ko, Junsu Kim, Yuki Mitsufuji, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629] arXiv:2512.04019 [pdf, ps, other]: Title: Ultra-lightweight Neural Video Representation Compression

Authors: Ho Man Kwan, Tianhao Peng, Ge Gao, Fan Zhang, Mike Nilsson, Andrew Gower, David Bull

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[630] arXiv:2512.04015 [pdf, ps, other]: Title: Learning Group Actions In Disentangled Latent Image Representations

Authors: Farhana Hossain Swarnali, Miaomiao Zhang, Tonmoy Hossain

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[631] arXiv:2512.04012 [pdf, ps, other]: Title: Emergent Outlier View Rejection in Visual Geometry Grounded Transformers

Authors: Jisang Han, Sunghwan Hong, Jaewoo Jung, Wooseok Jang, Honggyu An, Qianqian Wang, Seungryong Kim, Chen Feng

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[632] arXiv:2512.04007 [pdf, ps, other]: Title: On the Temporality for Sketch Representation Learning

Authors: Marcelo Isaias de Moraes Junior, Moacir Antonelli Ponti

Comments: Preprint submitted to Pattern Recognition Letters

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[633] arXiv:2512.04000 [pdf, ps, other]: Title: Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding

Authors: Jialuo Li, Bin Li, Jiahao Li, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[634] arXiv:2512.03996 [pdf, ps, other]: Title: Highly Efficient Test-Time Scaling for T2I Diffusion Models with Text Embedding Perturbation

Authors: Hang Xu, Linjiang Huang, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[635] arXiv:2512.03992 [pdf, ps, other]: Title: DIQ-H: Evaluating Hallucination Persistence in VLMs Under Temporal Visual Degradation

Authors: Zexin Lin, Hawen Wan, Yebin Zhong, Xiaoqiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[636] arXiv:2512.03981 [pdf, ps, other]: Title: DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment

Authors: Sheng-Hao Liao, Shang-Fu Chen, Tai-Ming Huang, Wen-Huang Cheng, Kai-Lung Hua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[637] arXiv:2512.03979 [pdf, ps, other]: Title: BlurDM: A Blur Diffusion Model for Image Deblurring

Authors: Jin-Ting He, Fu-Jen Tsai, Yan-Tsung Peng, Min-Hung Chen, Chia-Wen Lin, Yen-Yu Lin

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[638] arXiv:2512.03964 [pdf, ps, other]: Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization

Authors: Lianyu Pang, Ji Zhou, Qiping Wang, Baoquan Zhao, Zhenguo Yang, Qing Li, Xudong Mao

Comments: 17 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[639] arXiv:2512.03963 [pdf, ps, other]: Title: TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning

Authors: Tao Wu, Li Yang, Gen Zhan, Yabin Zhang, Yiting Liao, Junlin Li, Deliang Fu, Li Zhang, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[640] arXiv:2512.03939 [pdf, ps, other]: Title: MUT3R: Motion-aware Updating Transformer for Dynamic 3D Reconstruction

Authors: Guole Shen, Tianchen Deng, Xingrui Qin, Nailin Wang, Jianyu Wang, Yanbo Wang, Yongtao Chen, Hesheng Wang, Jingchuan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[641] arXiv:2512.03932 [pdf, ps, other]: Title: Beyond the Ground Truth: Enhanced Supervision for Image Restoration

Authors: Donghun Ryou, Inju Ha, Sanghyeok Chu, Bohyung Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[642] arXiv:2512.03918 [pdf, ps, other]: Title: UniMo: Unifying 2D Video and 3D Human Motion with an Autoregressive Framework

Authors: Youxin Pang, Yong Zhang, Ruizhi Shao, Xiang Deng, Feng Gao, Xu Xiaoming, Xiaoming Wei, Yebin Liu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[643] arXiv:2512.03905 [pdf, ps, other]: Title: Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence

Authors: Shuai Yang, Junxin Lin, Yifan Zhou, Ziwei Liu, Chen Change Loy

Comments: Code: this https URL, Project: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[644] arXiv:2512.03883 [pdf, ps, other]: Title: Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy

Authors: Jorge Tapias Gomez, Despoina Kanata, Aneesh Rangnekar, Christina Lee, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini Veeraraghavan

Comments: 6 pages, 5 figures, 1 table, submitted to ISBI conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[645] arXiv:2512.03869 [pdf, ps, other]: Title: An Automated Framework for Large-Scale Graph-Based Cerebrovascular Analysis

Authors: Daniele Falcetta, Liane S. Canas, Lorenzo Suppa, Matteo Pentassuglia, Jon Cleary, Marc Modat, Sébastien Ourselin, Maria A. Zuluaga

Comments: Submitted to ISBI 2026. 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[646] arXiv:2512.03862 [pdf, ps, other]: Title: Diminishing Returns in Self-Supervised Learning

Authors: Oli Bridge, Huey Sun, Botond Branyicskai-Nagy, Charles D'Ornano, Shomit Basu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[647] arXiv:2512.03854 [pdf, ps, other]: Title: Prostate biopsy whole slide image dataset from an underrepresented Middle Eastern population

Authors: Peshawa J. Muhammad Ali, Navin Vincent, Saman S. Abdulla, Han N. Mohammed Fadhl, Anders Blilie, Kelvin Szolnoky, Julia Anna Mielcarz, Xiaoyi Ji, Kimmo Kartasalo, Abdulbasit K. Al-Talabani, Nita Mulliqi

Comments: 13 pages, 2 figures and 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[648] arXiv:2512.03852 [pdf, ps, other]: Title: Traffic Image Restoration under Adverse Weather via Frequency-Aware Mamba

Authors: Liwen Pan, Longguang Wang, Guangwei Gao, Jun Wang, Jun Shi, Juncheng Li

Comments: 12 pages, 13 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[649] arXiv:2512.03848 [pdf, ps, other]: Title: PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation

Authors: Hania Ghouse, Maryam Alsharqi, Farhad R. Nezami, Muzammil Behzad

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[650] arXiv:2512.03844 [pdf, ps, other]: Title: CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation

Authors: Letian Zhou, Songhua Liu, Xinchao Wang

Comments: 34 pages, 24 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[651] arXiv:2512.03837 [pdf, ps, other]: Title: Heatmap Pooling Network for Action Recognition from RGB Videos

Authors: Mengyuan Liu, Jinfu Liu, Yongkang Jiang, Bin He

Comments: Final Version of IEEE Transactions on Pattern Analysis and Machine Intelligence

Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[652] arXiv:2512.03834 [pdf, ps, other]: Title: Lean Unet: A Compact Model for Image Segmentation

Authors: Ture Hassler, Ida Åkerholm, Marcus Nordström, Gabriele Balletti, Orcun Goksel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[653] arXiv:2512.03827 [pdf, ps, other]: Title: A Robust Camera-based Method for Breath Rate Measurement

Authors: Alexey Protopopov

Comments: 9 pages, 4 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[654] arXiv:2512.03817 [pdf, ps, other]: Title: HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

Authors: Ahmed Nasser, Marwan Mohamed, Alaa Sherif, Basmala Mahmoud, Shereen Yehia, Asmaa Saad, Mariam S. El-Rahmany, Ensaf H. Mohamed

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[655] arXiv:2512.03796 [pdf, ps, other]: Title: LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling

Authors: Hong-Kai Zheng, Piji Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[656] arXiv:2512.03794 [pdf, ps, other]: Title: AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

Authors: Zichuan Lin, Yicheng Liu, Yang Yang, Lvfang Tao, Deheng Ye

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[657] arXiv:2512.03751 [pdf, ps, other]: Title: Research on Brain Tumor Classification Method Based on Improved ResNet34 Network

Authors: Yufeng Li, Wenchao Zhao, Bo Dang, Weimin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[658] arXiv:2512.03749 [pdf, ps, other]: Title: Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Authors: Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[659] arXiv:2512.03746 [pdf, ps, other]: Title: Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Authors: Zirun Guo, Minjie Hong, Feng Zhang, Kai Jia, Tao Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[660] arXiv:2512.03745 [pdf, ps, other]: Title: Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Jiaze Li, Yan Lu, Bin Liu, Guojun Yin, Mang Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661] arXiv:2512.03730 [pdf, ps, other]: Title: Out-of-the-box: Black-box Causal Attacks on Object Detectors

Authors: Melane Navaratnarajah, David A. Kelly, Hana Chockler

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[662] arXiv:2512.03724 [pdf, ps, other]: Title: PosA-VLA: Enhancing Action Generation via Pose-Conditioned Anchor Attention

Authors: Ziwen Li, Xin Wang, Hanlue Zhang, Runnan Chen, Runqi Lin, Xiao He, Han Huang, Yandong Guo, Fakhri Karray, Tongliang Liu, Mingming Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[663] arXiv:2512.03715 [pdf, ps, other]: Title: DINO-RotateMatch: A Rotation-Aware Deep Framework for Robust Image Matching in Large-Scale 3D Reconstruction

Authors: Kaichen Zhang, Tianxiang Sheng, Xuanming Shi

Comments: 9 pages, 5 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664] arXiv:2512.03701 [pdf, ps, other]: Title: Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images

Authors: Paula Seidler, Neill D. F. Campbell, Ivor J A Simpson

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665] arXiv:2512.03687 [pdf, ps, other]: Title: Active Visual Perception: Opportunities and Challenges

Authors: Yian Li, Xiaoyu Guo, Hao Zhang, Shuiwang Li, Xiaowei Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[666] arXiv:2512.03683 [pdf, ps, other]: Title: GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

Authors: Melis Ocal, Xiaoyan Xing, Yue Li, Ngo Anh Vien, Sezer Karaoglu, Theo Gevers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[667] arXiv:2512.03673 [pdf, ps, other]: Title: ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers

Authors: Feice Huang, Zuliang Han, Xing Zhou, Yihuang Chen, Lifei Zhu, Haoqian Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[668] arXiv:2512.03667 [pdf, ps, other]: Title: Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning

Authors: Ge-Peng Ji, Jingyi Liu, Deng-Ping Fan, Nick Barnes

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669] arXiv:2512.03666 [pdf, ps, other]: Title: ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos

Authors: Qi'ao Xu, Tianwen Qian, Yuqian Fu, Kailing Li, Yang Jiao, Jiacheng Zhang, Xiaoling Wang, Liang He

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[670] arXiv:2512.03663 [pdf, ps, other]: Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Authors: Salim Khazem

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[671] arXiv:2512.03643 [pdf, ps, other]: Title: Optical Context Compression Is Just (Bad) Autoencoding

Authors: Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[672] arXiv:2512.03640 [pdf, ps, other]: Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms

Authors: Jiahao Zhang, Xiao Zhao, Guangyu Gao

Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[673] arXiv:2512.03625 [pdf, ps, other]: Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Authors: Zhigang Yang, Yuan Liu, Jiawei Zhang, Puning Zhang, Xinqiang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674] arXiv:2512.03621 [pdf, ps, other]: Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675] arXiv:2512.03619 [pdf, ps, other]: Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676] arXiv:2512.03601 [pdf, ps, other]: Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Authors: Haoran Zhou, Gim Hee Lee

Comments: Accepted to NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677] arXiv:2512.03598 [pdf, ps, other]: Title: Memory-Guided Point Cloud Completion for Dental Reconstruction

Authors: Jianan Sun, Yukang Huang, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678] arXiv:2512.03597 [pdf, ps, other]: Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation

Authors: Fuchen Zheng, Xinyi Chen, Weixuan Li, Quanjun Li, Junhua Zhou, Xiaojiao Guo, Xuhang Chen, Chi-Man Pun, Shoujun Zhou

Comments: 6 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[679] arXiv:2512.03593 [pdf, ps, other]: Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures

Authors: David Svitov, Pietro Morerio, Lourdes Agapito, Alessio Del Bue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[680] arXiv:2512.03592 [pdf, ps, other]: Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Authors: Guang Yang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681] arXiv:2512.03590 [pdf, ps, other]: Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation

Authors: Yuchen Deng, Xiuyang Wu, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682] arXiv:2512.03580 [pdf, ps, other]: Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Authors: Malte Bleeker, Mauro Gotsch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[683] arXiv:2512.03577 [pdf, ps, other]: Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Authors: Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684] arXiv:2512.03575 [pdf, ps, other]: Title: UniComp: Rethinking Video Compression Through Informational Uniqueness

Authors: Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[685] arXiv:2512.03574 [pdf, ps, other]: Title: Global-Local Aware Scene Text Editing

Authors: Fuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686] arXiv:2512.03566 [pdf, ps, other]: Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Authors: Hao Sun, Lei Fan, Donglin Di, Shaohui Liu

Comments: Accepted by ACM MM Asia 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[687] arXiv:2512.03558 [pdf, ps, other]: Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding

Authors: Huy Quang Ung, Guillaume Habault, Yasutaka Nishimura, Hao Niu, Roberto Legaspi, Tomoki Oya, Ryoichi Kojima, Masato Taya, Chihiro Ono, Atsunori Minamikawa, Yan Liu

Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[688] arXiv:2512.03553 [pdf, ps, other]: Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Authors: Wei Chee Yew, Hailun Xu, Sanjay Saha, Xiaotian Fan, Hiok Hian Ong, David Yuchen Wang, Kanchan Sarkar, Zhenheng Yang, Danhui Guan

Comments: Accepted at KDD 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[689] arXiv:2512.03542 [pdf, ps, other]: Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention

Authors: Nan Sun, Zhenyu Zhang, Xixun Lin, Kun Wang, Yanmin Shang, Naibin Gu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang, Yanan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[690] arXiv:2512.03540 [pdf, ps, other]: Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Authors: Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng

Comments: Accepted by ACM Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[691] arXiv:2512.03534 [pdf, ps, other]: Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Authors: Subin Kim, Sangwoo Mo, Mamshad Nayeem Rizve, Yiran Xu, Difan Liu, Jinwoo Shin, Tobias Hinz

Comments: Visualizations are available at the website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[692] arXiv:2512.03532 [pdf, ps, other]: Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

Authors: Zhishan Zhou, Siyuan Wei, Zengran Wang, Chunjie Wang, Xiaosheng Yan, Xiao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693] arXiv:2512.03520 [pdf, ps, other]: Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

Authors: Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694] arXiv:2512.03510 [pdf, ps, other]: Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving

Authors: Zhijian Qiao, Zehuan Yu, Tong Li, Chih-Chung Chou, Wenchao Ding, Shaojie Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[695] arXiv:2512.03509 [pdf, ps, other]: Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model

Authors: Kwaku Opoku-Ware, Gideon Opoku

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696] arXiv:2512.03508 [pdf, ps, other]: Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Authors: Seogkyu Jeon, Kibeom Hong, Hyeran Byun

Comments: ICCV 2025 (poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[697] arXiv:2512.03500 [pdf, ps, other]: Title: EEA: Exploration-Exploitation Agent for Long Video Understanding

Authors: Te Yang, Xiangyu Zhu, Bo Wang, Quan Chen, Peng Jiang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698] arXiv:2512.03499 [pdf, ps, other]: Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

Authors: Renqi Chen, Haoyang Su, Shixiang Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[699] arXiv:2512.03479 [pdf, ps, other]: Title: Towards Object-centric Understanding for Instructional Videos

Authors: Wenliang Guo, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[700] arXiv:2512.03477 [pdf, ps, other]: Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

Authors: Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Comments: 10 pages, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[701] arXiv:2512.03474 [pdf, ps, other]: Title: Procedural Mistake Detection via Action Effect Modeling

Authors: Wenliang Guo, Yujiang Pu, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[702] arXiv:2512.03470 [pdf, ps, other]: Title: Difference Decomposition Networks for Infrared Small Target Detection

Authors: Chen Hu, Mingyu Zhou, Shuai Yuan, Hongbo Hu, Xiangyu Qiu, Junhai Luo, Tian Pu, Xiyin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703] arXiv:2512.03463 [pdf, ps, other]: Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models

Authors: Shojiro Yamabe, Futa Waseda, Daiki Shiono, Tsubasa Takahashi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[704] arXiv:2512.03454 [pdf, ps, other]: Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

Authors: Haicheng Liao, Huanming Shen, Bonan Wang, Yongkang Li, Yihong Tang, Chengyue Wang, Dingyi Zhuang, Kehua Chen, Hai Yang, Chengzhong Xu, Zhenning Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[705] arXiv:2512.03453 [pdf, ps, other]: Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model

Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[706] arXiv:2512.03451 [pdf, ps, other]: Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Authors: Zhiye Song, Steve Dai, Ben Keller, Brucek Khailany

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[707] arXiv:2512.03450 [pdf, ps, other]: Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[708] arXiv:2512.03449 [src]: Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis

Authors: Tongxu Zhang

Comments: The manuscript represents only a preliminary and substantially incomplete exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore this version is withdrawn and should not be cited

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[709] arXiv:2512.03445 [pdf, ps, other]: Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation

Authors: Xieji Li, Siyuan Yan, Yingsheng Liu, H. Peter Soyer, Monika Janda, Victoria Mar, Zongyuan Ge

Comments: 10 pages. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[710] arXiv:2512.03430 [pdf, ps, other]: Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features

Authors: Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[711] arXiv:2512.03427 [pdf, ps, other]: Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[712] arXiv:2512.03424 [pdf, ps, other]: Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding

Authors: Bin Liu, Chunyang Wang, Xuelian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[713] arXiv:2512.03418 [pdf, ps, other]: Title: YOLOA: Real-Time Affordance Detection via LLM Adapter

Authors: Yuqi Ji, Junjie Ke, Lihuo He, Jun Liu, Kaifan Zhang, Yu-Kun Lai, Guiguang Ding, Xinbo Gao

Comments: 13 pages, 9 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[714] arXiv:2512.03405 [pdf, ps, other]: Title: ViDiC: Video Difference Captioning

Authors: Jiangtao Wu, Shihao Li, Zhaozhou Bian, Jialu Chen, Runzhe Wen, An Ping, Yiwen He, Jiakai Wang, Yuanxing Zhang, Jiaheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[715] arXiv:2512.03404 [pdf, ps, other]: Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification

Authors: Yujian Zhao, Hankun Liu, Guanglin Niu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[716] arXiv:2512.03370 [pdf, ps, other]: Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

Authors: Lingjun Zhao, Yandong Luo, James Hay, Lu Gan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[717] arXiv:2512.03369 [pdf, ps, other]: Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting

Authors: Nan Zhou, Huandong Wang, Jiahao Li, Han Li, Yali Song, Qiuhua Wang, Yong Li, Xinlei Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[718] arXiv:2512.03359 [pdf, ps, other]: Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM

Authors: Md Rashidul Islam, Bakary Gibba, Altagi Abdallah Bakheit Abdelgadir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[719] arXiv:2512.03350 [pdf, ps, other]: Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Authors: Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[720] arXiv:2512.03346 [pdf, ps, other]: Title: Hierarchical Attention for Sparse Volumetric Anomaly Detection in Subclinical Keratoconus

Authors: Lynn Kandakji, William Woof, Nikolas Pontikos

Comments: 16 pages, 7 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[721] arXiv:2512.03345 [pdf, ps, other]: Title: HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration

Authors: Seunghoi Kim, Henry F. J. Tregidgo, Chen Jin, Matteo Figini, Daniel C. Alexander

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[722] arXiv:2512.03339 [pdf, ps, other]: Title: ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography

Authors: Yeganeh Ghamary, Victoria Wu, Hooman Vaseli, Christina Luong, Teresa Tsang, Siavash Bigdeli, Purang Abolmaesumi

Comments: 11 pages, Accepted in IMIMIC Workshop at MICCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[723] arXiv:2512.03335 [pdf, ps, other]: Title: Step-by-step Layered Design Generation

Authors: Faizan Farooq Khan, K J Joseph, Koustava Goswami, Mohamed Elhoseiny, Balaji Vasan Srinivasan

Journal-ref: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[724] arXiv:2512.03317 [pdf, ps, other]: Title: NavMapFusion: Diffusion-based Fusion of Navigation Maps for Online Vectorized HD Map Construction

Authors: Thomas Monninger, Zihan Zhang, Steffen Staab, Sihao Ding

Comments: Accepted to 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[725] arXiv:2512.03284 [pdf, ps, other]: Title: SpatialReasoner: Active Perception for Large-Scale 3D Scene Understanding

Authors: Hongpei Zheng, Shijie Li, Yanran Li, Hujun Yin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[726] arXiv:2512.03257 [pdf, ps, other]: Title: PyroFocus: A Deep Learning Approach to Real-Time Wildfire Detection in Multispectral Remote Sensing Imagery

Authors: Mark Moussa, Andre Williams, Seth Roffe, Douglas Morton

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[727] arXiv:2512.03247 [pdf, ps, other]: Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement

Authors: Haitian Zheng, Yuan Yao, Yongsheng Yu, Yuqian Zhou, Jiebo Luo, Zhe Lin

Comments: Published in the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[728] arXiv:2512.03245 [pdf, ps, other]: Title: 2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition

Authors: Liying Lu, Raphaël Achddou, Sabine Süsstrunk

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[729] arXiv:2512.03237 [pdf, ps, other]: Title: LLM-Guided Material Inference for 3D Point Clouds

Authors: Nafiseh Izadyar, Teseo Schneider

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[730] arXiv:2512.03233 [pdf, ps, other]: Title: Object Counting with GPT-4o and GPT-5: A Comparative Study

Authors: Richard Füzesséry, Kaziwa Saleh, Sándor Szénási, Zoltán Vámossy

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[731] arXiv:2512.03210 [pdf, ps, other]: Title: Flux4D: Flow-based Unsupervised 4D Reconstruction

Authors: Jingkang Wang, Henry Che, Yun Chen, Ze Yang, Lily Goli, Sivabalan Manivasagam, Raquel Urtasun

Comments: NeurIPS 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[732] arXiv:2512.03199 [pdf, ps, other]: Title: Does Head Pose Correction Improve Biometric Facial Recognition?

Authors: Justin Norman, Hany Farid

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[733] arXiv:2512.03182 [pdf, ps, other]: Title: Drainage: A Unifying Framework for Addressing Class Uncertainty

Authors: Yasser Taha, Grégoire Montavon, Nils Körber

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[734] arXiv:2512.03126 [pdf, ps, other]: Title: Hierarchical Process Reward Models are Symbolic Vision Learners

Authors: Shan Zhang, Aotian Chen, Kai Zou, Jindong Gu, Yuan Xue, Anton van den Hengel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[735] arXiv:2512.04076 (cross-list from cs.GR) [pdf, ps, other]: Title: Radiance Meshes for Volumetric Reconstruction

Authors: Alexander Mai, Trevor Hedstrom, George Kopanas, Janne Kontkanen, Falko Kuester, Jonathan T. Barron

Comments: Website: half-potato.gitlab.io/rm

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[736] arXiv:2512.04032 (cross-list from cs.CL) [pdf, ps, other]: Title: Jina-VLM: Small Multilingual Vision Language Model

Authors: Andreas Koukounas, Georgios Mastrapas, Florian Hönicke, Sedigheh Eslami, Guillaume Roncari, Scott Martens, Han Xiao

Comments: 18 pages, 1-7 main content, 13-18 appendix for tables and dataset

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[737] arXiv:2512.03995 (cross-list from cs.RO) [pdf, ps, other]: Title: Artificial Microsaccade Compensation: Stable Vision for an Ornithopter

Authors: Levi Burner, Guido de Croon, Yiannis Aloimonos

Comments: 29 pages, 5 figures, 2 tables, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[738] arXiv:2512.03962 (cross-list from eess.IV) [pdf, ps, other]: Title: Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction

Authors: Evan Bell, Shijun Liang, Ismail Alkhouri, Saiprasad Ravishankar

Comments: 6 pages, 8 figures, 2025 Asilomar Conference on Signals, Systems, and Computers. Code is available at github.com/evanbell02/Tada-DIP/

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[739] arXiv:2512.03656 (cross-list from cs.LG) [pdf, ps, other]: Title: Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting

Authors: Salim Khazem, Houssam Kanso

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[740] arXiv:2512.03556 (cross-list from cs.RO) [pdf, ps, other]: Title: RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL

Authors: Yinzhou Tang, Yu Shang, Yinuo Chen, Bingwen Wei, Xin Zhang, Shu'ang Yu, Liangzhi Shi, Chao Yu, Chen Gao, Wei Wu, Yong Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[741] arXiv:2512.03522 (cross-list from cs.RO) [pdf, ps, other]: Title: MSG-Loc: Multi-Label Likelihood-based Semantic Graph Matching for Object-Level Global Localization

Authors: Gihyeon Lee, Jungwoo Lee, Juwon Kim, Young-Sik Shin, Younggun Cho

Comments: Accepted in IEEE Robotics and Automation Letters (2025)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[742] arXiv:2512.03514 (cross-list from cs.IR) [pdf, ps, other]: Title: M3DR: Towards Universal Multilingual Multimodal Document Retrieval

Authors: Adithya S Kolavi, Vyoman Jain

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[743] arXiv:2512.03422 (cross-list from cs.RO) [pdf, ps, other]: Title: What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models

Authors: Tianchen Deng, Yue Pan, Shenghai Yuan, Dong Li, Chen Wang, Mingrui Li, Long Chen, Lihua Xie, Danwei Wang, Jingchuan Wang, Javier Civera, Hesheng Wang, Weidong Chen

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[744] arXiv:2512.03216 (cross-list from physics.ins-det) [pdf, ps, other]: Title: Kaleidoscopic Scintillation Event Imaging

Authors: Alex Bocchieri, John Mamish, David Appleyard, Andreas Velten

Subjects: Instrumentation and Detectors (physics.ins-det); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[745] arXiv:2512.03173 (cross-list from cs.CY) [pdf, ps, other]: Title: Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping

Authors: Joan Nwatu, Longju Bai, Oana Ignat, Rada Mihalcea

Journal-ref: AAAI 2026 Social Impact Track

Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[746] arXiv:2512.03166 (cross-list from cs.RO) [pdf, ps, other]: Title: Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments

Authors: Aya Taourirte, Md Sohag Mia

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[747] arXiv:2512.03111 (cross-list from q-bio.GN) [pdf, ps, other]: Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer

Authors: Xiaoshui Huang, Tianlin Zhu, Yifan Zuo, Xue Xia, Zonghan Wu, Jiebin Yan, Dingli Hua, Zongyi Xu, Yuming Fang, Jian Zhang

Comments: Accepted by AAAI 2026

Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[748] arXiv:2512.03054 (cross-list from cs.LG) [pdf, ps, other]: Title: Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research

Authors: Ciro Benito Raggio, Lucia Migliorelli, Nils Skupien, Mathias Krohmer Zabaleta, Oliver Blanck, Francesco Cicone, Giuseppe Lucio Cascini, Paolo Zaffino, Maria Francesca Spadea

Comments: 22 pages, 13 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)
[749] arXiv:2512.03052 (cross-list from cs.GR) [pdf, ps, other]: Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale

Authors: Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Haolin Liu, Qingxiang Lin, Jingwei Huang, Chunchao Guo, Xiangyu Yue

Comments: Technical Report

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
