Computer Vision and Pattern Recognition

Authors and titles for recent submissions


Wed, 10 Dec 2025

[1] arXiv:2512.08931 [pdf, ps, other]: Title: Astra: General Interactive World Model with Autoregressive Denoising

Authors: Yixuan Zhu, Jiaqi Feng, Wenzhao Zheng, Yuan Gao, Xin Tao, Pengfei Wan, Jie Zhou, Jiwen Lu

Comments: Code is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[2] arXiv:2512.08930 [pdf, ps, other]: Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment

Authors: Youming Deng, Songyou Peng, Junyi Zhang, Kathryn Heal, Tiancheng Sun, John Flynn, Steve Marschner, Lucy Chai

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[3] arXiv:2512.08924 [pdf, ps, other]: Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

Authors: Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, Ignacio Rocco, Liliane Momeni, Junyu Xie, Shuyang Sun, Rahul Sukthankar, Joëlle K Barral, Raia Hadsell, Zoubin Ghahramani, Andrew Zisserman, Junlin Zhang, Mehdi SM Sajjadi

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2512.08922 [pdf, ps, other]: Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration

Authors: Jin Hyeon Kim, Paul Hyunbin Cho, Claire Kim, Jaewon Min, Jaeeun Lee, Jihye Park, Yeji Choi, Seungryong Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5] arXiv:2512.08912 [pdf, ps, other]: Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception

Authors: Simon de Moreau, Andrei Bursuc, Hafid El-Idrissi, Fabien Moutarde

Comments: Preprint. 12 pages, 9 figures. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[6] arXiv:2512.08905 [pdf, ps, other]: Title: Self-Evolving 3D Scene Generation from a Single Image

Authors: Kaizhi Zheng, Yue Fan, Jing Gu, Zishuo Xu, Xuehai He, Xin Eric Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2512.08897 [pdf, ps, other]: Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation

Authors: Zeyang Liu, Le Wang, Sanping Zhou, Yuxuan Wu, Xiaolong Sun, Gang Hua, Haoxiang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2512.08889 [pdf, ps, other]: Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers

Authors: Damiano Marsili, Georgia Gkioxari

Comments: Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[9] arXiv:2512.08888 [pdf, ps, other]: Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation

Authors: Manduhu Manduhu, Alexander Dow, Gerard Dooly, James Riordan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[10] arXiv:2512.08881 [pdf, ps, other]: Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing

Authors: Aysim Toker, Andreea-Maria Oncescu, Roy Miles, Ismail Elezi, Jiankang Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2512.08873 [pdf, ps, other]: Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning

Authors: Jing Jie Tan, Anissa Mokraoui, Ban-Hoe Kwan, Danny Wee-Kiat Ng, Yan-Chai Hum

Comments: 6 pages

Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[12] arXiv:2512.08860 [pdf, ps, other]: Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference

Authors: Amit Bendkhale

Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2512.08854 [pdf, ps, other]: Title: Generation is Required for Data-Efficient Perception

Authors: Jack Brady, Bernhard Schölkopf, Thomas Kipf, Simon Buchholz, Wieland Brendel

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[14] arXiv:2512.08829 [pdf, ps, other]: Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Authors: Hongyuan Tao, Bencheng Liao, Shaoyu Chen, Haoran Yin, Qian Zhang, Wenyu Liu, Xinggang Wang

Comments: 16 pages, 8 figures, conference or other essential info

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2512.08820 [pdf, ps, other]: Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning

Authors: Yi Zhang, Chun-Wun Cheng, Junyi He, Ke Yu, Yushun Tang, Carola-Bibiane Schönlieb, Zhihai He, Angelica I. Aviles-Rivero

Comments: Accepted in IEEE Transactions on Multimedia (TMM)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16] arXiv:2512.08789 [pdf, ps, other]: Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance

Authors: Chaewon Kim, Seoyeon Lee, Jonghyuk Park

Comments: 10 pages, 7 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[17] arXiv:2512.08785 [pdf, ps, other]: Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models

Authors: Yiming Hao, Mutian Xu, Chongjie Ye, Jie Qin, Shunlin Lu, Yipeng Qin, Xiaoguang Han

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2512.08774 [pdf, ps, other]: Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps

Authors: Seoyeon Lee, Gwangyeol Yu, Chaewon Kim, Jonghyuk Park

Comments: 10 pages, 9 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[19] arXiv:2512.08765 [pdf, ps, other]: Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Authors: Ruihang Chu, Yefei He, Zhekai Chen, Shiwei Zhang, Xiaogang Xu, Bin Xia, Dingdong Wang, Hongwei Yi, Xihui Liu, Hengshuang Zhao, Yu Liu, Yingya Zhang, Yujiu Yang

Comments: NeurIPS 2025. Code and data available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2512.08751 [pdf, ps, other]: Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices

Authors: Kuniko Paxton, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[21] arXiv:2512.08747 [pdf, ps, other]: Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation

Authors: Artúr I. Károly, Péter Galambos

Comments: 20 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2512.08738 [pdf, ps, other]: Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture

Authors: Samuel Ebimobowei Johnny, Blessed Guda, Emmanuel Enejo Aaron, Assane Gueye

Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[23] arXiv:2512.08733 [pdf, ps, other]: Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting

Authors: Kuniko Paxton, Zeinab Dehghani, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[24] arXiv:2512.08730 [pdf, ps, other]: Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images

Authors: Kaiyu Li, Shengqi Zhang, Yupeng Deng, Zhi Wang, Deyu Meng, Xiangyong Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2512.08700 [pdf, ps, other]: Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth

Authors: Kyumin Hwang, Wonhyeok Choi, Kiljoon Han, Wonjoon Choi, Minwoo Choi, Yongcheon Na, Minwoo Park, Sunghoon Im

Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2512.08697 [pdf, ps, other]: Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance

Authors: Athena Psalta, Vasileios Tsironis, Konstantinos Karantzalos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2512.08673 [pdf, ps, other]: Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds

Authors: Shaofeng Zhang, Xuanqi Chen, Xiangdong Zhang, Sitong Wu, Junchi Yan

Comments: 16 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2512.08648 [pdf, ps, other]: Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank

Authors: Shaofeng Zhang, Xuanqi Chen, Ning Liao, Haoxiang Zhao, Xiaoxing Wang, Haoru Tan, Sitong Wu, Xiaosong Jia, Qi Fan, Junchi Yan

Comments: 19 pages, 19 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2512.08647 [pdf, ps, other]: Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition

Authors: Keito Inoshita

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2512.08645 [pdf, ps, other]: Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation

Authors: Young Kyung Kim, Oded Schlesinger, Yuzhou Zhao, J. Matias Di Martino, Guillermo Sapiro

Comments: 19 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2512.08639 [pdf, ps, other]: Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

Authors: Huilin Xu, Zhuoyang Liu, Yixiang Luomei, Feng Xu

Comments: Under Review, 12 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[32] arXiv:2512.08627 [pdf, ps, other]: Title: Trajectory Densification and Depth from Perspective-based Blur

Authors: Tianchen Qiu, Qirun Zhang, Jiajian He, Zhengyue Zhuge, Jiahui Xu, Yueting Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2512.08625 [pdf, ps, other]: Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics

Authors: Jisang Yoo, Gyeongjin Kang, Hyun-kyu Ko, Hyeonwoo Yu, Eunbyung Park

Comments: 8 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2512.08606 [pdf, ps, other]: Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning

Authors: Zhenyu Zhang, Guangyao Chen, Yixiong Zou, Zhimeng Huang, Yuhua Li

Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI2026, poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[35] arXiv:2512.08589 [pdf, ps, other]: Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images

Authors: Swarn Singh Warshaneyan, Maksims Ivanovs, Blaž Cugmas, Inese Bērziņa, Laura Goldberga, Mindaugas Tamosiunas, Roberts Kadiķis

Comments: 8 pages, 10 figures, 4 tables, 20 references. Date of Conference: 13-14 June 2025. Date Added to IEEE Xplore: 10 July 2025. Electronic ISBN: 979-8-3315-0969-9. Print on Demand (PoD) ISBN: 979-8-3315-0970-5. DOI: 10.1109/AICCONF64766.2025.11064260. Conference Location: Prague, Czech Republic. Online Access: this https URL

Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2512.08577 [pdf, ps, other]: Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery

Authors: Yuna Kato, Shohei Mori, Hideo Saito, Yoshifumi Takatsume, Hiroki Kajita, Mariko Isogawa

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[37] arXiv:2512.08572 [pdf, ps, other]: Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis

Authors: Olle Edgren Schüllerqvist, Jens Baumann, Joakim Lindblad, Love Nordling, Artur Mezheyeuski, Patrick Micke, Nataša Sladoje

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2512.08569 [pdf, ps, other]: Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts

Authors: Seunghwan Lee, Inyoung Jung, Hojoon Lee, Eunil Park, Sungeun Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2512.08564 [pdf, ps, other]: Title: Modular Neural Image Signal Processing

Authors: Mahmoud Afifi, Zhongling Wang, Ran Zhang, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2512.08560 [pdf, ps, other]: Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain

Authors: Navve Wasserman, Matias Cosarinsky, Yuval Golbari, Aude Oliva, Antonio Torralba, Tamar Rott Shaham, Michal Irani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2512.08557 [pdf, ps, other]: Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds

Authors: Alexander Dow, Manduhu Manduhu, Matheus Santos, Ben Bartlett, Gerard Dooly, James Riordan

Comments: 22 Pages, 26 Figures, This work has been submitted to the IEEE Sensors Journal for possible publication

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2512.08547 [pdf, ps, other]: Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion

Authors: Yifei Chen, Kaiyu Song, Yan Pan, Jianxing Yu, Jian Yin, Hanjiang Lai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2512.08542 [pdf, ps, other]: Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation

Authors: Zhigang Jia, Duan Wang, Hengkai Wang, Yajun Xie, Meixiang Zhao, Xiaoyu Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
[44] arXiv:2512.08537 [pdf, ps, other]: Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation

Authors: Zhen Zou, Xiaoxiao Ma, Jie Huang, Zichao Yu, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2512.08535 [pdf, ps, other]: Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement

Authors: Xinyue Liang, Zhinyuan Ma, Lingchen Sun, Yanjun Guo, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2512.08534 [pdf, ps, other]: Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation

Authors: Zhangli Hu, Ye Chen, Jiajun Yao, Bingbing Ni

Comments: 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2512.08529 [pdf, ps, other]: Title: MVP: Multiple View Prediction Improves GUI Grounding

Authors: Yunzhu Zhang, Zeyu Pan, Zhengwen Zeng, Shuheng Shen, Changhua Meng, Linchao Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2512.08524 [pdf, ps, other]: Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization

Authors: Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas, Muhammad Rafsan Kabir, Robin Krambroeckers, Sifat Momen, Nabeel Mohammed, Shafin Rahman

Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[49] arXiv:2512.08511 [pdf, ps, other]: Title: Thinking with Images via Self-Calling Agent

Authors: Wenxi Yang, Yuzhong Zhao, Fang Wan, Qixiang Ye

Comments: Code would be released at this https URL soon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2512.08506 [pdf, ps, other]: Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds

Authors: Jialu Sui, Rui Liu, Hongsheng Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2512.08505 [pdf, ps, other]: Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models

Authors: Vasco Ramos, Regev Cohen, Idan Szpektor, Joao Magalhaes

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52] arXiv:2512.08503 [pdf, ps, other]: Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models

Authors: Jiaming Zhang, Che Wang, Yang Cao, Longtao Huang, Wei Yang Bryan Lim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[53] arXiv:2512.08498 [pdf, ps, other]: Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs

Authors: Yijia Guo, Tong Hu, Zhiwei Li, Liwen Hu, Keming Qian, Xitong Lin, Shengbo Chen, Tiejun Huang, Lei Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54] arXiv:2512.08486 [pdf, ps, other]: Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions

Authors: Ada Gorgun, Fawaz Sammani, Nikos Deligiannis, Bernt Schiele, Jonas Fischer

Comments: Code is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55] arXiv:2512.08478 [pdf, ps, other]: Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform

Authors: Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng, Xiao Sun, Zhihang Zhong

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[56] arXiv:2512.08477 [pdf, ps, other]: Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention

Authors: Huiguo He, Pengyu Yan, Ziqi Yi, Weizhi Zhong, Zheng Liu, Yejun Tang, Huan Yang, Kun Gai, Guanbin Li, Lianwen Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[57] arXiv:2512.08467 [pdf, ps, other]: Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery

Authors: Chamath Ranasinghe, Uthayasanker Thayasivam

Comments: 8 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58] arXiv:2512.08445 [pdf, ps, other]: Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts

Authors: Madhav Gupta, Vishak Prasad C, Ganesh Ramakrishnan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[59] arXiv:2512.08441 [pdf, ps, other]: Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras

Authors: Luca Cogo, Marco Buzzelli, Simone Bianco, Javier Vazquez-Corral, Raimondo Schettini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2512.08439 [pdf, ps, other]: Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training

Authors: Qing Xu, Kun Yuan, Yuxiang Luo, Yuhao Zhai, Wenting Duan, Nassir Navab, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61] arXiv:2512.08430 [pdf, ps, other]: Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking

Authors: Nico Leuze, Maximilian Hoh, Samed Doğan, Nicolas R.-Peña, Alfred Schoettl

Comments: Accepted to WACV 2026. Preprint version

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[62] arXiv:2512.08410 [pdf, ps, other]: Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval

Authors: Tao Chen, Shaobo Ju, Qiong Wu, Chenxin Fang, Kun Zhang, Jun Peng, Hui Li, Yiyi Zhou, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2512.08406 [pdf, ps, other]: Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos

Authors: Mingqi Gao, Yunqi Miao, Jungong Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2512.08400 [pdf, ps, other]: Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries

Authors: Samitha Nuwan Thilakarathna, Ercan Avsar, Martin Mathias Nielsen, Malte Pedersen

Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65] arXiv:2512.08397 [pdf, ps, other]: Title: Detection of Digital Facial Retouching utilizing Face Beauty Information

Authors: Philipp Srock, Juan E. Tapia, Christoph Busch

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2512.08378 [pdf, ps, other]: Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions

Authors: Jing Tao, You Li, Banglei Guan, Yang Shang, Qifeng Yu

Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67] arXiv:2512.08374 [pdf, ps, other]: Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss

Authors: Bozhou Li, Xinda Xue, Sihan Yang, Yang Shi, Xinlong Chen, Yushuo Guan, Yuanxing Zhang, Wentao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68] arXiv:2512.08362 [pdf, ps, other]: Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation

Authors: Ju-Young Kim, Ji-Hong Park, Gun-Woo Kim

Comments: Accepted for main track at MobiSec 2024 (not published in the proceedings)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69] arXiv:2512.08358 [pdf, ps, other]: Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels

Authors: Jiahao Lu, Weitao Xiong, Jiacheng Deng, Peng Li, Tianyu Huang, Zhiyang Dou, Cheng Lin, Sai-Kit Yeung, Yuan Liu

Comments: Accepted by NeurIPS 2025. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2512.08337 [pdf, ps, other]: Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation

Authors: Jianwei Wang, Qing Wang, Menglan Ruan, Rongjun Ge, Chunfeng Yang, Yang Chen, Chunming Xie

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2512.08334 [pdf, ps, other]: Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting

Authors: Chang Liu, Hongliang Yuan, Lianghao Zhang, Sichao Wang, Jianwei Guo, Shi-Sheng Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2512.08331 [pdf, ps, other]: Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening

Authors: Xianghong Xiao, Zeyu Xia, Zhou Fei, Jinliang Xiao, Haorui Chen, Liangjian Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2512.08330 [pdf, ps, other]: Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models

Authors: Pengbo Li, Yiding Sun, Haozhe Cheng

Comments: Accepted by IJCNN 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2512.08329 [pdf, ps, other]: Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models

Authors: Michael R. Martin, Garrick Chan, Kwan-Liu Ma

Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[75] arXiv:2512.08327 [pdf, ps, other]: Title: Low Rank Support Quaternion Matrix Machine

Authors: Wang Chen, Ziyan Luo, Shuangyue Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
[76] arXiv:2512.08325 [pdf, ps, other]: Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification

Authors: Xuedeng Liu, Jiabao Guo, Zheng Zhang, Fei Wang, Zhi Liu, Dan Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2512.08323 [pdf, ps, other]: Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge

Authors: Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik, Oussama Smaoui, Firas Bouzguenda, Sergi Pujades, Niels van Nistelrooij, Shankeeth Vinayahalingam, Kaibo Shi, Hairong Jin, Youyi Zheng, Tibor Kubík, Oldřich Kodym, Petr Šilling, Kateřina Trávníčková, Tomáš Mojžiš, Jan Matula, Jeffry Hartanto, Xiaoying Zhu, Kim-Ngan Nguyen, Tudor Dascalu, Huikai Wu, Weijie Liu, Shaojie Zhuang, Guangshun Wei, Yuanfeng Zhou

Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2512.08317 [pdf, ps, other]: Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation

Authors: Xuhui Li, Zhengquan Luo, Zihui Cui, Zhiqiang Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[79] arXiv:2512.08309 [pdf, ps, other]: Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation

Authors: Alexander Goslin

Comments: Project website: this https URL. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
[80] arXiv:2512.08294 [pdf, ps, other]: Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation

Authors: Yexin Liu, Manyuan Zhang, Yueze Wang, Hongyu Li, Dian Zheng, Weiming Zhang, Changsheng Lu, Xunliang Cai, Yan Feng, Peng Pei, Harry Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81] arXiv:2512.08282 [pdf, ps, other]: Title: PAVAS: Physics-Aware Video-to-Audio Synthesis

Authors: Oh Hyun-Bin, Yuhta Takida, Toshimitsu Uesaka, Tae-Hyun Oh, Yuki Mitsufuji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
[82] arXiv:2512.08269 [pdf, ps, other]: Title: EgoX: Egocentric Video Generation from a Single Exocentric Video

Authors: Taewoong Kang, Kinam Kim, Dohyeon Kim, Minho Park, Junha Hyung, Jaegul Choo

Comments: 21 pages, project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2512.08262 [pdf, ps, other]: Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera

Authors: Hafeez Husain Cholakkal, Stefano Arrigoni, Francesco Braghin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[84] arXiv:2512.08254 [pdf, ps, other]: Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors

Authors: Yun Liu, Tao Li, Cosmin Ancuti, Wenqi Ren, Weisi Lin

Comments: 10 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85] arXiv:2512.08253 [pdf, ps, other]: Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation

Authors: YiLin Zhou, Lili Wei, Zheming Xu, Ziyi Chen, Congyan Lang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86] arXiv:2512.08247 [pdf, ps, other]: Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection

Authors: Haowen Zheng, Hu Zhu, Lu Deng, Weihao Gu, Yang Yang, Yanyan Liang

Comments: AAAI-26

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[87] arXiv:2512.08243 [pdf, ps, other]: Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI

Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)

Comments: 26 Pages, 10 Figures, 4 Tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[88] arXiv:2512.08240 [pdf, ps, other]: Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models

Authors: Jusheng Zhang, Xiaoyang Guo, Kaitong Cai, Qinhan Lv, Yijia Fan, Wenhao Chai, Jian Wang, Keze Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[89] arXiv:2512.08237 [pdf, ps, other]: Title: FastBEV++: Fast by Algorithm, Deployable by Design

Authors: Yuanpeng Chen, Hui Song, Wei Tao, ShanHui Mo, Shuang Zhang, Xiao Hua, TianKun Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90] arXiv:2512.08229 [pdf, ps, other]: Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems

Authors: Tony Salloom, Dandi Zhou, Xinhai Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[91] arXiv:2512.08228 [pdf, ps, other]: Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models

Authors: Jusheng Zhang, Kaitong Cai, Xiaoyang Guo, Sidi Liu, Qinhan Lv, Ruiqi Chen, Jing Yang, Yijia Fan, Xiaofei Sun, Jian Wang, Ziliang Chen, Liang Lin, Keze Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[92] arXiv:2512.08227 [pdf, ps, other]: Title: New VVC profiles targeting Feature Coding for Machines

Authors: Md Eimran Hossain Eimon, Ashan Perera, Juan Merlos, Velibor Adzic, Hari Kalva

Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2512.08223 [pdf, ps, other]: Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection

Authors: Ching-Hung Cheng, Hsiu-Fu Wu, Bing-Chen Wu, Khanh-Phong Bui, Van-Tin Luu, Ching-Chun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2512.08221 [pdf, ps, other]: Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding

Authors: Ziwei Yao, Qiyang Wan, Ruiping Wang, Xilin Chen

Comments: 16 pages, 12 figures, 7 tables. Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2512.08215 [pdf, ps, other]: Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement

Authors: Chia-Hern Lai, I-Hsuan Lo, Yen-Ku Yeh, Thanh-Nguyen Truong, Ching-Chun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2512.08198 [pdf, ps, other]: Title: Animal Re-Identification on Microcontrollers

Authors: Yubo Chen, Di Zhao, Yun Sing Koh, Talia Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2512.08180 [pdf, ps, other]: Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input

Authors: Xiaojing Wei, Ting Zhang, Wei He, Jingdong Wang, Hua Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2512.08163 [pdf, ps, other]: Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators

Authors: Yuki Kubota, Taiki Fukiage

Comments: 22 pages, 12 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2512.08161 [pdf, ps, other]: Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing

Authors: Lirong Zheng, Yanshan Li, Rui Yu, Kaihao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2512.08135 [pdf, ps, other]: Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning

Authors: Zeyuan Chen, Xiang Zhang, Haiyang Xu, Jianwen Xie, Zhuowen Tu

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101] arXiv:2512.08075 [pdf, ps, other]: Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models

Authors: Christian Massao Konishi, Helio Pedrini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2512.08048 [pdf, ps, other]: Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning

Authors: Chandler Timm C. Doloriel

Comments: ongoing work

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2512.08042 [pdf, ps, other]: Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking

Authors: Chandler Timm C. Doloriel, Habib Ullah, Kristian Hovde Liland, Fadi Al Machot, Ngai-Man Cheung

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2512.08040 [pdf, ps, other]: Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment

Authors: Youngjoon Jang, Liliane Momeni, Zifan Jiang, Joon Son Chung, Gül Varol, Andrew Zisserman

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2512.08038 [pdf, ps, other]: Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification

Authors: Elifnur Sunger, Tales Imbiriba, Peter Campbell, Deniz Erdogmus, Stratis Ioannidis, Jennifer Dy

Comments: 20 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2512.08016 [pdf, ps, other]: Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models

Authors: Jiyoon Pyo, Yuankun Jiao, Dongwon Jung, Zekun Li, Leeje Jang, Sofia Kirsanova, Jina Kim, Yijun Lin, Qin Liu, Junyi Xie, Hadi Askari, Nan Xu, Muhao Chen, Yao-Yi Chiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107] arXiv:2512.07984 [pdf, ps, other]: Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection

Authors: Ryan Banks, Camila Lindoni Azevedo, Hongying Tang, Yunpeng Li

Comments: 13 pages, 7 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108] arXiv:2512.07951 [pdf, ps, other]: Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Authors: Zekai Luo, Zongze Du, Zhouhang Zhu, Hao Zhong, Muzhi Zhu, Wen Wang, Yuling Xi, Chenchen Jing, Hao Chen, Chunhua Shen

Comments: Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2512.07925 [pdf, ps, other]: Title: Near-real time fires detection using satellite imagery in Sudan conflict

Authors: Kuldip Singh Atwal, Dieter Pfoser, Daniel Rothbart

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[110] arXiv:2512.07838 [pdf, ps, other]: Title: Detection of Cyberbullying in GIF using AI

Authors: Pal Dave, Xiaohong Yuan, Madhuri Siddula, Kaushik Roy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[111] arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]: Title: Multi-domain performance analysis with scores tailored to user preferences

Authors: Sébastien Piérard, Adrien Deliège, Marc Van Droogenbroeck

Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[112] arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]: Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm

Authors: Haoyu Zhao, Weizhong Ding, Yuhao Yang, Zheng Tian, Linyi Yang, Kun Shao, Jun Wang

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[113] arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]: Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks

Authors: Indrajit Kar, Kalathur Chenchu Kishore Kumar

Comments: 22 pages, 2 tables, 9 figures

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[114] arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]: Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions

Authors: Jianan Li, Xiao Chen, Tao Huang, Tien-Tsin Wong

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]: Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata

Authors: Ali Sakour

Comments: 13 pages, 5 figures. Code available at: this https URL

Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[116] arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]: Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion

Authors: Guangyuan Zou, Junlun Li, Feng Liu, Xuejing Zheng, Jianjian Xie, Guoyi Chen

Comments: Submitted to GEOPHYSICS

Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
[117] arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]: Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation

Authors: Srijan Dokania, Dharini Raghavan

Comments: Published and Presented at 3rd Workshop on Human-Centric Multilateral Teleoperation in ICRA 2025

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[118] arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]: Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentation

Authors: Aneesh Rangnekar, Harini Veeraraghavan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[119] arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]: Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model

Authors: Wenjiang Xu, Cindy Wang, Rui Fang, Mingkang Zhang, Lusong Li, Jing Xu, Jiayuan Gu, Zecui Zeng, Rui Chen

Comments: Website at this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]: Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features

Authors: Haoxin Zhang, Shuaixin Li, Xiaozhou Zhu, Hongbo Chen, Wen Yao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]: Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion Models

Authors: Zheng Ding, Weirui Ye

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]: Title: FlowSteer: Conditioning Flow Field for Consistent Image Restoration

Authors: Tharindu Wickremasinghe, Chenyang Qi, Harshana Weligampola, Zhengzhong Tu, Stanley H. Chan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[123] arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]: Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data Recognition

Authors: Matthias Beckmann, Robert Beinert, Jonas Bresch

Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[124] arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]: Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space

Authors: Tianxingjian Ding, Yuanhao Zou, Chen Chen, Mubarak Shah, Yu Tian

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]: Title: DIJIT: A Robotic Head for an Active Observer

Authors: Mostafa Kamali Tabrizi, Mingshi Chi, Bir Bikram Dey, Yu Qing Yuan, Markus D. Solbach, Yiqian Liu, Michael Jenkin, John K. Tsotsos

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]: Title: CIP-Net: Continual Interpretable Prototype-based Network

Authors: Federico Di Valerio, Michela Proietti, Alessio Ragno, Roberto Capobianco

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]: Title: VLD: Visual Language Goal Distance for Reinforcement Learning Navigation

Authors: Lazar Milikic, Manthan Patel, Jonas Frey

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]: Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear Optimization

Authors: Alan Papalia, Nikolas Sanderson, Haoyu Han, Heng Yang, Hanumant Singh, Michael Everett

Comments: 8 pages, submitted for review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]: Title: GSPN-2: Efficient Parallel Sequence Modeling

Authors: Hongjun Wang, Yitong Jiang, Collin McCarthy, David Wehr, Hanrong Ye, Xinhao Li, Ka Chun Cheung, Wonmin Byeon, Jinwei Gu, Ke Chen, Kai Han, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Sifei Liu

Comments: NeurIPS 2025

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]: Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model

Authors: Huizheng Wang, Hongbin Wang, Shaojun Wei, Yang Hu, Shouyi Yin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]: Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin Algorithm

Authors: Moritz Blumenthal, Tina Holliber, Jonathan I. Tamir, Martin Uecker

Comments: Submitted to Magnetic Resonance in Medicine

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)

Tue, 9 Dec 2025 (showing first 119 of 259 entries)

[132] arXiv:2512.07834 [pdf, ps, other]: Title: Voxify3D: Pixel Art Meets Volumetric Rendering

Authors: Yi-Chuan Huang, Jiewen Chan, Hao-Jen Chien, Yu-Lun Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2512.07833 [pdf, ps, other]: Title: Relational Visual Similarity

Authors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng Li

Comments: Project page, data, and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[134] arXiv:2512.07831 [pdf, ps, other]: Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Authors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya Jia

Comments: Project Website this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2512.07829 [pdf, ps, other]: Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation

Authors: Yuan Gao, Chen Chen, Tianrong Chen, Jiatao Gu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136] arXiv:2512.07826 [pdf, ps, other]: Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing

Authors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei Xie

Comments: 38 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2512.07821 [pdf, ps, other]: Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling

Authors: Shaoheng Fang, Hanwen Jiang, Yunpeng Bai, Niloy J. Mitra, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138] arXiv:2512.07807 [pdf, ps, other]: Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes

Authors: Shai Krakovsky, Gal Fiebelman, Sagie Benaim, Hadar Averbuch-Elor

Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[139] arXiv:2512.07806 [pdf, ps, other]: Title: Multi-view Pyramid Transformer: Look Coarser to See Broader

Authors: Gyeongjin Kang, Seungkwon Yang, Seungtae Nam, Younggeun Lee, Jungwoo Kim, Eunbyung Park

Comments: Project page: see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2512.07802 [pdf, ps, other]: Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Authors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian Xie

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2512.07778 [pdf, ps, other]: Title: Distribution Matching Variational AutoEncoder

Authors: Sen Ye, Jianning Pei, Mengde Xu, Shuyang Gu, Chunyu Wang, Liwei Wang, Han Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2512.07776 [pdf, ps, other]: Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring

Authors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de Melo

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143] arXiv:2512.07760 [pdf, ps, other]: Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Menglin Wang, Xiaojin Gong, Jiachen Li, Genlin Ji

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2512.07756 [pdf, ps, other]: Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction

Authors: Mayank Anand, Ujair Alam, Surya Prakash, Priya Shukla, Gora Chand Nandi, Domenec Puig

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145] arXiv:2512.07747 [pdf, ps, other]: Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation

Authors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2512.07745 [pdf, ps, other]: Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Authors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2512.07738 [pdf, ps, other]: Title: HLTCOE Evaluation Team at TREC 2025: VQA Track

Authors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van Durme

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2512.07733 [pdf, ps, other]: Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery

Authors: Meng Cao, Xingyu Li, Xue Liu, Ian Reid, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2512.07730 [pdf, ps, other]: Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Authors: Sangha Park, Seungryong Yoo, Jisoo Mok, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150] arXiv:2512.07729 [pdf, ps, other]: Title: Improving action classification with brain-inspired deep networks

Authors: Aidas Aglinskas, Stefano Anzellotti

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151] arXiv:2512.07720 [pdf, ps, other]: Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation

Authors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng Lin

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2512.07712 [pdf, ps, other]: Title: UnCageNet: Tracking and Pose Estimation of Caged Animal

Authors: Sayak Dutta, Harish Katti, Shashikant Verma, Shanmuganathan Raman

Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2512.07703 [pdf, ps, other]: Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation

Authors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios Christodoulidis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[154] arXiv:2512.07702 [pdf, ps, other]: Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment

Authors: Sangha Park, Eunji Kim, Yeongtak Oh, Jooyoung Choi, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155] arXiv:2512.07698 [pdf, ps, other]: Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only

Authors: Arslan Artykov, Corentin Sautier, Vincent Lepetit

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156] arXiv:2512.07674 [pdf, ps, other]: Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations

Authors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge Cardoso

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157] arXiv:2512.07668 [pdf, ps, other]: Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset

Authors: Ronan John, Aditya Kesari, Vincenzo DiMatteo, Kristin Dana

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2512.07661 [pdf, ps, other]: Title: Optimization-Guided Diffusion for Interactive Scene Generation

Authors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2512.07652 [pdf, ps, other]: Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research

Authors: Hamad Almazrouei, Mariam Al Nasseri, Maha Alzaabi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160] arXiv:2512.07651 [pdf, ps, other]: Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method

Authors: Yuanye Liu, Hanxiao Zhang, Nannan Shi, Yuxin Shi, Arif Mahmood, Murtaza Taj, Xiahai Zhuang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2512.07628 [pdf, ps, other]: Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation

Authors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2512.07606 [pdf, ps, other]: Title: Decomposition Sampling for Efficient Region Annotations in Active Learning

Authors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina Breininger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2512.07599 [pdf, ps, other]: Title: Online Segment Any 3D Thing as Instance Tracking

Authors: Hanshi Wang, Zijian Cai, Jin Gao, Yiwei Zhang, Weiming Hu, Ke Wang, Zhipeng Zhang

Comments: NeurIPS 2025, Code is at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2512.07596 [pdf, ps, other]: Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery

Authors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long Bai

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[165] arXiv:2512.07590 [pdf, ps, other]: Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation

Authors: Kaili Qi, Zhongyi Huang, Wenli Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2512.07584 [pdf, ps, other]: Title: LongCat-Image Technical Report

Authors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2512.07580 [pdf, ps, other]: Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs

Authors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2512.07568 [pdf, ps, other]: Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation

Authors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[169] arXiv:2512.07564 [pdf, ps, other]: Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models

Authors: Kassoum Sanogo, Renzo Ardiccioni

Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[170] arXiv:2512.07527 [pdf, ps, other]: Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

Authors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[171] arXiv:2512.07514 [pdf, ps, other]: Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes

Authors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2512.07504 [pdf, ps, other]: Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points

Authors: Ryota Okumura, Kaede Shiohara, Toshihiko Yamasaki

Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173] arXiv:2512.07503 [pdf, ps, other]: Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation

Authors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2512.07500 [pdf, ps, other]: Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer

Authors: Penghui Liu, Jiangshan Wang, Yutong Shen, Shanhui Mo, Chenyang Qi, Yue Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2512.07498 [pdf, ps, other]: Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior

Authors: Chih-Chung Hsu, Shao-Ning Chen, Chia-Ming Lee, Yi-Fang Wang, Yi-Shiuan Chou

Comments: 16 pages (including appendix)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2512.07480 [pdf, ps, other]: Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance

Authors: Naifu Xue, Zhaoyang Jia, Jiahao Li, Bin Li, Zihan Zheng, Yuan Zhang, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2512.07469 [pdf, ps, other]: Title: Unified Video Editing with Temporal Reasoner

Authors: Xiangpeng Yang, Ji Xie, Yiyuan Yang, Yan Huang, Min Xu, Qiang Wu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2512.07426 [pdf, ps, other]: Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing

Authors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa Yousif

Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[179] arXiv:2512.07415 [pdf, ps, other]: Title: Data-driven Exploration of Mobility Interaction Patterns

Authors: Gabriele Galatolo, Mirco Nanni

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180] arXiv:2512.07410 [pdf, ps, other]: Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs

Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya Wang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2512.07394 [pdf, ps, other]: Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video

Authors: Zhifan Zhu, Siddhant Bansal, Shashank Tripathi, Dima Damen

Comments: webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2512.07391 [pdf, ps, other]: Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring

Authors: Đorđe Nedeljković

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2512.07385 [pdf, ps, other]: Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline

Authors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2512.07383 [pdf, ps, other]: Title: LogicCBMs: Logic-Enhanced Concept-Based Learning

Authors: Deepika SN Vemuri, Gautham Bellamkonda, Aditya Pola, Vineeth N Balasubramanian

Comments: 18 pages, 19 figures, WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2512.07381 [pdf, ps, other]: Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects

Authors: Shuohan Tao, Boyao Zhou, Hanzhang Tu, Yuwang Wang, Yebin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186] arXiv:2512.07379 [pdf, ps, other]: Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency

Authors: Mahila Moghadami, Mohammad Ali Keyvanrad, Melika Sabaghian

Comments: 22 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2512.07360 [pdf, ps, other]: Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation

Authors: Qiming Huang, Hao Ai, Jianbo Jiao

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[188] arXiv:2512.07351 [pdf, ps, other]: Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection

Authors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami Azam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[189] arXiv:2512.07348 [pdf, ps, other]: Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition

Authors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei Zhang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2512.07345 [pdf, ps, other]: Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting

Authors: Shilong Jin, Haoran Duan, Litao Hua, Wentao Huang, Yuan Zhou

Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2512.07338 [pdf, ps, other]: Title: Generalized Referring Expression Segmentation on Aerial Photos

Authors: Luís Marnoto, Alexandre Bernardino, Bruno Martins

Comments: Submitted to IEEE J-STARS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2512.07331 [pdf, ps, other]: Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers

Authors: Kanishk Awadhiya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2512.07328 [pdf, ps, other]: Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation

Authors: Ziyang Mai, Yu-Wing Tai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[194] arXiv:2512.07305 [pdf, ps, other]: Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset

Authors: Tobias Abraham Haider

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2512.07302 [pdf, ps, other]: Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts

Authors: Mingning Guo, Mengwei Wu, Shaoxian Li, Haifeng Li, Chao Tao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[196] arXiv:2512.07276 [pdf, ps, other]: Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery

Authors: Mai Tsujimoto, Junjue Wang, Weihao Xuan, Naoto Yokoya

Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2512.07275 [pdf, ps, other]: Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation

Authors: Siyu Wang, Hua Wang, Huiyu Li, Fan Zhang

Comments: The paper has been accepted by BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198] arXiv:2512.07273 [pdf, ps, other]: Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation

Authors: Zhi Rao, Yucheng Zhou, Benjia Zhou, Yiqing Huang, Sergio Escalera, Jun Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2512.07269 [pdf, ps, other]: Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data

Authors: Mike Diessner, Yannick Tarant

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[200] arXiv:2512.07253 [pdf, ps, other]: Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement

Authors: Handing Xu, Zhenguo Nie, Tairan Peng, Huimin Pan, Xin-Jun Liu

Comments: 18 pages, 8 figures, and 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201] arXiv:2512.07251 [pdf, ps, other]: Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement

Authors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2512.07247 [pdf, ps, other]: Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Authors: Ziming Hong, Tianyu Huang, Runnan Chen, Shanshan Ye, Mingming Gong, Bo Han, Tongliang Liu

Comments: 40 pages, 34 figures, 18 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[203] arXiv:2512.07245 [pdf, ps, other]: Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features

Authors: Toshinori Yamauchi, Hiroshi Kera, Kazuhiko Kawamoto

Comments: 11+6 pages, 8 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2512.07241 [pdf, ps, other]: Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture

Authors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul Islam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2512.07237 [pdf, ps, other]: Title: Unified Camera Positional Encoding for Controlled Video Generation

Authors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei Cai

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2512.07234 [pdf, ps, other]: Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models

Authors: Biao Chen, Lin Zuo, Mengmeng Jing, Kunbin He, Yuchen Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[207] arXiv:2512.07230 [pdf, ps, other]: Title: STRinGS: Selective Text Refinement in Gaussian Splatting

Authors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand Tapaswi

Comments: Accepted to WACV 2026. Project Page, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2512.07229 [pdf, ps, other]: Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery

Authors: Fang Zhou, Zhiqiang Chen, Martin Pavlovski, Yizhong Zhang

Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209] arXiv:2512.07228 [pdf, ps, other]: Title: Towards Robust Protective Perturbation against DeepFake Face Swapping

Authors: Hengyang Yao, Lin Li, Ke Sun, Jianing Qiu, Huiping Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[210] arXiv:2512.07215 [pdf, ps, other]: Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation

Authors: Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[211] arXiv:2512.07211 [pdf, ps, other]: Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds

Authors: Frederik Hagelskjær, Dimitrios Arapis, Steffen Madsen, Thorbjørn Mosekjær Iversen

Comments: 8 pages, 8 figures, 5 tables, ICCR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2512.07206 [pdf, ps, other]: Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT

Authors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213] arXiv:2512.07203 [pdf, ps, other]: Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning

Authors: Xuhui Zheng, Kang An, Ziliang Wang, Yuhang Wang, Faqiang Qian, Yichao Wu

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2512.07201 [pdf, ps, other]: Title: Understanding Diffusion Models via Code Execution

Authors: Cheng Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[215] arXiv:2512.07198 [pdf, ps, other]: Title: Generating Storytelling Images with Rich Chains-of-Reasoning

Authors: Xiujie Song, Qi Jia, Shota Watanabe, Xiaoyi Pang, Ruijie Chen, Mengyue Wu, Kenny Q. Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[216] arXiv:2512.07197 [pdf, ps, other]: Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting

Authors: Seokhyun Youn, Soohyun Lee, Geonho Kim, Weeyoung Kwon, Sung-Ho Bae, Jihyong Oh

Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2512.07192 [pdf, ps, other]: Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression

Authors: Niu Yi, Xu Tianyi, Ma Mingming, Wang Xinkun

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2512.07191 [pdf, ps, other]: Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction

Authors: Wenqi Zhao, Jiacheng Sang, Fenghua Cheng, Yonglu Shu, Dong Li, Xiaofeng Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2512.07190 [pdf, ps, other]: Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification

Authors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2512.07186 [pdf, ps, other]: Title: START: Spatial and Textual Learning for Chart Understanding

Authors: Zhuoming Liu, Xiaofeng Gao, Feiyang Niu, Qiaozi Gao, Liu Liu, Robinson Piramuthu

Comments: WACV 2026 Camera Ready

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[221] arXiv:2512.07171 [pdf, ps, other]: Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration

Authors: Shravan Venkatraman, Rakesh Raj Madavan, Pavan Kumar S, Muthu Subash Kavitha

Comments: 21 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2512.07170 [pdf, ps, other]: Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Authors: Jiayang Li, Chengjie Jiang, Junjun Jiang, Pengwei Liang, Jiayi Ma, Liqiang Nie

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223] arXiv:2512.07166 [pdf, ps, other]: Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing

Authors: Siyuan Xu, Yibing Liu, Peilin Chen, Yung-Hui Li, Shiqi Wang, Sam Kwong

Comments: 9 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224] arXiv:2512.07165 [pdf, ps, other]: Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation

Authors: Muyu Xu, Fangneng Zhan, Xiaoqin Zhang, Ling Shao, Shijian Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2512.07155 [pdf, ps, other]: Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics

Authors: Dahyeon Kye, Jeahun Sung, Mingyu Jeon, Jihyong Oh

Comments: Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2512.07141 [pdf, ps, other]: Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models

Authors: Fenghua Weng, Chaochao Lu, Xia Hu, Wenqi Shao, Wenjie Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[227] arXiv:2512.07136 [pdf, ps, other]: Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning

Authors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang Xing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228] arXiv:2512.07135 [pdf, ps, other]: Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning

Authors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229] arXiv:2512.07128 [pdf, ps, other]: Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP

Authors: Chau Truong, Hieu Ta Quang, Dung D. Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230] arXiv:2512.07126 [pdf, ps, other]: Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On

Authors: Shengjie Lu, Zhibin Wan, Jiejie Liu, Quan Zhang, Mingjie Sun

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2512.07110 [pdf, ps, other]: Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection

Authors: Liangwei Jiang, Jinluo Xie, Yecheng Huang, Hua Zhang, Hongyu Yang, Di Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2512.07107 [pdf, ps, other]: Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision

Authors: Jaeyoon Lee, Hojoon Jung, Sungtae Hwang, Jihyong Oh, Jongwon Choi

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2512.07078 [pdf, ps, other]: Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection

Authors: Bo Gao, Jingcheng Tong, Xingsheng Chen, Han Yu, Zichen Li

Comments: 16 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234] arXiv:2512.07076 [pdf, ps, other]: Title: Context-measure: Contextualizing Metric for Camouflage

Authors: Chen-Yang Wang, Gepeng Ji, Song Shao, Ming-Ming Cheng, Deng-Ping Fan

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235] arXiv:2512.07065 [pdf, ps, other]: Title: Persistent Homology-Guided Frequency Filtering for Image Compression

Authors: Anil Chintapalli, Peter Tenholder, Henry Chen, Arjun Rao

Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2512.07062 [pdf, ps, other]: Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction

Authors: Changliang Xia, Chengyou Jia, Minnan Luo, Zhuohang Dang, Xin Shen, Bowen Ping

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[237] arXiv:2512.07052 [pdf, ps, other]: Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting

Authors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo Tartaglione

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2512.07051 [pdf, ps, other]: Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation

Authors: Adnan Munir, Shujaat Khan

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[239] arXiv:2512.07037 [pdf, ps, other]: Title: Evaluating and Preserving High-level Fidelity in Super-Resolution

Authors: Josep M. Rocafort, Shaolin Su, Alexandra Gomez-Villa, Javier Vazquez-Corral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240] arXiv:2512.07034 [pdf, ps, other]: Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues

Authors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit Yeung

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241] arXiv:2512.06981 [pdf, ps, other]: Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation

Authors: Yuemin Wang, Ian Stavness

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[242] arXiv:2512.06949 [pdf, ps, other]: Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology

Authors: Shravan Venkatraman, Muthu Subash Kavitha, Joe Dhanith P R, V Manikandarajan, Jia Wu

Comments: 19 pages, 5 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2512.06921 [pdf, ps, other]: Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification

Authors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen Lei

Comments: Accepted by IEEE ICIA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[244] arXiv:2512.06905 [pdf, ps, other]: Title: Scaling Zero-Shot Reference-to-Video Generation

Authors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen He

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2512.06888 [pdf, ps, other]: Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation

Authors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246] arXiv:2512.06886 [pdf, ps, other]: Title: Balanced Learning for Domain Adaptive Semantic Segmentation

Authors: Wangkai Li, Rui Sun, Bohao Liao, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by International Conference on Machine Learning (ICML 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247] arXiv:2512.06885 [pdf, ps, other]: Title: JoPano: Unified Panorama Generation via Joint Modeling

Authors: Wancheng Feng, Chen An, Zhenliang He, Meina Kan, Shiguang Shan, Lukun Wang

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[248] arXiv:2512.06882 [pdf, ps, other]: Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion

Authors: Yu Zhu, Naoya Chiba, Koichi Hashimoto

Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2512.06877 [pdf, ps, other]: Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification

Authors: Mohammed Q. Alkhatib, Ali Jamali, Swalpa Kumar Roy

Comments: Accepted and presented in ICSPIS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250] arXiv:2512.06870 [pdf, ps, other]: Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective

Authors: Wangkai Li, Rui Sun, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
