Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 381

[ total of 749 entries ]

Tue, 9 Dec 2025 (continued, showing last 9 of 259 entries)

[382] arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

Authors: Panagiota Kiourti, Anu Singh, Preeti Duraipandian, Weichao Zhou, Wenchao Li

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[383] arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]: Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning

Authors: Camellia Zakaria, Aryan Sadeghi, Weaam Jaafar, Junshi Xu, Alex Mariakakis, Marianne Hatzopoulou

Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[384] arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]: Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network

Authors: Xiao Li

Comments: in Chinese language

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[385] arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]: Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Authors: Ruicheng Zhang, Mingyang Zhang, Jun Zhou, Zhangrui Guo, Xiaofan Liu, Zunnan Xu, Zhizhou Zhong, Puxin Yan, Haocheng Luo, Xiu Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[386] arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]: Title: Vector Quantization using Gaussian Variational Autoencoder

Authors: Tongda Xu, Wendi Zheng, Jiajun He, Jose Miguel Hernandez-Lobato, Yan Wang, Ya-Qin Zhang, Jie Tang

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[387] arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]: Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Authors: Xiaojun Jia, Jie Liao, Qi Guo, Teng Ma, Simeng Qin, Ranjie Duan, Tianlin Li, Yihao Huang, Zhitao Zeng, Dongxian Wu, Yiming Li, Wenqi Ren, Xiaochun Cao, Yang Liu

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[388] arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]: Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers

Authors: Hochul Hwang, Soowan Yang, Jahir Sadik Monon, Nicholas A Giudice, Sunghoon Ivan Lee, Joydeep Biswas, Donghyun Kim

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[389] arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]: Title: Semantic Temporal Single-photon LiDAR

Authors: Fang Li, Tonglin Mu, Shuling Li, Junran Guo, Keyuan Li, Jianing Li, Ziyang Luo, Xiaodong Fan, Ye Chen, Yunfeng Liu, Hong Cai, Lip Ket Chin, Jinbei Zhang, Shihai Sun

Comments: 14 pages, 5 figures. And any comment is welcome

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[390] arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]: Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation

Authors: Azeez Idris, Abdurahman Ali Mohammed, Samuel Fanijo

Comments: NeurIPS Black in AI workshop - 2022

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025

[391] arXiv:2512.05965 [pdf, ps, other]: Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor

Authors: Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, Xunliang Cai, Linjiang Huang, Hongsheng Li, Si Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392] arXiv:2512.05960 [pdf, ps, other]: Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement

Authors: Munsif Ali, Najmul Hassan, Lucia Ventura, Davide Di Bari, Simonepietro Canese

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393] arXiv:2512.05941 [pdf, ps, other]: Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding

Authors: Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong Liu

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394] arXiv:2512.05937 [pdf, ps, other]: Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception

Authors: Anne Sielemann, Valentin Barner, Stefan Wolf, Masoud Roschani, Jens Ziehn, Juergen Beyerer

Comments: 8 pages, 2 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395] arXiv:2512.05936 [pdf, ps, other]: Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition

Authors: Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens Ziehn

Comments: 8 pages, 8 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396] arXiv:2512.05928 [pdf, ps, other]: Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition

Authors: Pedro Vidal, Bernardo Biesseck, Luiz E. L. Coelho, Roger Granada, David Menotti

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397] arXiv:2512.05927 [pdf, ps, other]: Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Authors: Zhiting Mei, Tenny Yin, Micah Baker, Ola Shorinwa, Anirudha Majumdar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398] arXiv:2512.05922 [pdf, ps, other]: Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation

Authors: Khang Le, Anh Mai Vu, Thi Kim Trang Vo, Ha Thach, Ngoc Bui Lam Quang, Thanh-Huy Nguyen, Minh H. N. Le, Zhu Han, Chandra Mohan, Hien Van Nguyen

Comments: Note: Khang Le and Anh Mai Vu contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399] arXiv:2512.05920 [pdf, ps, other]: Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction

Authors: Jiawen Yang, Yihui Cao, Xuanyu Tian, Yuyao Zhang, Hongjiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400] arXiv:2512.05905 [pdf, ps, other]: Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Authors: Wenhao Yan, Sheng Ye, Zhuoyi Yang, Jiayan Teng, ZhenHui Dong, Kairui Wen, Xiaotao Gu, Yong-Jin Liu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401] arXiv:2512.05866 [pdf, ps, other]: Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator

Authors: Md. Mahbub Hasan Akash, Aria Tasnim Mridula, Sheekar Banerjee, Ishtiak Al Mamoon

Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402] arXiv:2512.05859 [pdf, ps, other]: Title: Edit-aware RAW Reconstruction

Authors: Abhijith Punnappurath, Luxi Zhao, Ke Zhao, Hue Nguyen, Radek Grzeszczuk, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2512.05853 [pdf, ps, other]: Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack

Authors: Shiji Zhao, Shukun Xiong, Yao Huang, Yan Jin, Zhenyu Wu, Jiyang Guan, Ranjie Duan, Jialing Tao, Hui Xue, Xingxing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404] arXiv:2512.05830 [pdf, ps, other]: Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning

Authors: Muhammet Cagri Yeke, Samil Sirin, Kivilcim Yuksel, Abdurrahman Gumus

Comments: 22 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405] arXiv:2512.05814 [pdf, ps, other]: Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection

Authors: Fubao Zhu, Zhanyuan Jia, Zhiguo Wang, Huan Huang, Danyang Sun, Chuang Han, Yanting Li, Jiaofen Nan, Chen Zhao, Weihua Zhou

Comments: The code is already available on GitHub: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2512.05809 [pdf, ps, other]: Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling

Authors: Saurav Jha, M. Jehanzeb Mirza, Wei Lin, Shiqi Yang, Sarath Chandar

Comments: Extended abstract at World Modeling Workshop 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407] arXiv:2512.05802 [pdf, ps, other]: Title: Bring Your Dreams to Life: Continual Text-to-Video Customization

Authors: Jiahua Dong, Xudong Wang, Wenqi Liang, Zongyan Han, Meng Cao, Duzhen Zhang, Hanbin Zhao, Zhi Han, Salman Khan, Fahad Shahbaz Khan

Comments: Accepted to AAAI2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408] arXiv:2512.05783 [pdf, ps, other]: Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth

Authors: Maryam Yousefi, Soodeh Bakhshandeh

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409] arXiv:2512.05774 [pdf, ps, other]: Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding

Authors: Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, Juan Carlos Niebles

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410] arXiv:2512.05762 [pdf, ps, other]: Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators

Authors: Ruochen Chen, Thuy Tran, Shaifali Parashar

Comments: Accepted for WACV

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411] arXiv:2512.05759 [pdf, ps, other]: Title: Label-Efficient Point Cloud Segmentation with Active Learning

Authors: Johannes Meyer, Jasper Hoffmann, Felix Schulz, Dominik Merkle, Daniel Buescher, Alexander Reiterer, Joschka Boedecker, Wolfram Burgard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412] arXiv:2512.05754 [pdf, ps, other]: Title: USV: Unified Sparsification for Accelerating Video Diffusion Models

Authors: Xinjian Wu, Hongmei Wang, Yuan Zhou, Qinglin Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413] arXiv:2512.05746 [pdf, ps, other]: Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models

Authors: Shizhuo Mao, Hongtao Zou, Qihu Xie, Song Chen, Yi Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414] arXiv:2512.05740 [pdf, ps, other]: Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision

Authors: Lennart Maack, Julia-Kristin Graß, Lisa-Marie Toscha, Nathaniel Melling, Alexander Schlaefer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415] arXiv:2512.05710 [pdf, ps, other]: Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning

Authors: Jianan Sun, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416] arXiv:2512.05698 [pdf, ps, other]: Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning

Authors: Xusheng Guo, Wanfa Zhang, Shijia Zhao, Qiming Xia, Xiaolong Xie, Mingming Wang, Hai Wu, Chenglu Wen

Comments: The 40th Annual AAAI Conference on Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417] arXiv:2512.05683 [pdf, ps, other]: Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction

Authors: Yong En Kok, Bowen Deng, Alexander Bentley, Andrew J. Parkes, Michael G. Somekh, Amanda J. Wright, Michael P. Pound

Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418] arXiv:2512.05674 [pdf, ps, other]: Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization

Authors: Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2512.05672 [pdf, ps, other]: Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem

Authors: Yeobin Hong, Suhyeon Lee, Hyungjin Chung, Jong Chul Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420] arXiv:2512.05669 [pdf, ps, other]: Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features

Authors: Talha Enes Koksal, Abdurrahman Gumus

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421] arXiv:2512.05663 [pdf, ps, other]: Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection

Authors: Johannes Meier, Jonathan Michel, Oussema Dhaouadi, Yung-Hsu Yang, Christoph Reich, Zuria Bauer, Stefan Roth, Marc Pollefeys, Jacques Kaiser, Daniel Cremers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422] arXiv:2512.05651 [pdf, ps, other]: Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective

Authors: Nan Zhong, Mian Zou, Yiran Xu, Zhenxing Qian, Xinpeng Zhang, Baoyuan Wu, Kede Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2512.05635 [pdf, ps, other]: Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data

Authors: Georgy Perevozchikov, Nancy Mehta, Egor Ershov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2512.05613 [pdf, ps, other]: Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model

Authors: Pasquale De Marinis, Pieter M. Blok, Uzay Kaymak, Rogier Brussee, Gennaro Vessio, Giovanna Castellano

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2512.05610 [pdf, ps, other]: Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections

Authors: Juho Korkeala, Jesse Muhojoki, Josef Taher, Klaara Salolahti, Matti Hyyppä, Antero Kukko, Juha Hyyppä

Comments: 19 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426] arXiv:2512.05597 [pdf, ps, other]: Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction

Authors: Ruihong Yin, Xuepeng Shi, Oleksandr Bailo, Marco Manfredi, Theo Gevers

Comments: 10 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2512.05593 [pdf, ps, other]: Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer

Authors: Rong Wang, Wei Mao, Changsheng Lu, Hongdong Li

Comments: Accepted to 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428] arXiv:2512.05571 [pdf, ps, other]: Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging

Authors: Xingyu Zhang, Anna Reithmeir, Fryderyk Kögl, Rickmer Braren, Julia A. Schnabel, Daniel M. Lang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429] arXiv:2512.05564 [pdf, ps, other]: Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Authors: Zijun Wang, Panwen Hu, Jing Wang, Terry Jingchen Zhang, Yuhao Cheng, Long Chen, Yiqiang Yan, Zutao Jiang, Hanhui Li, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430] arXiv:2512.05557 [pdf, ps, other]: Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency

Authors: Xingxi Yin, Yicheng Li, Gong Yan, Chenglin Li, Jian Zhao, Cong Huang, Yue Deng, Yin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431] arXiv:2512.05546 [pdf, ps, other]: Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models

Authors: Weijue Bu, Guan Yuan, Guixian Zhang

Comments: 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432] arXiv:2512.05539 [pdf, ps, other]: Title: Ideal Observer for Segmentation of Dead Leaves Images

Authors: Swantje Mahncke, Malte Ott

Comments: 41 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433] arXiv:2512.05529 [pdf, ps, other]: Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors

Authors: Kunyi Yang, Qingyu Wang, Cheng Yuan, Yutong Ban

Comments: The first two authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434] arXiv:2512.05524 [pdf, ps, other]: Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435] arXiv:2512.05515 [pdf, ps, other]: Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis

Authors: Yuhua Wen, Qifei Li, Yingying Zhou, Yingming Gao, Zhengqi Wen, Jianhua Tao, Ya Li

Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436] arXiv:2512.05513 [pdf, ps, other]: Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437] arXiv:2512.05511 [pdf, ps, other]: Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm

Authors: Chuang Yu, Jinmiao Zhao, Yunpeng Liu, Yaokun Li, Xiujun Shu, Yuanhao Feng, Bo Wang, Yimian Dai, Xiangyu Yue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438] arXiv:2512.05494 [pdf, ps, other]: Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation

Authors: Fan Zhang, Zhiwei Gu, Hua Wang

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439] arXiv:2512.05492 [pdf, ps, other]: Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field

Authors: Qi Zhu, Jingyi Zhang, Naishan Zheng, Wei Yu, Jinghao Zhang, Deyi Ji, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440] arXiv:2512.05482 [pdf, ps, other]: Title: Concept-based Explainable Data Mining with VLM for 3D Detection

Authors: Mai Tsujimoto

Comments: 28 pages including appendix. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441] arXiv:2512.05481 [pdf, ps, other]: Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion

Authors: Jialin Li, Yiwei Ren, Kai Pan, Dong Wei, Pujin Cheng, Xian Wu, Xiaoying Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442] arXiv:2512.05478 [pdf, ps, other]: Title: EmoStyle: Emotion-Driven Image Stylization

Authors: Jingyuan Yang, Zihuan Bai, Hui Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443] arXiv:2512.05468 [pdf, ps, other]: Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system

Authors: Takara Taniguchi, Yudai Ueda, Atsuya Muramatsu, Kohki Hashimoto, Ryo Yagi, Hideya Ochiai, Chaodit Aswakul

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444] arXiv:2512.05446 [pdf, ps, other]: Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression

Authors: Cheng-Yuan Ho, He-Bi Yang, Jui-Chiu Chiang, Yu-Lun Liu, Wen-Hsiao Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445] arXiv:2512.05422 [pdf, ps, other]: Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction

Authors: Jiangtong Tan, Lin Liu, Jie Huanng, Xiaopeng Zhang, Qi Tian, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446] arXiv:2512.05418 [pdf, ps, other]: Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447] arXiv:2512.05415 [pdf, ps, other]: Title: Moving object detection from multi-depth images with an attention-enhanced CNN

Authors: Masato Shibukawa, Fumi Yoshida, Toshifumi Yanagisawa, Takashi Ito, Hirohisa Kurosaki, Makoto Yoshikawa, Kohki Kamiya, Ji-an Jiang, Wesley Fraser, JJ Kavelaars, Susan Benecchi, Anne Verbiscer, Akira Hatakeyama, Hosei O, Naoya Ozaki

Comments: 14 pages, 22 figures, submitted to PASJ

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448] arXiv:2512.05412 [pdf, ps, other]: Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449] arXiv:2512.05410 [pdf, ps, other]: Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450] arXiv:2512.05398 [pdf, ps, other]: Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos

Authors: Zhuoyuan Wu, Xurui Yang, Jiahui Huang, Yue Wang, Jun Gao

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451] arXiv:2512.05394 [pdf, ps, other]: Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability

Authors: Shizhan Liu, Xinran Deng, Zhuoyi Yang, Jiayan Teng, Xiaotao Gu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452] arXiv:2512.05391 [pdf, ps, other]: Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models

Authors: Qingqiao Hu, Weimin Lyu, Meilong Xu, Kehan Qi, Xiaoling Hu, Saumya Gupta, Jiawei Zhou, Chao Chen

Comments: 20 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453] arXiv:2512.05385 [pdf, ps, other]: Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration

Authors: Yingjie Xia, Tao Liu, Jinglei Shi, Qingsong Xie, Heng Guo, Jian Yang, Xi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454] arXiv:2512.05362 [pdf, ps, other]: Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation

Authors: Sanchit Kaul, Joseph Luna, Shray Arora

Comments: All code related to this paper can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455] arXiv:2512.05359 [pdf, ps, other]: Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking

Authors: Zekai Shao, Yufan Hu, Jingyuan Liu, Bin Fan, Hongmin Liu

Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456] arXiv:2512.05354 [pdf, ps, other]: Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training

Authors: Yang Zheng, Hao Tan, Kai Zhang, Peng Wang, Leonidas Guibas, Gordon Wetzstein, Wang Yifan

Comments: project page this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457] arXiv:2512.05343 [pdf, ps, other]: Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

Authors: Elisabetta Fedele, Francis Engelmann, Ian Huang, Or Litany, Marc Pollefeys, Leonidas Guibas

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458] arXiv:2512.05277 [pdf, ps, other]: Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Authors: Kevin Cannons, Saeed Ranjbar Alvar, Mohammad Asiful Hossain, Ahmad Rezaei, Mohsen Gholami, Alireza Heidarikhazaei, Zhou Weimin, Yong Zhang, Mohammad Akbari

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459] arXiv:2512.05272 [pdf, ps, other]: Title: Inferring Compositional 4D Scenes without Ever Seeing One

Authors: Ahmet Berke Gokmen, Ajad Chhatkuli, Luc Van Gool, Danda Pani Paudel

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460] arXiv:2512.05268 [pdf, ps, other]: Title: CARD: Correlation Aware Restoration with Diffusion

Authors: Niki Nezakati, Arnab Ghosh, Amit Roy-Chowdhury, Vishwanath Saragadam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461] arXiv:2512.05259 [pdf, ps, other]: Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization

Authors: Georgios Chatzichristodoulou, Niki Efthymiou, Panagiotis Filntisis, Georgios Pavlakos, Petros Maragos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462] arXiv:2512.05240 [pdf, ps, other]: Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction

Authors: Dmitrii Torbunov, Onur Okuducu, Yi Huang, Odera Dim, Rebecca Coles, Yonggang Cui, Yihui Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463] arXiv:2512.05209 [pdf, ps, other]: Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering

Authors: Vsevolod Plohotnuk, Artyom Panshin, Nikola Banić, Simone Bianco, Michael Freeman, Egor Ershov

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464] arXiv:2512.05198 [pdf, ps, other]: Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models

Authors: Rowan Bradbury, Dazhi Zhong

Comments: 16 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465] arXiv:2512.05172 [pdf, ps, other]: Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning

Authors: Wentao Wang, Chunyang Liu, Kehua Sheng, Bo Zhang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466] arXiv:2512.05152 [pdf, ps, other]: Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models

Authors: Kun Wang, Donglin Di, Tonghua Su, Lei Fan

Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467] arXiv:2512.05150 [pdf, ps, other]: Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Authors: Zhenglin Cheng, Peng Sun, Jianguo Li, Tao Lin

Comments: arxiv v0

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468] arXiv:2512.05145 [pdf, ps, other]: Title: Self-Improving VLM Judges Without Human Annotations

Authors: Inna Wanyin Lin, Yushi Hu, Shuyue Stella Li, Scott Geng, Pang Wei Koh, Luke Zettlemoyer, Tim Althoff, Marjan Ghazvininejad

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469] arXiv:2512.05140 [pdf, other]: Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation

Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)

Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470] arXiv:2512.05139 [pdf, ps, other]: Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models

Authors: Yang Xiang, Jingwen Zhong, Yige Yan, Petros Koutrakis, Eric Garshick, Meredith Franklin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471] arXiv:2512.05137 [pdf, ps, other]: Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images

Authors: Yunfei Zhang, Yizhuo He, Yuanxun Shao, Zhengtao Yao, Haoyan Xu, Junhao Dong, Zhen Yao, Zhikang Dong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472] arXiv:2512.05136 [pdf, ps, other]: Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes

Authors: Yujie Xiao, Gongzhen Tang, Deyun Zhang, Jun Li, Guangkun Nie, Haoyu Wang, Shun Huang, Tong Liu, Qinghao Zhao, Kangyin Chen, Shenda Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473] arXiv:2512.05134 [pdf, ps, other]: Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models

Authors: Zihao Wu

Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474] arXiv:2512.05132 [pdf, ps, other]: Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training

Authors: Wenshuo Wang, Fan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475] arXiv:2512.05131 [pdf, ps, other]: Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance

Authors: Tianling Xu, Shengzhe Gan, Leslie Gu, Yuelei Li, Fangneng Zhan, Hanspeter Pfister

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476] arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]: Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Authors: David Anugraha, Patrick Amadeus Irawan, Anshul Singh, En-Shiun Annie Lee, Genta Indra Winata

Comments: Preprint

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477] arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]: Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models

Authors: Haowen Liu, Shaoxiong Yao, Haonan Chen, Jiawei Gao, Jiayuan Mao, Jia-Bin Huang, Yilun Du

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478] arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]: Title: Physically-Based Simulation of Automotive LiDAR

Authors: L. Dudzik, M. Roschani, A. Sielemann, K. Trampert, J. Ziehn, J. Beyerer, C. Neumann

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479] arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]: Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma

Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)

Comments: 4 pages, 2 figures

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480] arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]: Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation

Authors: Fabian Konstantinidis, Moritz Sackmann, Ulrich Hofmann, Christoph Stiller

Comments: This work has been submitted to the IEEE for possible publication

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481] arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]: Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling

Authors: Shuai Dong, Siyuan Wang, Xingyu Liu, Zhongyu Wei

Comments: 11 pages, 6 figures. Code available at this https URL

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482] arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]: Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality

Authors: Benoit Marteau, Shaun Q. Y. Tan, Jieru Li, Andrew Hornback, Yishan Zhong, Shaunna Wang, Christian Lowson, Jason Woloff, Joshua M. Pahys, Steven W. Hwang, Coleman Hilton, May D. Wang

Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483] arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]: Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety

Authors: Ahmad Yehia, Jiseop Byeon, Tianyi Wang, Huihai Wang, Yiming Xu, Junfeng Jiao, Christian Claudel

Comments: 8 pages, 3 figures, 1 table

Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484] arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]: Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model

Authors: Kaidi Wang, Yi He, Wenhao Guan, Weijie Wu, Hongwu Ding, Xiong Zhang, Di Wu, Meng Meng, Jian Luan, Lin Li, Qingyang Hong

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025

[485] arXiv:2512.05115 [pdf, ps, other]: Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Authors: Tianqi Liu, Zhaoxi Chen, Zihao Huang, Shaocong Xu, Saining Zhang, Chongjie Ye, Bohan Li, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei Liu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486] arXiv:2512.05113 [pdf, ps, other]: Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting

Authors: Hao-Jen Chien, Yi-Chuan Huang, Chung-Ho Wu, Wei-Lun Chao, Yu-Lun Liu

Comments: WACV 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487] arXiv:2512.05112 [pdf, ps, other]: Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

Authors: Dongzhi Jiang, Renrui Zhang, Haodong Li, Zhuofan Zong, Ziyu Guo, Jun He, Claire Guo, Junyan Ye, Rongyao Fang, Weijia Li, Rui Liu, Hongsheng Li

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488] arXiv:2512.05111 [pdf, ps, other]: Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Authors: Shengyuan Ding, Xinyu Fang, Ziyu Liu, Yuhang Zang, Yuhang Cao, Xiangyu Zhao, Haodong Duan, Xiaoyi Dong, Jianze Liang, Bin Wang, Conghui He, Dahua Lin, Jiaqi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489] arXiv:2512.05110 [pdf, ps, other]: Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art

Authors: Rundong Luo, Noah Snavely, Wei-Chiu Ma

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490] arXiv:2512.05106 [pdf, ps, other]: Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Authors: Yu Zeng, Charles Ochoa, Mingyuan Zhou, Vishal M. Patel, Vitor Guizilini, Rowan McAllister

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491] arXiv:2512.05104 [pdf, ps, other]: Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation

Authors: Jiaqi Ma, Shengkai Hu, Jun Wan, Jiaxing Huang, Lefei Zhang, Salman Khan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492] arXiv:2512.05098 [pdf, ps, other]: Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards

Authors: Yuan Gao, Jin Song

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493] arXiv:2512.05091 [pdf, ps, other]: Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark

Authors: Haobo Yuan, Yueyi Sun, Yanwei Li, Tao Zhang, Xueqing Deng, Henghui Ding, Lu Qi, Anran Wang, Xiangtai Li, Ming-Hsuan Yang

Comments: Technical Report; Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494] arXiv:2512.05081 [pdf, ps, other]: Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression

Authors: Jung Yi, Wooseok Jang, Paul Hyunbin Cho, Jisu Nam, Heeji Yoon, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495] arXiv:2512.05079 [pdf, ps, other]: Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints

Authors: Minghan Zhu, Zhiyi Wang, Qihang Sun, Maani Ghaffari, Michael Posa

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496] arXiv:2512.05076 [pdf, ps, other]: Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation

Authors: Yiming Wang, Qihang Zhang, Shengqu Cai, Tong Wu, Jan Ackermann, Zhengfei Kuang, Yang Zheng, Frano Rajič, Siyu Tang, Gordon Wetzstein

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497] arXiv:2512.05060 [pdf, ps, other]: Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer

Authors: Xianfeng Wu, Yajing Bai, Minghan Li, Xianzu Wu, Xueqi Zhao, Zhongyuan Lai, Wenyu Liu, Xinggang Wang

Comments: Code: this https URL, Webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498] arXiv:2512.05044 [pdf, ps, other]: Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Authors: Yanran Zhang, Ziyi Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

Comments: 18 Pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499] arXiv:2512.05039 [pdf, ps, other]: Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding

Authors: Abhigyan Bhattacharya, Hiranmoy Roy

Comments: Submitted for review CVPR-2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500] arXiv:2512.05025 [pdf, ps, other]: Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation

Authors: Nicolas Houdré, Diego Marcos, Hugo Riffaud de Turckheim, Dino Ienco, Laurent Wendling, Camille Kurtz, Sylvain Lobry

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501] arXiv:2512.05021 [pdf, ps, other]: Title: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition

Authors: Pham Thach Thanh Truc, Dang Hoai Nam, Huynh Tong Dang Khoa, Vo Nguyen Le Duy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502] arXiv:2512.05016 [pdf, ps, other]: Title: Generative Neural Video Compression via Video Diffusion Prior

Authors: Qi Mao, Hao Cheng, Tinghan Yang, Libiao Jin, Siwei Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503] arXiv:2512.05006 [pdf, ps, other]: Title: Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects

Authors: Xianghui Fan, Zhaoyu Chen, Mengyang Pan, Anping Deng, Hang Yang

Comments: conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504] arXiv:2512.05000 [pdf, ps, other]: Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers

Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505] arXiv:2512.04996 [pdf, ps, other]: Title: A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs

Authors: Qiong Chang, Weimin Wang, Junpei Zhong, Jun Miyazaki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506] arXiv:2512.04981 [pdf, ps, other]: Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

Authors: NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507] arXiv:2512.04970 [pdf, ps, other]: Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks

Authors: Leonid Pogorelyuk, Niels Bracher, Aaron Verkleeren, Lars Kühmichel, Stefan T. Radev

Comments: UniReps Workshop 2025, 12 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508] arXiv:2512.04969 [pdf, ps, other]: Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection

Authors: NaHyeon Park, Kunhee Kim, Junsuk Choe, Hyunjung Shim

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[509] arXiv:2512.04967 [pdf, ps, other]: Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis

Authors: Jasmaine Khale, Ravi Prakash Srivastava

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[510] arXiv:2512.04963 [pdf, ps, other]: Title: GeoPE: A Unified Geometric Positional Embedding for Structured Tensors

Authors: Yupu Yao, Bowen Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[511] arXiv:2512.04952 [pdf, ps, other]: Title: FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization

Authors: Yicheng Liu, Shiduo Zhang, Zibin Dong, Baijun Ye, Tianyuan Yuan, Xiaopeng Yu, Linqi Yin, Chenhao Lu, Junhao Shi, Luca Jiang-Tao Yu, Liangtao Zheng, Tao Jiang, Jingjing Gong, Xipeng Qiu, Hang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[512] arXiv:2512.04943 [pdf, ps, other]: Title: Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition

Authors: Novanto Yudistira

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513] arXiv:2512.04939 [pdf, ps, other]: Title: LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging

Authors: Zhijian Shu, Cheng Lin, Tao Xie, Wei Yin, Ben Li, Zhiyuan Pu, Weize Li, Yao Yao, Xun Cao, Xiaoyang Guo, Xiao-Xiao Long

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514] arXiv:2512.04927 [pdf, ps, other]: Title: Virtually Unrolling the Herculaneum Papyri by Diffeomorphic Spiral Fitting

Authors: Paul Henderson

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515] arXiv:2512.04926 [pdf, ps, other]: Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Authors: Yueming Pan, Ruoyu Feng, Qi Dai, Yuqi Wang, Wenfeng Lin, Mingyu Guo, Chong Luo, Nanning Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516] arXiv:2512.04904 [pdf, ps, other]: Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching

Authors: Guanbo Huang, Jingjia Mao, Fanding Huang, Fengkai Liu, Xiangyang Luo, Yaoyuan Liang, Jiasheng Lu, Xiaoe Wang, Pei Liu, Ruiliu Fu, Shao-Lun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517] arXiv:2512.04890 [pdf, ps, other]: Title: Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI

Authors: Ramya Muthukrishnan, Borjan Gagoski, Aryn Lee, P. Ellen Grant, Elfar Adalsteinsson, Polina Golland, Benjamin Billot

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518] arXiv:2512.04888 [pdf, ps, other]: Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework

Authors: Priyanto Hidayatullah, Nurjannah Syakrani, Yudi Widhiyasana, Muhammad Rizqi Sholahuddin, Refdinal Tubagus, Zahri Al Adzani Hidayat, Hanri Fajar Ramadhan, Dafa Alfarizki Pratama, Farhan Muhammad Yasin

Comments: This manuscript was first submitted to the Engineering (Elsevier Journal). The preprint version was posted to arXiv afterwards to facilitate open access and community feedback

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519] arXiv:2512.04883 [pdf, ps, other]: Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms

Authors: Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520] arXiv:2512.04875 [pdf, ps, other]: Title: SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection

Authors: Qing Xu, Yanqian Wang, Xiangjian Hea, Yue Li, Yixuan Zhang, Rong Qu, Wenting Duan, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521] arXiv:2512.04862 [pdf, ps, other]: Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Authors: Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini, Jan Ulrich Bartels, Katherine J. Kuchenbecker, Michael J. Black

Comments: * Equal contribution. Minor figure corrections compared to the ICCV 2025 version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522] arXiv:2512.04857 [pdf, ps, other]: Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens

Authors: Ziran Qin, Youru Lv, Mingbao Lin, Zeren Zhang, Chanfan Gan, Tieyuan Chen, Weiyao Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523] arXiv:2512.04837 [pdf, ps, other]: Title: A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World

Authors: Jikang Cheng, Renye Yan, Zhiyuan Yan, Yaozhong Gan, Xueyi Zhang, Zhongyuan Wang, Wei Peng, Ling Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524] arXiv:2512.04832 [pdf, ps, other]: Title: Tokenizing Buildings: A Transformer for Layout Synthesis

Authors: Manuel Ladron de Guevara, Jinmo Rhee, Ardavan Bidgoli, Vaidas Razgaitis, Michael Bergin

Comments: 8 pages, 1 page References, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[525] arXiv:2512.04830 [pdf, ps, other]: Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis

Authors: Shijie Chen, Peixi Peng

Comments: Novel View Synthesis, Driving Scene, Free Trajectory, Image Generation

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526] arXiv:2512.04821 [pdf, ps, other]: Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation

Authors: Huynh Trinh Ngoc, Hoang Anh Nguyen Kim, Toan Nguyen Hai, Long Tran Quoc

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527] arXiv:2512.04815 [pdf, ps, other]: Title: RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS

Authors: Chuanyu Fu, Guanying Chen, Yuqi Zhang, Kunbin Yao, Yuan Xiong, Chuan Huang, Shuguang Cui, Yasuyuki Matsushita, Xiaochun Cao

Comments: arXiv admin note: substantial text overlap with arXiv:2506.02751

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528] arXiv:2512.04810 [pdf, ps, other]: Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Authors: Xin He, Longhui Wei, Jianbo Ouyang, Minghui Liao, Lingxi Xie, Qi Tian

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529] arXiv:2512.04786 [pdf, ps, other]: Title: LaFiTe: A Generative Latent Field for 3D Native Texturing

Authors: Chia-Hao Chen, Zi-Xin Zou, Yan-Pei Cao, Ze Yuan, Guan Luo, Xiaojuan Qi, Ding Liang, Song-Hai Zhang, Yuan-Chen Guo

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530] arXiv:2512.04784 [pdf, ps, other]: Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Authors: Bowen Ping, Chengyou Jia, Minnan Luo, Changliang Xia, Xin Shen, Zhuohang Dang, Hangwei Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531] arXiv:2512.04761 [pdf, ps, other]: Title: Order Matters: 3D Shape Generation from Sequential VR Sketches

Authors: Yizi Chen, Sidi Wu, Tianyi Xiao, Nina Wiedemann, Loic Landrieu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532] arXiv:2512.04734 [pdf, ps, other]: Title: MT-Depth: Multi-task Instance feature analysis for the Depth Completion

Authors: Abdul Haseeb Nizamani, Dandi Zhou, Xinhai Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533] arXiv:2512.04733 [pdf, ps, other]: Title: E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving

Authors: Yihong Tang, Haicheng Liao, Tong Nie, Junlin He, Ao Qu, Kehua Chen, Wei Ma, Zhenning Li, Lijun Sun, Chengzhong Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[534] arXiv:2512.04728 [pdf, ps, other]: Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild

Authors: Yigui Feng, Qinglin Wang, Haotian Mo, Yang Liu, Ke Liu, Gencheng Liu, Xinhai Chen, Siqi Shen, Songzhu Mei, Jie Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[535] arXiv:2512.04699 [pdf, ps, other]: Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution

Authors: Xinning Chai, Zhengxue Cheng, Yuhong Zhang, Hengsheng Zhang, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song

Comments: Accepted by TCSVT, 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536] arXiv:2512.04686 [pdf, ps, other]: Title: Towards Cross-View Point Correspondence in Vision-Language Models

Authors: Yipu Wang, Yuheng Ji, Yuyang Liu, Enshen Zhou, Ziqiang Yang, Yuxuan Tian, Ziheng Qin, Yue Liu, Huajie Tan, Cheng Chi, Zhiyuan Ma, Daniel Dajun Zeng, Xiaolong Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537] arXiv:2512.04678 [pdf, ps, other]: Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Authors: Yunhong Lu, Yanhong Zeng, Haobo Li, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jiapeng Zhu, Hengyuan Cao, Zhipeng Zhang, Xing Zhu, Yujun Shen, Min Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.04677 [pdf, ps, other]: Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Authors: Yubo Huang, Hailong Guo, Fangtai Wu, Shifeng Zhang, Shijie Huang, Qijun Gan, Lin Liu, Sirui Zhao, Enhong Chen, Jiaming Liu, Steven Hoi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539] arXiv:2512.04660 [pdf, ps, other]: Title: I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models

Authors: Juntong Wang, Jiarui Wang, Huiyu Duan, Jiaxiang Kang, Guangtao Zhai, Xiongkuo Min

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.04643 [pdf, ps, other]: Title: SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding

Authors: Chang-Hsun Wu, Kai-Po Chang, Yu-Yang Sheng, Hung-Kai Chung, Kuei-Chun Wang, Yu-Chiang Frank Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541] arXiv:2512.04619 [pdf, ps, other]: Title: Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence

Authors: Tianyu Yuan, Yuanbo Yang, Lin-Zhuo Chen, Yao Yao, Zhuzhong Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542] arXiv:2512.04599 [pdf, ps, other]: Title: Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot

Authors: Sheng Hang, Chaoxiang He, Hongsheng Hu, Hanqing Hu, Bin Benjamin Zhu, Shi-Feng Sun, Dawu Gu, Shuo Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543] arXiv:2512.04597 [pdf, ps, other]: Title: When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering

Authors: Tao Wu, Chuhao Zhou, Guangyu Zhao, Haozhi Cao, Yewen Pu, Jianfei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[544] arXiv:2512.04585 [pdf, ps, other]: Title: SAM3-I: Segment Anything with Instructions

Authors: Jingjing Li, Yue Feng, Yuchen Guo, Jincai Huang, Yongri Piao, Qi Bi, Miao Zhang, Xiaoqi Zhao, Qiang Chen, Shihao Zou, Wei Ji, Huchuan Lu, Li Cheng

Comments: Preliminary results; work in progress

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.04581 [pdf, ps, other]: Title: Infrared UAV Target Tracking with Dynamic Feature Refinement and Global Contextual Attention Knowledge Distillation

Authors: Houzhang Fang, Chenxing Wu, Kun Bai, Tianqi Chen, Xiaolin Wang, Xiyang Liu, Yi Chang, Luxin Yan

Comments: Accepted by IEEE TMM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.04576 [pdf, ps, other]: Title: TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification

Authors: Zishuo Wan, Qinqin Kang, Yi Huang, Yun Bian, Dawei Ding, Ke Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547] arXiv:2512.04568 [pdf, ps, other]: Title: Prompt2Craft: Generating Functional Craft Assemblies with LLMs

Authors: Vitor Hideyo Isume, Takuya Kiyokawa, Natsuki Yamanobe, Yukiyasu Domae, Weiwei Wan, Kensuke Harada

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.04564 [pdf, ps, other]: Title: Dataset creation for supervised deep learning-based analysis of microscopic images -- review of important considerations and recommendations

Authors: Christof A. Bertram, Viktoria Weiss, Jonas Ammeling, F. Maria Schabel, Taryn A. Donovan, Frauke Wilm, Christian Marzahl, Katharina Breininger, Marc Aubreville

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.04563 [pdf, ps, other]: Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Authors: Zefeng Zhang, Xiangzhao Hao, Hengzhu Tang, Zhenyu Zhang, Jiawei Sheng, Xiaodong Li, Zhenyang Li, Li Gao, Daiting Shi, Dawei Yin, Tingwen Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.04554 [pdf, ps, other]: Title: Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering

Authors: Marco Pintore, Maura Pintor, Dimosthenis Karatzas, Battista Biggio

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551] arXiv:2512.04542 [pdf, ps, other]: Title: Gaussian Entropy Fields: Driving Adaptive Sparsity in 3D Gaussian Optimization

Authors: Hong Kuang, Jianchen Liu

Comments: 28 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[552] arXiv:2512.04540 [pdf, ps, other]: Title: VideoMem: Enhancing Ultra-Long Video Understanding via Adaptive Memory Management

Authors: Hongbo Jin, Qingyuan Wang, Wenhao Zhang, Yang Liu, Sijie Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.04537 [pdf, ps, other]: Title: X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale

Authors: Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.04536 [pdf, ps, other]: Title: Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model

Authors: Bita Baroutian, Atefe Aghaei, Mohsen Ebrahimi Moghaddam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[555] arXiv:2512.04534 [pdf, ps, other]: Title: Refaçade: Editing Object with Given Reference Texture

Authors: Youze Huang (1), Penghui Ruan (2), Bojia Zi (3), Xianbiao Qi (4), Jianan Wang (5), Rong Xiao (4) ((1) University of Electronic Science and Technology of China, (2) The Hong Kong Polytechnic University, (3) The Chinese University of Hong Kong, (4) IntelliFusion Inc., (5) Astribot Inc.)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556] arXiv:2512.04532 [pdf, ps, other]: Title: PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement

Authors: Yu-Wei Zhan, Xin Wang, Hong Chen, Tongtong Feng, Wei Feng, Ren Wang, Guangyao Li, Qing Li, Wenwu Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[557] arXiv:2512.04528 [pdf, ps, other]: Title: Auto3R: Automated 3D Reconstruction and Scanning via Data-driven Uncertainty Quantification

Authors: Chentao Shen, Sizhe Zheng, Bingqian Wu, Yaohua Feng, Yuanchen Fei, Mingyu Mei, Hanwen Jiang, Xiangru Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[558] arXiv:2512.04522 [pdf, ps, other]: Title: Identity Clue Refinement and Enhancement for Visible-Infrared Person Re-Identification

Authors: Guoqing Zhang, Zhun Wang, Hairui Wang, Zhonglin Ye, Yuhui Zheng

Comments: 14 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[559] arXiv:2512.04521 [pdf, ps, other]: Title: WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism

Authors: Ruijing Liu, Cunhua Pan, Jiaming Zeng, Hong Ren, Kezhi Wang, Lei Kong, Jiangzhou Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[560] arXiv:2512.04520 [pdf, ps, other]: Title: Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation

Authors: Chenlin Xu, Lei Zhang, Lituan Wang, Xinyu Pu, Pengfei Ma, Guangwu Qian, Zizhou Wang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561] arXiv:2512.04519 [pdf, ps, other]: Title: VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory

Authors: Yifei Yu, Xiaoshan Wu, Xinting Hu, Tao Hu, Yangtian Sun, Xiaoyang Lyu, Bo Wang, Lin Ma, Yuewen Ma, Zhongrui Wang, Xiaojuan Qi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.04515 [pdf, ps, other]: Title: EgoLCD: Egocentric Video Generation with Long Context Diffusion

Authors: Liuzhou Zhang, Jiarui Ye, Yuanlei Wang, Ming Zhong, Mingju Cao, Wanke Xia, Bowen Zeng, Zeyu Zhang, Hao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563] arXiv:2512.04511 [pdf, ps, other]: Title: DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance

Authors: Yinghui Xing, Xiaoting Su, Shizhou Zhang, Donghao Chu, Di Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[564] arXiv:2512.04504 [pdf, ps, other]: Title: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers

Authors: Min Zhao, Bokai Yan, Xue Yang, Hongzhou Zhu, Jintao Zhang, Shilong Liu, Chongxuan Li, Jun Zhu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565] arXiv:2512.04499 [pdf, ps, other]: Title: Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

Authors: Yuduo Jin, Brandon Haworth

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[566] arXiv:2512.04496 [pdf, ps, other]: Title: Shift-Window Meets Dual Attention: A Multi-Model Architecture for Specular Highlight Removal

Authors: Tianci Huo, Lingfeng Qi, Yuhan Chen, Qihong Xue, Jinyuan Shao, Hai Yu, Jie Li, Zhanhua Zhang, Guofa Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.04487 [pdf, ps, other]: Title: Controllable Long-term Motion Generation with Extended Joint Targets

Authors: Eunjong Lee, Eunhee Kim, Sanghoon Hong, Eunho Jung, Jihoon Kim

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568] arXiv:2512.04485 [pdf, ps, other]: Title: Not All Birds Look The Same: Identity-Preserving Generation For Birds

Authors: Aaron Sun, Oindrila Saha, Subhransu Maji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.04483 [pdf, ps, other]: Title: DeRA: Decoupled Representation Alignment for Video Tokenization

Authors: Pengbo Guo, Junke Wang, Zhen Xing, Chengxu Liu, Daoguo Dong, Xueming Qian, Zuxuan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570] arXiv:2512.04461 [pdf, ps, other]: Title: UniTS: Unified Time Series Generative Model for Remote Sensing

Authors: Yuxiang Zhang, Shunlin Liang, Wenyuan Li, Han Ma, Jianglei Xu, Yichuan Ma, Jiangwei Xie, Wei Li, Mengmeng Zhang, Ran Tao, Xiang-Gen Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.04459 [pdf, ps, other]: Title: dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning

Authors: Yingzi Ma, Yulong Cao, Wenhao Ding, Shuibai Zhang, Yan Wang, Boris Ivanovic, Ming Jiang, Marco Pavone, Chaowei Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.04456 [pdf, ps, other]: Title: GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis

Authors: Changjin Kim, HyeokJun Lee, YoungJoon Yoo

Comments: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[573] arXiv:2512.04451 [pdf, ps, other]: Title: StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios

Authors: Yifei Wang, Zhenkai Li, Tianwen Qian, Huanran Zheng, Zheng Wang, Yuqian Fu, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574] arXiv:2512.04441 [pdf, ps, other]: Title: MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving

Authors: Bin Sun, Yaoguang Cao, Yan Wang, Rui Wang, Jiachen Shang, Xiejie Feng, Jiayi Lu, Jia Shi, Shichun Yang, Xiaoyu Yan, Ziying Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575] arXiv:2512.04426 [pdf, ps, other]: Title: Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation

Authors: Sidan Zhu, Hongteng Xu, Dixin Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576] arXiv:2512.04425 [pdf, ps, other]: Title: Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models

Authors: Manar Alnaasan, Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[577] arXiv:2512.04421 [pdf, ps, other]: Title: UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3D Scenes

Authors: Changhe Liu, Ehsan Javanmardi, Naren Bao, Alex Orsholits, Manabu Tsukada

Comments: 13 pages, 10 figures, submitted to CVPR2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[578] arXiv:2512.04413 [pdf, ps, other]: Title: Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Authors: Xiangyi Gao, Danpei Zhao, Bo Yuan, Wentao Li

Comments: 12 pages, 8 figures, 11 tables

Journal-ref: IEEE Transactions on Geoscience and Remote Sensing 63 (2025) 1-11

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579] arXiv:2512.04397 [pdf, ps, other]: Title: Performance Evaluation of Transfer Learning Based Medical Image Classification Techniques for Disease Detection

Authors: Zeeshan Ahmad, Shudi Bao, Meng Chen

Journal-ref: 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Copenhagen, Denmark, 2025, pp. 1-5

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580] arXiv:2512.04395 [pdf, ps, other]: Title: Fourier-Attentive Representation Learning: A Fourier-Guided Framework for Few-Shot Generalization in Vision-Language Models

Authors: Hieu Dinh Trung Pham, Huy Minh Nhat Nguyen, Cuong Tuan Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581] arXiv:2512.04390 [pdf, ps, other]: Title: FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring

Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

Comments: 20 pages, 15 figures. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582] arXiv:2512.04358 [pdf, ps, other]: Title: MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching

Authors: Ao Xu, Rujin Zhao, Xiong Xu, Boceng Huang, Yujia Jia, Hongfeng Long, Fuxuan Chen, Zilong Cao, Fangyuan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583] arXiv:2512.04356 [pdf, ps, other]: Title: Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment

Authors: Kai-Po Chang, Wei-Yuan Cheng, Chi-Pin Huang, Fu-En Yang, Yu-Chiang Frank Wang

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[584] arXiv:2512.04331 [pdf, ps, other]: Title: Open Set Face Forgery Detection via Dual-Level Evidence Collection

Authors: Zhongyi Cai, Bryce Gernon, Wentao Bao, Yifan Li, Matthew Wright, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585] arXiv:2512.04329 [pdf, ps, other]: Title: A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

Authors: Waleed Khalid, Dmitry Ignatov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[586] arXiv:2512.04323 [pdf, ps, other]: Title: Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks

Authors: Biao Chen, Zhenhua Lei, Yahui Zhang, Tongzhi Niu

Comments: 17 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
[587] arXiv:2512.04315 [pdf, ps, other]: Title: SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting

Authors: Yonghan Lee, Tsung-Wei Huang, Shiv Gehlot, Jaehoon Choi, Guan-Ming Su, Dinesh Manocha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588] arXiv:2512.04314 [pdf, ps, other]: Title: DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision

Authors: Jiashu Liao, Pietro Liò, Marc de Kamps, Duygu Sarikaya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589] arXiv:2512.04313 [pdf, ps, other]: Title: Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding

Authors: Haolin Xiong, Tianwen Fu, Pratusha Bhuvana Prasad, Yunxuan Cai, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, Yajie Zhao

Comments: 16 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590] arXiv:2512.04311 [pdf, ps, other]: Title: Real-time Cricket Sorting By Sex

Authors: Juan Manuel Cantarero Angulo, Matthew Smith

Comments: 13 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[591] arXiv:2512.04309 [pdf, ps, other]: Title: Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction

Authors: Rui Fonseca, Bruno Martins, Gil Rocha

Comments: Submitted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[592] arXiv:2512.04305 [pdf, ps, other]: Title: How (Mis)calibrated is Your Federated CLIP and What To Do About It?

Authors: Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Elisa Ricci, Subhankar Roy

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[593] arXiv:2512.04303 [pdf, ps, other]: Title: Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications

Authors: Gasser Elazab, Maximilian Jansen, Michael Unterreiner, Olaf Hellwich

Comments: Accepted in 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[594] arXiv:2512.04284 [pdf, ps, other]: Title: Learning Single-Image Super-Resolution in the JPEG Compressed Domain

Authors: Sruthi Srinivasan, Elham Shakibapour, Rajy Rawther, Mehdi Saeedi

Comments: 7 pages, 4 figures, 2 tables, SEEDS Workshop, ICIP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[595] arXiv:2512.04283 [pdf, ps, other]: Title: Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint

Authors: Fan Jia, Yuhao Huang, Shih-Hsin Wang, Cristina Garcia-Cardona, Andrea L. Bertozzi, Bao Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[596] arXiv:2512.04282 [pdf, ps, other]: Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer

Authors: Tasmiah Haque, Srinjoy Das

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[597] arXiv:2512.04267 [pdf, ps, other]: Title: UniLight: A Unified Representation for Lighting

Authors: Zitian Zhang, Iliyan Georgiev, Michael Fischer, Yannick Hold-Geoffroy, Jean-François Lalonde, Valentin Deschaintre

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598] arXiv:2512.04248 [pdf, ps, other]: Title: MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models

Authors: Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[599] arXiv:2512.04238 [pdf, ps, other]: Title: 6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models

Authors: Leon Mayer, Piotr Kalinowski, Caroline Ebersbach, Marcel Knopp, Tim Rädsch, Evangelia Christodoulou, Annika Reinke, Fiona R. Kolbinger, Lena Maier-Hein

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600] arXiv:2512.04222 [pdf, ps, other]: Title: ReasonX: MLLM-Guided Intrinsic Image Decomposition

Authors: Alara Dirik, Tuanfeng Wang, Duygu Ceylan, Stefanos Zafeiriou, Anna Frühstück

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[601] arXiv:2512.04221 [pdf, ps, other]: Title: MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis

Authors: Xiangyu Bai, He Liang, Bishoy Galoaa, Utsav Nandi, Shayda Moezzi, Yuhang He, Sarah Ostadabbas

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602] arXiv:2512.04219 [pdf, ps, other]: Title: Generalized Event Partonomy Inference with Structured Hierarchical Predictive Learning

Authors: Zhou Chen, Joe Lin, Sathyanarayanan N. Aakur

Comments: 16 pages, 7 figures, 3 tables. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[603] arXiv:2512.04187 [pdf, ps, other]: Title: OnSight Pathology: A real-time platform-agnostic computational pathology companion for histopathology

Authors: Jinzhen Hu, Kevin Faust, Parsa Babaei Zadeh, Adrienn Bourkas, Shane Eaton, Andrew Young, Anzar Alvi, Dimitrios George Oreopoulos, Ameesha Paliwal, Assem Saleh Alrumeh, Evelyn Rose Kamski-Hennekam, Phedias Diamandis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[604] arXiv:2512.04175 [pdf, ps, other]: Title: Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection

Authors: Alejandro Cobo, Roberto Valle, José Miguel Buenaposada, Luis Baumela

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[605] arXiv:2512.05117 (cross-list from cs.LG) [pdf, ps, other]: Title: The Universal Weight Subspace Hypothesis

Authors: Prakhar Kaushik, Shravan Chaudhari, Ankit Vaidya, Rama Chellappa, Alan Yuille

Comments: 37 pages

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606] arXiv:2512.05116 (cross-list from cs.LG) [pdf, ps, other]: Title: Value Gradient Guidance for Flow Matching Alignment

Authors: Zhen Liu, Tim Z. Xiao, Carles Domingo-Enrich, Weiyang Liu, Dinghuai Zhang

Comments: Accepted at NeurIPS 2025; 26 pages, 20 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[607] arXiv:2512.05114 (cross-list from cs.LG) [pdf, ps, other]: Title: Deep infant brain segmentation from multi-contrast MRI

Authors: Malte Hoffmann, Lilla Zöllei, Adrian V. Dalca

Comments: 8 pages, 8 figures, 1 table, website at this https URL, presented at the 2025 IEEE Asilomar Conference on Signals, Systems, and Computers

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[608] arXiv:2512.05103 (cross-list from cs.LG) [pdf, ps, other]: Title: TV2TV: A Unified Framework for Interleaved Language and Video Generation

Authors: Xiaochuang Han, Youssef Emad, Melissa Hall, John Nguyen, Karthik Padthe, Liam Robbins, Amir Bar, Delong Chen, Michal Drozdzal, Maha Elbayad, Yushi Hu, Shang-Wen Li, Sreya Dutta Roy, Jakob Verbeek, XuDong Wang, Marjan Ghazvininejad, Luke Zettlemoyer, Emily Dinan

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[609] arXiv:2512.05094 (cross-list from cs.RO) [pdf, ps, other]: Title: From Generated Human Videos to Physically Plausible Robot Trajectories

Authors: James Ni, Zekai Wang, Wei Lin, Amir Bar, Yann LeCun, Trevor Darrell, Jitendra Malik, Roei Herzig

Comments: For project website, see this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610] arXiv:2512.04814 (cross-list from cs.SD) [pdf, ps, other]: Title: Shared Multi-modal Embedding Space for Face-Voice Association

Authors: Christopher Simic, Korbinian Riedhammer, Tobias Bocklet

Comments: Ranked 1st in Fame 2026 Challenge, ICASSP

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[611] arXiv:2512.04763 (cross-list from cs.LG) [pdf, ps, other]: Title: MemLoRA: Distilling Expert Adapters for On-Device Memory Systems

Authors: Massimo Bini, Ondrej Bohdal, Umberto Michieli, Zeynep Akata, Mete Ozay, Taha Ceritli

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612] arXiv:2512.04705 (cross-list from cs.CC) [pdf, ps, other]: Title: Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators

Authors: Alaa Zniber, Arne Symons, Ouassim Karrakchou, Marian Verhelst, Mounir Ghogho

Comments: Submitted to IEEE Transactions on Emerging Topics in Computing

Subjects: Computational Complexity (cs.CC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
[613] arXiv:2512.04625 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective

Authors: Bowen Zheng, Ran Cheng

Comments: Accepted to IEEE TNNLS

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[614] arXiv:2512.04556 (cross-list from cs.GR) [pdf, ps, other]: Title: Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

Authors: Zhizhen Wu, Zhe Cao, Yuchi Huo

Comments: 10 pages, 7 figures

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[615] arXiv:2512.04464 (cross-list from cs.LG) [pdf, ps, other]: Title: Feature Engineering vs. Deep Learning for Automated Coin Grading: A Comparative Study on Saint-Gaudens Double Eagles

Authors: Tanmay Dogra, Eric Ngo, Mohammad Alam, Jean-Paul Talavera, Asim Dahal

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[616] arXiv:2512.04385 (cross-list from cs.LG) [pdf, ps, other]: Title: STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Grained Pollution Forecasting

Authors: Nan Zhou, Weijie Hong, Huandong Wang, Jianfeng Zheng, Qiuhua Wang, Yali Song, Xiao-Ping Zhang, Yong Li, Xinlei Chen

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617] arXiv:2512.04264 (cross-list from cs.LG) [pdf, ps, other]: Title: Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

Authors: Long Dang, Thushari Hapuarachchi, Kaiqi Xiong, Jing Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[618] arXiv:2512.04092 (cross-list from physics.soc-ph) [pdf, ps, other]: Title: The changing surface of the world's roads

Authors: Sukanya Randhawa, Guntaj Randhawa, Clemens Langer, Francis Andorful, Benjamin Herfort, Daniel Kwakye, Omer Olchik, Sven Lautenbach, Alexander Zipf

Subjects: Physics and Society (physics.soc-ph); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[619] arXiv:2512.04087 (cross-list from q-bio.NC) [pdf, ps, other]: Title: Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o

Authors: Sui He, Shenbin Qian

Subjects: Neurons and Cognition (q-bio.NC); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Thu, 4 Dec 2025 (showing first 12 of 130 entries)

[620] arXiv:2512.04085 [pdf, ps, other]: Title: Unique Lives, Shared World: Learning from Single-Life Videos

Authors: Tengda Han, Sayna Ebrahimi, Dilara Gokay, Li Yang Ku, Maks Ovsjanikov, Iva Babukova, Daniel Zoran, Viorica Patraucean, Joao Carreira, Andrew Zisserman, Dima Damen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621] arXiv:2512.04084 [pdf, ps, other]: Title: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows

Authors: Qinyu Zhao, Guangting Zheng, Tao Yang, Rui Zhu, Xingjian Leng, Stephen Gould, Liang Zheng

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622] arXiv:2512.04082 [pdf, ps, other]: Title: PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Authors: Jiazhe Wei, Ken Li, Tianyu Lao, Haofan Wang, Liang Wang, Caifeng Shan, Chenyang Si

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[623] arXiv:2512.04069 [pdf, ps, other]: Title: SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

Authors: Siyi Chen, Mikaela Angelina Uy, Chan Hee Song, Faisal Ladhak, Adithyavairavan Murali, Qing Qu, Stan Birchfield, Valts Blukis, Jonathan Tremblay

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[624] arXiv:2512.04048 [pdf, ps, other]: Title: Stable Signer: Hierarchical Sign Language Generative Model

Authors: Sen Fang, Yalin Feng, Hongbin Zhong, Yanxin Zhang, Dimitris N. Metaxas

Comments: 12 pages, 7 figures. More Demo at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY)
[625] arXiv:2512.04040 [pdf, ps, other]: Title: RELIC: Interactive Video World Model with Long-Horizon Memory

Authors: Yicong Hong, Yiqun Mei, Chongjian Ge, Yiran Xu, Yang Zhou, Sai Bi, Yannick Hold-Geoffroy, Mike Roberts, Matthew Fisher, Eli Shechtman, Kalyan Sunkavalli, Feng Liu, Zhengqi Li, Hao Tan

Comments: 22 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[626] arXiv:2512.04039 [pdf, ps, other]: Title: Fast & Efficient Normalizing Flows and Applications of Image Generative Models

Authors: Sandeep Nagar

Comments: PhD Thesis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[627] arXiv:2512.04025 [pdf, ps, other]: Title: PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation

Authors: Xiaolong Li, Youping Gu, Xi Lin, Weijie Wang, Bohan Zhuang

Comments: Tech report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[628] arXiv:2512.04021 [pdf, ps, other]: Title: C3G: Learning Compact 3D Representations with 2K Gaussians

Authors: Honggyu An, Jaewoo Jung, Mungyeom Kim, Sunghwan Hong, Chaehyun Kim, Kazumi Fukuda, Minkyeong Jeon, Jisang Han, Takuya Narihira, Hyuna Ko, Junsu Kim, Yuki Mitsufuji, Seungryong Kim

Comments: Project Page : this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629] arXiv:2512.04019 [pdf, ps, other]: Title: Ultra-lightweight Neural Video Representation Compression

Authors: Ho Man Kwan, Tianhao Peng, Ge Gao, Fan Zhang, Mike Nilsson, Andrew Gower, David Bull

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[630] arXiv:2512.04015 [pdf, ps, other]: Title: Learning Group Actions In Disentangled Latent Image Representations

Authors: Farhana Hossain Swarnali, Miaomiao Zhang, Tonmoy Hossain

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[631] arXiv:2512.04012 [pdf, ps, other]: Title: Emergent Outlier View Rejection in Visual Geometry Grounded Transformers

Authors: Jisang Han, Sunghwan Hong, Jaewoo Jung, Wooseok Jang, Honggyu An, Qianqian Wang, Seungryong Kim, Chen Feng

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)

