Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 390

Total of 749 entries; this page lists entries 391-749.

Mon, 8 Dec 2025

[391] arXiv:2512.05965 [pdf, ps, other]: Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor

Authors: Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, Xunliang Cai, Linjiang Huang, Hongsheng Li, Si Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392] arXiv:2512.05960 [pdf, ps, other]: Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement

Authors: Munsif Ali, Najmul Hassan, Lucia Ventura, Davide Di Bari, Simonepietro Canese

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[393] arXiv:2512.05941 [pdf, ps, other]: Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding

Authors: Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong Liu

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[394] arXiv:2512.05937 [pdf, ps, other]: Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception

Authors: Anne Sielemann, Valentin Barner, Stefan Wolf, Masoud Roschani, Jens Ziehn, Juergen Beyerer

Comments: 8 pages, 2 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[395] arXiv:2512.05936 [pdf, ps, other]: Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition

Authors: Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens Ziehn

Comments: 8 pages, 8 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[396] arXiv:2512.05928 [pdf, ps, other]: Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition

Authors: Pedro Vidal, Bernardo Biesseck, Luiz E. L. Coelho, Roger Granada, David Menotti

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397] arXiv:2512.05927 [pdf, ps, other]: Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Authors: Zhiting Mei, Tenny Yin, Micah Baker, Ola Shorinwa, Anirudha Majumdar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[398] arXiv:2512.05922 [pdf, ps, other]: Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation

Authors: Khang Le, Anh Mai Vu, Thi Kim Trang Vo, Ha Thach, Ngoc Bui Lam Quang, Thanh-Huy Nguyen, Minh H. N. Le, Zhu Han, Chandra Mohan, Hien Van Nguyen

Comments: Note: Khang Le and Anh Mai Vu contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399] arXiv:2512.05920 [pdf, ps, other]: Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction

Authors: Jiawen Yang, Yihui Cao, Xuanyu Tian, Yuyao Zhang, Hongjiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400] arXiv:2512.05905 [pdf, ps, other]: Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Authors: Wenhao Yan, Sheng Ye, Zhuoyi Yang, Jiayan Teng, ZhenHui Dong, Kairui Wen, Xiaotao Gu, Yong-Jin Liu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[401] arXiv:2512.05866 [pdf, ps, other]: Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator

Authors: Md. Mahbub Hasan Akash, Aria Tasnim Mridula, Sheekar Banerjee, Ishtiak Al Mamoon

Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402] arXiv:2512.05859 [pdf, ps, other]: Title: Edit-aware RAW Reconstruction

Authors: Abhijith Punnappurath, Luxi Zhao, Ke Zhao, Hue Nguyen, Radek Grzeszczuk, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2512.05853 [pdf, ps, other]: Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack

Authors: Shiji Zhao, Shukun Xiong, Yao Huang, Yan Jin, Zhenyu Wu, Jiyang Guan, Ranjie Duan, Jialing Tao, Hui Xue, Xingxing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404] arXiv:2512.05830 [pdf, ps, other]: Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep Learning

Authors: Muhammet Cagri Yeke, Samil Sirin, Kivilcim Yuksel, Abdurrahman Gumus

Comments: 22 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405] arXiv:2512.05814 [pdf, ps, other]: Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease Detection

Authors: Fubao Zhu, Zhanyuan Jia, Zhiguo Wang, Huan Huang, Danyang Sun, Chuang Han, Yanting Li, Jiaofen Nan, Chen Zhao, Weihua Zhou

Comments: The code is already available on GitHub: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2512.05809 [pdf, ps, other]: Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time Scaling

Authors: Saurav Jha, M. Jehanzeb Mirza, Wei Lin, Shiqi Yang, Sarath Chandar

Comments: Extended abstract at World Modeling Workshop 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[407] arXiv:2512.05802 [pdf, ps, other]: Title: Bring Your Dreams to Life: Continual Text-to-Video Customization

Authors: Jiahua Dong, Xudong Wang, Wenqi Liang, Zongyan Han, Meng Cao, Duzhen Zhang, Hanbin Zhao, Zhi Han, Salman Khan, Fahad Shahbaz Khan

Comments: Accepted to AAAI2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408] arXiv:2512.05783 [pdf, ps, other]: Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse Depth

Authors: Maryam Yousefi, Soodeh Bakhshandeh

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[409] arXiv:2512.05774 [pdf, ps, other]: Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding

Authors: Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, Juan Carlos Niebles

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[410] arXiv:2512.05762 [pdf, ps, other]: Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural Operators

Authors: Ruochen Chen, Thuy Tran, Shaifali Parashar

Comments: Accepted for WACV

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[411] arXiv:2512.05759 [pdf, ps, other]: Title: Label-Efficient Point Cloud Segmentation with Active Learning

Authors: Johannes Meyer, Jasper Hoffmann, Felix Schulz, Dominik Merkle, Daniel Buescher, Alexander Reiterer, Joschka Boedecker, Wolfram Burgard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[412] arXiv:2512.05754 [pdf, ps, other]: Title: USV: Unified Sparsification for Accelerating Video Diffusion Models

Authors: Xinjian Wu, Hongmei Wang, Yuan Zhou, Qinglin Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413] arXiv:2512.05746 [pdf, ps, other]: Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models

Authors: Shizhuo Mao, Hongtao Zou, Qihu Xie, Song Chen, Yi Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414] arXiv:2512.05740 [pdf, ps, other]: Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic Excision

Authors: Lennart Maack, Julia-Kristin Graß, Lisa-Marie Toscha, Nathaniel Melling, Alexander Schlaefer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[415] arXiv:2512.05710 [pdf, ps, other]: Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature Learning

Authors: Jianan Sun, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[416] arXiv:2512.05698 [pdf, ps, other]: Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning

Authors: Xusheng Guo, Wanfa Zhang, Shijia Zhao, Qiming Xia, Xiaolong Xie, Mingming Wang, Hai Wu, Chenglu Wen

Comments: The 40th Annual AAAI Conference on Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417] arXiv:2512.05683 [pdf, ps, other]: Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration Correction

Authors: Yong En Kok, Bowen Deng, Alexander Bentley, Andrew J. Parkes, Michael G. Somekh, Amanda J. Wright, Michael P. Pound

Subjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[418] arXiv:2512.05674 [pdf, ps, other]: Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume Maximization

Authors: Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2512.05672 [pdf, ps, other]: Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem

Authors: Yeobin Hong, Suhyeon Lee, Hyungjin Chung, Jong Chul Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[420] arXiv:2512.05669 [pdf, ps, other]: Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric Features

Authors: Talha Enes Koksal, Abdurrahman Gumus

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[421] arXiv:2512.05663 [pdf, ps, other]: Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection

Authors: Johannes Meier, Jonathan Michel, Oussema Dhaouadi, Yung-Hsu Yang, Christoph Reich, Zuria Bauer, Stefan Roth, Marc Pollefeys, Jacques Kaiser, Daniel Cremers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[422] arXiv:2512.05651 [pdf, ps, other]: Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata Perspective

Authors: Nan Zhong, Mian Zou, Yiran Xu, Zhenxing Qian, Xinpeng Zhang, Baoyuan Wu, Kede Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2512.05635 [pdf, ps, other]: Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired Data

Authors: Georgy Perevozchikov, Nancy Mehta, Egor Ershov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2512.05613 [pdf, ps, other]: Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model

Authors: Pasquale De Marinis, Pieter M. Blok, Uzay Kaymak, Rogier Brussee, Gennaro Vessio, Giovanna Castellano

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2512.05610 [pdf, ps, other]: Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projections

Authors: Juho Korkeala, Jesse Muhojoki, Josef Taher, Klaara Salolahti, Matti Hyyppä, Antero Kukko, Juha Hyyppä

Comments: 19 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[426] arXiv:2512.05597 [pdf, ps, other]: Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token Prediction

Authors: Ruihong Yin, Xuepeng Shi, Oleksandr Bailo, Marco Manfredi, Theo Gevers

Comments: 10 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2512.05593 [pdf, ps, other]: Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image Transfer

Authors: Rong Wang, Wei Mao, Changsheng Lu, Hongdong Li

Comments: Accepted to 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[428] arXiv:2512.05571 [pdf, ps, other]: Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical Imaging

Authors: Xingyu Zhang, Anna Reithmeir, Fryderyk Kögl, Rickmer Braren, Julia A. Schnabel, Daniel M. Lang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[429] arXiv:2512.05564 [pdf, ps, other]: Title: ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Authors: Zijun Wang, Panwen Hu, Jing Wang, Terry Jingchen Zhang, Yuhao Cheng, Long Chen, Yiqiang Yan, Zutao Jiang, Hanhui Li, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[430] arXiv:2512.05557 [pdf, ps, other]: Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency

Authors: Xingxi Yin, Yicheng Li, Gong Yan, Chenglin Li, Jian Zhao, Cong Huang, Yue Deng, Yin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[431] arXiv:2512.05546 [pdf, ps, other]: Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language Models

Authors: Weijue Bu, Guan Yuan, Guixian Zhang

Comments: 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[432] arXiv:2512.05539 [pdf, ps, other]: Title: Ideal Observer for Segmentation of Dead Leaves Images

Authors: Swantje Mahncke, Malte Ott

Comments: 41 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
[433] arXiv:2512.05529 [pdf, ps, other]: Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors

Authors: Kunyi Yang, Qingyu Wang, Cheng Yuan, Yutong Ban

Comments: The first two authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[434] arXiv:2512.05524 [pdf, ps, other]: Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph Generation

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[435] arXiv:2512.05515 [pdf, ps, other]: Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis

Authors: Yuhua Wen, Qifei Li, Yingying Zhou, Yingming Gao, Zhengqi Wen, Jianhua Tao, Ya Li

Comments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436] arXiv:2512.05513 [pdf, ps, other]: Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded Reasoning

Authors: Chinthani Sugandhika, Chen Li, Deepu Rajan, Basura Fernando

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[437] arXiv:2512.05511 [pdf, ps, other]: Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient Paradigm

Authors: Chuang Yu, Jinmiao Zhao, Yunpeng Liu, Yaokun Li, Xiujun Shu, Yuanhao Feng, Bo Wang, Yimian Dai, Xiangyu Yue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[438] arXiv:2512.05494 [pdf, ps, other]: Title: Decoding with Structured Awareness: Integrating Directional, Frequency-Spatial, and Structural Attention for Medical Image Segmentation

Authors: Fan Zhang, Zhiwei Gu, Hua Wang

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[439] arXiv:2512.05492 [pdf, ps, other]: Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency Field

Authors: Qi Zhu, Jingyi Zhang, Naishan Zheng, Wei Yu, Jinghao Zhang, Deyi Ji, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440] arXiv:2512.05482 [pdf, ps, other]: Title: Concept-based Explainable Data Mining with VLM for 3D Detection

Authors: Mai Tsujimoto

Comments: 28 pages including appendix. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441] arXiv:2512.05481 [pdf, ps, other]: Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial Fusion

Authors: Jialin Li, Yiwei Ren, Kai Pan, Dong Wei, Pujin Cheng, Xian Wu, Xiaoying Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[442] arXiv:2512.05478 [pdf, ps, other]: Title: EmoStyle: Emotion-Driven Image Stylization

Authors: Jingyuan Yang, Zihuan Bai, Hui Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443] arXiv:2512.05468 [pdf, ps, other]: Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor system

Authors: Takara Taniguchi, Yudai Ueda, Atsuya Muramatsu, Kohki Hashimoto, Ryo Yagi, Hideya Ochiai, Chaodit Aswakul

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[444] arXiv:2512.05446 [pdf, ps, other]: Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS Compression

Authors: Cheng-Yuan Ho, He-Bi Yang, Jui-Chiu Chiang, Yu-Lun Liu, Wen-Hsiao Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445] arXiv:2512.05422 [pdf, ps, other]: Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction

Authors: Jiangtong Tan, Lin Liu, Jie Huang, Xiaopeng Zhang, Qi Tian, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446] arXiv:2512.05418 [pdf, ps, other]: Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry Systems

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447] arXiv:2512.05415 [pdf, ps, other]: Title: Moving object detection from multi-depth images with an attention-enhanced CNN

Authors: Masato Shibukawa, Fumi Yoshida, Toshifumi Yanagisawa, Takashi Ito, Hirohisa Kurosaki, Makoto Yoshikawa, Kohki Kamiya, Ji-an Jiang, Wesley Fraser, JJ Kavelaars, Susan Benecchi, Anne Verbiscer, Akira Hatakeyama, Hosei O, Naoya Ozaki

Comments: 14 pages, 22 figures, submitted to PASJ

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[448] arXiv:2512.05412 [pdf, ps, other]: Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449] arXiv:2512.05410 [pdf, ps, other]: Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch Images

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450] arXiv:2512.05398 [pdf, ps, other]: Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic Videos

Authors: Zhuoyuan Wu, Xurui Yang, Jiahui Huang, Yue Wang, Jun Gao

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451] arXiv:2512.05394 [pdf, ps, other]: Title: Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability

Authors: Shizhan Liu, Xinran Deng, Zhuoyi Yang, Jiayan Teng, Xiaotao Gu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[452] arXiv:2512.05391 [pdf, ps, other]: Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models

Authors: Qingqiao Hu, Weimin Lyu, Meilong Xu, Kehan Qi, Xiaoling Hu, Saumya Gupta, Jiawei Zhou, Chao Chen

Comments: 20 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453] arXiv:2512.05385 [pdf, ps, other]: Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models Acceleration

Authors: Yingjie Xia, Tao Liu, Jinglei Shi, Qingsong Xie, Heng Guo, Jian Yang, Xi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[454] arXiv:2512.05362 [pdf, ps, other]: Title: PoolNet: Deep Learning for 2D to 3D Video Process Validation

Authors: Sanchit Kaul, Joseph Luna, Shray Arora

Comments: All code related to this paper can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[455] arXiv:2512.05359 [pdf, ps, other]: Title: Group Orthogonal Low-Rank Adaptation for RGB-T Tracking

Authors: Zekai Shao, Yufan Hu, Jingyuan Liu, Bin Fan, Hongmin Liu

Comments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[456] arXiv:2512.05354 [pdf, ps, other]: Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training

Authors: Yang Zheng, Hao Tan, Kai Zhang, Peng Wang, Leonidas Guibas, Gordon Wetzstein, Wang Yifan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[457] arXiv:2512.05343 [pdf, ps, other]: Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

Authors: Elisabetta Fedele, Francis Engelmann, Ian Huang, Or Litany, Marc Pollefeys, Leonidas Guibas

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[458] arXiv:2512.05277 [pdf, ps, other]: Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Authors: Kevin Cannons, Saeed Ranjbar Alvar, Mohammad Asiful Hossain, Ahmad Rezaei, Mohsen Gholami, Alireza Heidarikhazaei, Zhou Weimin, Yong Zhang, Mohammad Akbari

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[459] arXiv:2512.05272 [pdf, ps, other]: Title: Inferring Compositional 4D Scenes without Ever Seeing One

Authors: Ahmet Berke Gokmen, Ajad Chhatkuli, Luc Van Gool, Danda Pani Paudel

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460] arXiv:2512.05268 [pdf, ps, other]: Title: CARD: Correlation Aware Restoration with Diffusion

Authors: Niki Nezakati, Arnab Ghosh, Amit Roy-Chowdhury, Vishwanath Saragadam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461] arXiv:2512.05259 [pdf, ps, other]: Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data Anonymization

Authors: Georgios Chatzichristodoulou, Niki Efthymiou, Panagiotis Filntisis, Georgios Pavlakos, Petros Maragos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[462] arXiv:2512.05240 [pdf, ps, other]: Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video Reconstruction

Authors: Dmitrii Torbunov, Onur Okuducu, Yi Huang, Odera Dim, Rebecca Coles, Yonggang Cui, Yihui Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463] arXiv:2512.05209 [pdf, ps, other]: Title: DEAR: Dataset for Evaluating the Aesthetics of Rendering

Authors: Vsevolod Plohotnuk, Artyom Panshin, Nikola Banić, Simone Bianco, Michael Freeman, Egor Ershov

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464] arXiv:2512.05198 [pdf, ps, other]: Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion Models

Authors: Rowan Bradbury, Dazhi Zhong

Comments: 16 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[465] arXiv:2512.05172 [pdf, ps, other]: Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement Learning

Authors: Wentao Wang, Chunyang Liu, Kehua Sheng, Bo Zhang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[466] arXiv:2512.05152 [pdf, ps, other]: Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models

Authors: Kun Wang, Donglin Di, Tonghua Su, Lei Fan

Comments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467] arXiv:2512.05150 [pdf, ps, other]: Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Authors: Zhenglin Cheng, Peng Sun, Jianguo Li, Tao Lin

Comments: arxiv v0

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468] arXiv:2512.05145 [pdf, ps, other]: Title: Self-Improving VLM Judges Without Human Annotations

Authors: Inna Wanyin Lin, Yushi Hu, Shuyue Stella Li, Scott Geng, Pang Wei Koh, Luke Zettlemoyer, Tim Althoff, Marjan Ghazvininejad

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469] arXiv:2512.05140 [pdf, other]: Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth Observation

Authors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)

Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United States

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[470] arXiv:2512.05139 [pdf, ps, other]: Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative Models

Authors: Yang Xiang, Jingwen Zhong, Yige Yan, Petros Koutrakis, Eric Garshick, Meredith Franklin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[471] arXiv:2512.05137 [pdf, ps, other]: Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged Images

Authors: Yunfei Zhang, Yizhuo He, Yuanxun Shao, Zhengtao Yao, Haoyan Xu, Junhao Dong, Zhen Yao, Zhikang Dong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472] arXiv:2512.05136 [pdf, ps, other]: Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes

Authors: Yujie Xiao, Gongzhen Tang, Deyun Zhang, Jun Li, Guangkun Nie, Haoyu Wang, Shun Huang, Tong Liu, Qinghao Zhao, Kangyin Chen, Shenda Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[473] arXiv:2512.05134 [pdf, ps, other]: Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models

Authors: Zihao Wu

Comments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[474] arXiv:2512.05132 [pdf, ps, other]: Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution Training

Authors: Wenshuo Wang, Fan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[475] arXiv:2512.05131 [pdf, ps, other]: Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance

Authors: Tianling Xu, Shengzhe Gan, Leslie Gu, Yuelei Li, Fangneng Zhan, Hanspeter Pfister

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[476] arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]: Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Authors: David Anugraha, Patrick Amadeus Irawan, Anshul Singh, En-Shiun Annie Lee, Genta Indra Winata

Comments: Preprint

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[477] arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]: Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models

Authors: Haowen Liu, Shaoxiong Yao, Haonan Chen, Jiawei Gao, Jiayuan Mao, Jia-Bin Huang, Yilun Du

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[478] arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]: Title: Physically-Based Simulation of Automotive LiDAR

Authors: L. Dudzik, M. Roschani, A. Sielemann, K. Trampert, J. Ziehn, J. Beyerer, C. Neumann

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[479] arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]: Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade Glioma

Authors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)

Comments: 4 pages, 2 figures

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[480] arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]: Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation

Authors: Fabian Konstantinidis, Moritz Sackmann, Ulrich Hofmann, Christoph Stiller

Comments: This work has been submitted to the IEEE for possible publication

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[481] arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]: Title: Interleaved Latent Visual Reasoning with Selective Perceptual Modeling

Authors: Shuai Dong, Siyuan Wang, Xingyu Liu, Zhongyu Wei

Comments: 11 pages, 6 figures. Code available at this https URL

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[482] arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]: Title: EXR: An Interactive Immersive EHR Visualization in Extended Reality

Authors: Benoit Marteau, Shaun Q. Y. Tan, Jieru Li, Andrew Hornback, Yishan Zhong, Shaunna Wang, Christian Lowson, Jason Woloff, Joshua M. Pahys, Steven W. Hwang, Coleman Hilton, May D. Wang

Comments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE Xplo

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[483] arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]: Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU Safety

Authors: Ahmad Yehia, Jiseop Byeon, Tianyi Wang, Huihai Wang, Yiming Xu, Junfeng Jiao, Christian Claudel

Comments: 8 pages, 3 figures, 1 table

Subjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
[484] arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]: Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS Model

Authors: Kaidi Wang, Yi He, Wenhao Guan, Weijie Wu, Hongwu Ding, Xiong Zhang, Di Wu, Meng Meng, Jian Luan, Lin Li, Qingyang Hong

Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)

Fri, 5 Dec 2025

[485] arXiv:2512.05115 [pdf, ps, other]: Title: Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Authors: Tianqi Liu, Zhaoxi Chen, Zihao Huang, Shaocong Xu, Saining Zhang, Chongjie Ye, Bohan Li, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei Liu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486] arXiv:2512.05113 [pdf, ps, other]: Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection Splatting

Authors: Hao-Jen Chien, Yi-Chuan Huang, Chung-Ho Wu, Wei-Lun Chao, Yu-Lun Liu

Comments: WACV 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[487] arXiv:2512.05112 [pdf, ps, other]: Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation

Authors: Dongzhi Jiang, Renrui Zhang, Haodong Li, Zhuofan Zong, Ziyu Guo, Jun He, Claire Guo, Junyan Ye, Rongyao Fang, Weijia Li, Rui Liu, Hongsheng Li

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[488] arXiv:2512.05111 [pdf, ps, other]: Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Authors: Shengyuan Ding, Xinyu Fang, Ziyu Liu, Yuhang Zang, Yuhang Cao, Xiangyu Zhao, Haodong Duan, Xiaoyi Dong, Jianze Liang, Bin Wang, Conghui He, Dahua Lin, Jiaqi Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489] arXiv:2512.05110 [pdf, ps, other]: Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional Art

Authors: Rundong Luo, Noah Snavely, Wei-Chiu Ma

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[490] arXiv:2512.05106 [pdf, ps, other]: Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Authors: Yu Zeng, Charles Ochoa, Mingyuan Zhou, Vishal M. Patel, Vitor Guizilini, Rowan McAllister

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
[491] arXiv:2512.05104 [pdf, ps, other]: Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency Modulation

Authors: Jiaqi Ma, Shengkai Hu, Jun Wan, Jiaxing Huang, Lefei Zhang, Salman Khan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492] arXiv:2512.05098 [pdf, ps, other]: Title: SA-IQA: Redefining Image Quality Assessment for Spatial Aesthetics with Multi-Dimensional Rewards

Authors: Yuan Gao, Jin Song

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[493] arXiv:2512.05091 [pdf, ps, other]: Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark

Authors: Haobo Yuan, Yueyi Sun, Yanwei Li, Tao Zhang, Xueqing Deng, Henghui Ding, Lu Qi, Anran Wang, Xiangtai Li, Ming-Hsuan Yang

Comments: Technical Report; Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[494] arXiv:2512.05081 [pdf, ps, other]: Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression

Authors: Jung Yi, Wooseok Jang, Paul Hyunbin Cho, Jisu Nam, Heeji Yoon, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495] arXiv:2512.05079 [pdf, ps, other]: Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced Constraints

Authors: Minghan Zhu, Zhiyi Wang, Qihang Sun, Maani Ghaffari, Michael Posa

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[496] arXiv:2512.05076 [pdf, ps, other]: Title: BulletTime: Decoupled Control of Time and Camera Pose for Video Generation

Authors: Yiming Wang, Qihang Zhang, Shengqu Cai, Tong Wu, Jan Ackermann, Zhengfei Kuang, Yang Zheng, Frano Rajič, Siyu Tang, Gordon Wetzstein

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[497] arXiv:2512.05060 [pdf, ps, other]: Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded Transformer

Authors: Xianfeng Wu, Yajing Bai, Minghan Li, Xianzu Wu, Xueqi Zhao, Zhongyuan Lai, Wenyu Liu, Xinggang Wang

Comments: Code: this https URL, Webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498] arXiv:2512.05044 [pdf, ps, other]: Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Authors: Yanran Zhang, Ziyi Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

Comments: 18 Pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[499] arXiv:2512.05039 [pdf, ps, other]: Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding

Authors: Abhigyan Bhattacharya, Hiranmoy Roy

Comments: Submitted for review CVPR-2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500] arXiv:2512.05025 [pdf, ps, other]: Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth Observation

Authors: Nicolas Houdré, Diego Marcos, Hugo Riffaud de Turckheim, Dino Ienco, Laurent Wendling, Camille Kurtz, Sylvain Lobry

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501] arXiv:2512.05021 [pdf, ps, other]: Title: HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition

Authors: Pham Thach Thanh Truc, Dang Hoai Nam, Huynh Tong Dang Khoa, Vo Nguyen Le Duy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502] arXiv:2512.05016 [pdf, ps, other]: Title: Generative Neural Video Compression via Video Diffusion Prior

Authors: Qi Mao, Hao Cheng, Tinghan Yang, Libiao Jin, Siwei Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[503] arXiv:2512.05006 [pdf, ps, other]: Title: Self-Supervised Learning for Transparent Object Depth Completion Using Depth from Non-Transparent Objects

Authors: Xianghui Fan, Zhaoyu Chen, Mengyang Pan, Anping Deng, Hang Yang

Comments: conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[504] arXiv:2512.05000 [pdf, ps, other]: Title: Reflection Removal through Efficient Adaptation of Diffusion Transformers

Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505] arXiv:2512.04996 [pdf, ps, other]: Title: A dynamic memory assignment strategy for dilation-based ICP algorithm on embedded GPUs

Authors: Qiong Chang, Weimin Wang, Junpei Zhong, Jun Miyazaki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506] arXiv:2512.04981 [pdf, ps, other]: Title: Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

Authors: NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[507] arXiv:2512.04970 [pdf, ps, other]: Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks

Authors: Leonid Pogorelyuk, Niels Bracher, Aaron Verkleeren, Lars Kühmichel, Stefan T. Radev

Comments: UniReps Workshop 2025, 12 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508] arXiv:2512.04969 [pdf, ps, other]: Title: Rethinking the Use of Vision Transformers for AI-Generated Image Detection

Authors: NaHyeon Park, Kunhee Kim, Junsuk Choe, Hyunjung Shim

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[509] arXiv:2512.04967 [pdf, ps, other]: Title: Balanced Few-Shot Episodic Learning for Accurate Retinal Disease Diagnosis

Authors: Jasmaine Khale, Ravi Prakash Srivastava

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[510] arXiv:2512.04963 [pdf, ps, other]: Title: GeoPE: A Unified Geometric Positional Embedding for Structured Tensors

Authors: Yupu Yao, Bowen Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[511] arXiv:2512.04952 [pdf, ps, other]: Title: FASTer: Toward Efficient Autoregressive Vision Language Action Modeling via Neural Action Tokenization

Authors: Yicheng Liu, Shiduo Zhang, Zibin Dong, Baijun Ye, Tianyuan Yuan, Xiaopeng Yu, Linqi Yin, Chenhao Lu, Junhao Shi, Luca Jiang-Tao Yu, Liangtao Zheng, Tao Jiang, Jingjing Gong, Xipeng Qiu, Hang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[512] arXiv:2512.04943 [pdf, ps, other]: Title: Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition

Authors: Novanto Yudistira

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513] arXiv:2512.04939 [pdf, ps, other]: Title: LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging

Authors: Zhijian Shu, Cheng Lin, Tao Xie, Wei Yin, Ben Li, Zhiyuan Pu, Weize Li, Yao Yao, Xun Cao, Xiaoyang Guo, Xiao-Xiao Long

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514] arXiv:2512.04927 [pdf, ps, other]: Title: Virtually Unrolling the Herculaneum Papyri by Diffeomorphic Spiral Fitting

Authors: Paul Henderson

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[515] arXiv:2512.04926 [pdf, ps, other]: Title: Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Authors: Yueming Pan, Ruoyu Feng, Qi Dai, Yuqi Wang, Wenfeng Lin, Mingyu Guo, Chong Luo, Nanning Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516] arXiv:2512.04904 [pdf, ps, other]: Title: ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching

Authors: Guanbo Huang, Jingjia Mao, Fanding Huang, Fengkai Liu, Xiangyang Luo, Yaoyuan Liang, Jiasheng Lu, Xiaoe Wang, Pei Liu, Ruiliu Fu, Shao-Lun Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[517] arXiv:2512.04890 [pdf, ps, other]: Title: Equivariant Symmetry-Aware Head Pose Estimation for Fetal MRI

Authors: Ramya Muthukrishnan, Borjan Gagoski, Aryn Lee, P. Ellen Grant, Elfar Adalsteinsson, Polina Golland, Benjamin Billot

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518] arXiv:2512.04888 [pdf, ps, other]: Title: You Only Train Once (YOTO): A Retraining-Free Object Detection Framework

Authors: Priyanto Hidayatullah, Nurjannah Syakrani, Yudi Widhiyasana, Muhammad Rizqi Sholahuddin, Refdinal Tubagus, Zahri Al Adzani Hidayat, Hanri Fajar Ramadhan, Dafa Alfarizki Pratama, Farhan Muhammad Yasin

Comments: This manuscript was first submitted to the Engineering (Elsevier Journal). The preprint version was posted to arXiv afterwards to facilitate open access and community feedback

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[519] arXiv:2512.04883 [pdf, ps, other]: Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms

Authors: Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520] arXiv:2512.04875 [pdf, ps, other]: Title: SP-Det: Self-Prompted Dual-Text Fusion for Generalized Multi-Label Lesion Detection

Authors: Qing Xu, Yanqian Wang, Xiangjian Hea, Yue Li, Yixuan Zhang, Rong Qu, Wenting Duan, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521] arXiv:2512.04862 [pdf, ps, other]: Title: Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Authors: Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini, Jan Ulrich Bartels, Katherine J. Kuchenbecker, Michael J. Black

Comments: * Equal contribution. Minor figure corrections compared to the ICCV 2025 version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522] arXiv:2512.04857 [pdf, ps, other]: Title: Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens

Authors: Ziran Qin, Youru Lv, Mingbao Lin, Zeren Zhang, Chanfan Gan, Tieyuan Chen, Weiyao Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523] arXiv:2512.04837 [pdf, ps, other]: Title: A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World

Authors: Jikang Cheng, Renye Yan, Zhiyuan Yan, Yaozhong Gan, Xueyi Zhang, Zhongyuan Wang, Wei Peng, Ling Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[524] arXiv:2512.04832 [pdf, ps, other]: Title: Tokenizing Buildings: A Transformer for Layout Synthesis

Authors: Manuel Ladron de Guevara, Jinmo Rhee, Ardavan Bidgoli, Vaidas Razgaitis, Michael Bergin

Comments: 8 pages, 1 page References, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[525] arXiv:2512.04830 [pdf, ps, other]: Title: FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis

Authors: Shijie Chen, Peixi Peng

Comments: Novel View Synthesis, Driving Scene, Free Trajectory, Image Generation

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526] arXiv:2512.04821 [pdf, ps, other]: Title: LatentFM: A Latent Flow Matching Approach for Generative Medical Image Segmentation

Authors: Huynh Trinh Ngoc, Hoang Anh Nguyen Kim, Toan Nguyen Hai, Long Tran Quoc

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527] arXiv:2512.04815 [pdf, ps, other]: Title: RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS

Authors: Chuanyu Fu, Guanying Chen, Yuqi Zhang, Kunbin Yao, Yuan Xiong, Chuan Huang, Shuguang Cui, Yasuyuki Matsushita, Xiaochun Cao

Comments: arXiv admin note: substantial text overlap with arXiv:2506.02751

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528] arXiv:2512.04810 [pdf, ps, other]: Title: EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Authors: Xin He, Longhui Wei, Jianbo Ouyang, Minghui Liao, Lingxi Xie, Qi Tian

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529] arXiv:2512.04786 [pdf, ps, other]: Title: LaFiTe: A Generative Latent Field for 3D Native Texturing

Authors: Chia-Hao Chen, Zi-Xin Zou, Yan-Pei Cao, Ze Yuan, Guan Luo, Xiaojuan Qi, Ding Liang, Song-Hai Zhang, Yuan-Chen Guo

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530] arXiv:2512.04784 [pdf, ps, other]: Title: PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling

Authors: Bowen Ping, Chengyou Jia, Minnan Luo, Changliang Xia, Xin Shen, Zhuohang Dang, Hangwei Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531] arXiv:2512.04761 [pdf, ps, other]: Title: Order Matters: 3D Shape Generation from Sequential VR Sketches

Authors: Yizi Chen, Sidi Wu, Tianyi Xiao, Nina Wiedemann, Loic Landrieu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532] arXiv:2512.04734 [pdf, ps, other]: Title: MT-Depth: Multi-task Instance feature analysis for the Depth Completion

Authors: Abdul Haseeb Nizamani, Dandi Zhou, Xinhai Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533] arXiv:2512.04733 [pdf, ps, other]: Title: E3AD: An Emotion-Aware Vision-Language-Action Model for Human-Centric End-to-End Autonomous Driving

Authors: Yihong Tang, Haicheng Liao, Tong Nie, Junlin He, Ao Qu, Kehua Chen, Wei Ma, Zhenning Li, Lijun Sun, Chengzhong Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[534] arXiv:2512.04728 [pdf, ps, other]: Title: Measuring the Unspoken: A Disentanglement Model and Benchmark for Psychological Analysis in the Wild

Authors: Yigui Feng, Qinglin Wang, Haotian Mo, Yang Liu, Ke Liu, Gencheng Liu, Xinhai Chen, Siqi Shen, Songzhu Mei, Jie Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[535] arXiv:2512.04699 [pdf, ps, other]: Title: OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution

Authors: Xinning Chai, Zhengxue Cheng, Yuhong Zhang, Hengsheng Zhang, Yingsheng Qin, Yucai Yang, Rong Xie, Li Song

Comments: Accepted as TCSVT, 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536] arXiv:2512.04686 [pdf, ps, other]: Title: Towards Cross-View Point Correspondence in Vision-Language Models

Authors: Yipu Wang, Yuheng Ji, Yuyang Liu, Enshen Zhou, Ziqiang Yang, Yuxuan Tian, Ziheng Qin, Yue Liu, Huajie Tan, Cheng Chi, Zhiyuan Ma, Daniel Dajun Zeng, Xiaolong Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537] arXiv:2512.04678 [pdf, ps, other]: Title: Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Authors: Yunhong Lu, Yanhong Zeng, Haobo Li, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jiapeng Zhu, Hengyuan Cao, Zhipeng Zhang, Xing Zhu, Yujun Shen, Min Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.04677 [pdf, ps, other]: Title: Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Authors: Yubo Huang, Hailong Guo, Fangtai Wu, Shifeng Zhang, Shijie Huang, Qijun Gan, Lin Liu, Sirui Zhao, Enhong Chen, Jiaming Liu, Steven Hoi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539] arXiv:2512.04660 [pdf, ps, other]: Title: I2I-Bench: A Comprehensive Benchmark Suite for Image-to-Image Editing Models

Authors: Juntong Wang, Jiarui Wang, Huiyu Duan, Jiaxiang Kang, Guangtao Zhai, Xiongkuo Min

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.04643 [pdf, ps, other]: Title: SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding

Authors: Chang-Hsun Wu, Kai-Po Chang, Yu-Yang Sheng, Hung-Kai Chung, Kuei-Chun Wang, Yu-Chiang Frank Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541] arXiv:2512.04619 [pdf, ps, other]: Title: Denoise to Track: Harnessing Video Diffusion Priors for Robust Correspondence

Authors: Tianyu Yuan, Yuanbo Yang, Lin-Zhuo Chen, Yao Yao, Zhuzhong Qian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[542] arXiv:2512.04599 [pdf, ps, other]: Title: Malicious Image Analysis via Vision-Language Segmentation Fusion: Detection, Element, and Location in One-shot

Authors: Sheng Hang, Chaoxiang He, Hongsheng Hu, Hanqing Hu, Bin Benjamin Zhu, Shi-Feng Sun, Dawu Gu, Shuo Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543] arXiv:2512.04597 [pdf, ps, other]: Title: When Robots Should Say "I Don't Know": Benchmarking Abstention in Embodied Question Answering

Authors: Tao Wu, Chuhao Zhou, Guangyu Zhao, Haozhi Cao, Yewen Pu, Jianfei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[544] arXiv:2512.04585 [pdf, ps, other]: Title: SAM3-I: Segment Anything with Instructions

Authors: Jingjing Li, Yue Feng, Yuchen Guo, Jincai Huang, Yongri Piao, Qi Bi, Miao Zhang, Xiaoqi Zhao, Qiang Chen, Shihao Zou, Wei Ji, Huchuan Lu, Li Cheng

Comments: Preliminary results; work in progress

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.04581 [pdf, ps, other]: Title: Infrared UAV Target Tracking with Dynamic Feature Refinement and Global Contextual Attention Knowledge Distillation

Authors: Houzhang Fang, Chenxing Wu, Kun Bai, Tianqi Chen, Xiaolin Wang, Xiyang Liu, Yi Chang, Luxin Yan

Comments: Accepted by IEEE TMM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.04576 [pdf, ps, other]: Title: TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification

Authors: Zishuo Wan, Qinqin Kang, Yi Huang, Yun Bian, Dawei Ding, Ke Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547] arXiv:2512.04568 [pdf, ps, other]: Title: Prompt2Craft: Generating Functional Craft Assemblies with LLMs

Authors: Vitor Hideyo Isume, Takuya Kiyokawa, Natsuki Yamanobe, Yukiyasu Domae, Weiwei Wan, Kensuke Harada

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.04564 [pdf, ps, other]: Title: Dataset creation for supervised deep learning-based analysis of microscopic images -- review of important considerations and recommendations

Authors: Christof A. Bertram, Viktoria Weiss, Jonas Ammeling, F. Maria Schabel, Taryn A. Donovan, Frauke Wilm, Christian Marzahl, Katharina Breininger, Marc Aubreville

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.04563 [pdf, ps, other]: Title: COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Authors: Zefeng Zhang, Xiangzhao Hao, Hengzhu Tang, Zhenyu Zhang, Jiawei Sheng, Xiaodong Li, Zhenyang Li, Li Gao, Daiting Shi, Dawei Yin, Tingwen Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.04554 [pdf, ps, other]: Title: Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering

Authors: Marco Pintore, Maura Pintor, Dimosthenis Karatzas, Battista Biggio

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551] arXiv:2512.04542 [pdf, ps, other]: Title: Gaussian Entropy Fields: Driving Adaptive Sparsity in 3D Gaussian Optimization

Authors: Hong Kuang, Jianchen Liu

Comments: 28 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[552] arXiv:2512.04540 [pdf, ps, other]: Title: VideoMem: Enhancing Ultra-Long Video Understanding via Adaptive Memory Management

Authors: Hongbo Jin, Qingyuan Wang, Wenhao Zhang, Yang Liu, Sijie Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.04537 [pdf, ps, other]: Title: X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale

Authors: Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.04536 [pdf, ps, other]: Title: Detection of Intoxicated Individuals from Facial Video Sequences via a Recurrent Fusion Model

Authors: Bita Baroutian, Atefe Aghaei, Mohsen Ebrahimi Moghaddam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[555] arXiv:2512.04534 [pdf, ps, other]: Title: Refaçade: Editing Object with Given Reference Texture

Authors: Youze Huang (1), Penghui Ruan (2), Bojia Zi (3), Xianbiao Qi (4), Jianan Wang (5), Rong Xiao (4) ((1) University of Electronic Science and Technology of China, (2) The Hong Kong Polytechnic University, (3) The Chinese University of Hong Kong, (4) IntelliFusion Inc., (5) Astribot Inc.)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556] arXiv:2512.04532 [pdf, ps, other]: Title: PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement

Authors: Yu-Wei Zhan, Xin Wang, Hong Chen, Tongtong Feng, Wei Feng, Ren Wang, Guangyao Li, Qing Li, Wenwu Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[557] arXiv:2512.04528 [pdf, ps, other]: Title: Auto3R: Automated 3D Reconstruction and Scanning via Data-driven Uncertainty Quantification

Authors: Chentao Shen, Sizhe Zheng, Bingqian Wu, Yaohua Feng, Yuanchen Fei, Mingyu Mei, Hanwen Jiang, Xiangru Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[558] arXiv:2512.04522 [pdf, ps, other]: Title: Identity Clue Refinement and Enhancement for Visible-Infrared Person Re-Identification

Authors: Guoqing Zhang, Zhun Wang, Hairui Wang, Zhonglin Ye, Yuhui Zheng

Comments: 14 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[559] arXiv:2512.04521 [pdf, ps, other]: Title: WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism

Authors: Ruijing Liu, Cunhua Pan, Jiaming Zeng, Hong Ren, Kezhi Wang, Lei Kong, Jiangzhou Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[560] arXiv:2512.04520 [pdf, ps, other]: Title: Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation

Authors: Chenlin Xu, Lei Zhang, Lituan Wang, Xinyu Pu, Pengfei Ma, Guangwu Qian, Zizhou Wang, Yan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561] arXiv:2512.04519 [pdf, ps, other]: Title: VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory

Authors: Yifei Yu, Xiaoshan Wu, Xinting Hu, Tao Hu, Yangtian Sun, Xiaoyang Lyu, Bo Wang, Lin Ma, Yuewen Ma, Zhongrui Wang, Xiaojuan Qi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.04515 [pdf, ps, other]: Title: EgoLCD: Egocentric Video Generation with Long Context Diffusion

Authors: Liuzhou Zhang, Jiarui Ye, Yuanlei Wang, Ming Zhong, Mingju Cao, Wanke Xia, Bowen Zeng, Zeyu Zhang, Hao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563] arXiv:2512.04511 [pdf, ps, other]: Title: DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance

Authors: Yinghui Xing, Xiaoting Su, Shizhou Zhang, Donghao Chu, Di Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[564] arXiv:2512.04504 [pdf, ps, other]: Title: UltraImage: Rethinking Resolution Extrapolation in Image Diffusion Transformers

Authors: Min Zhao, Bokai Yan, Xue Yang, Hongzhou Zhu, Jintao Zhang, Shilong Liu, Chongxuan Li, Jun Zhu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565] arXiv:2512.04499 [pdf, ps, other]: Title: Back to Basics: Motion Representation Matters for Human Motion Generation Using Diffusion Model

Authors: Yuduo Jin, Brandon Haworth

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[566] arXiv:2512.04496 [pdf, ps, other]: Title: Shift-Window Meets Dual Attention: A Multi-Model Architecture for Specular Highlight Removal

Authors: Tianci Huo, Lingfeng Qi, Yuhan Chen, Qihong Xue, Jinyuan Shao, Hai Yu, Jie Li, Zhanhua Zhang, Guofa Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.04487 [pdf, ps, other]: Title: Controllable Long-term Motion Generation with Extended Joint Targets

Authors: Eunjong Lee, Eunhee Kim, Sanghoon Hong, Eunho Jung, Jihoon Kim

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568] arXiv:2512.04485 [pdf, ps, other]: Title: Not All Birds Look The Same: Identity-Preserving Generation For Birds

Authors: Aaron Sun, Oindrila Saha, Subhransu Maji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.04483 [pdf, ps, other]: Title: DeRA: Decoupled Representation Alignment for Video Tokenization

Authors: Pengbo Guo, Junke Wang, Zhen Xing, Chengxu Liu, Daoguo Dong, Xueming Qian, Zuxuan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570] arXiv:2512.04461 [pdf, ps, other]: Title: UniTS: Unified Time Series Generative Model for Remote Sensing

Authors: Yuxiang Zhang, Shunlin Liang, Wenyuan Li, Han Ma, Jianglei Xu, Yichuan Ma, Jiangwei Xie, Wei Li, Mengmeng Zhang, Ran Tao, Xiang-Gen Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.04459 [pdf, ps, other]: Title: dVLM-AD: Enhance Diffusion Vision-Language-Model for Driving via Controllable Reasoning

Authors: Yingzi Ma, Yulong Cao, Wenhao Ding, Shuibai Zhang, Yan Wang, Boris Ivanovic, Ming Jiang, Marco Pavone, Chaowei Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.04456 [pdf, ps, other]: Title: GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis

Authors: Changjin Kim, HyeokJun Lee, YoungJoon Yoo

Comments: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[573] arXiv:2512.04451 [pdf, ps, other]: Title: StreamEQA: Towards Streaming Video Understanding for Embodied Scenarios

Authors: Yifei Wang, Zhenkai Li, Tianwen Qian, Huanran Zheng, Zheng Wang, Yuqian Fu, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574] arXiv:2512.04441 [pdf, ps, other]: Title: MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving

Authors: Bin Sun, Yaoguang Cao, Yan Wang, Rui Wang, Jiachen Shang, Xiejie Feng, Jiayi Lu, Jia Shi, Shichun Yang, Xiaoyu Yan, Ziying Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575] arXiv:2512.04426 [pdf, ps, other]: Title: Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation

Authors: Sidan Zhu, Hongteng Xu, Dixin Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576] arXiv:2512.04425 [pdf, ps, other]: Title: Explainable Parkinsons Disease Gait Recognition Using Multimodal RGB-D Fusion and Large Language Models

Authors: Manar Alnaasan, Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[577] arXiv:2512.04421 [pdf, ps, other]: Title: UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3D Scenes

Authors: Changhe Liu, Ehsan Javanmardi, Naren Bao, Alex Orsholits, Manabu Tsukada

Comments: 13 pages, 10 figures, submitted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[578] arXiv:2512.04413 [pdf, ps, other]: Title: Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Authors: Xiangyi Gao, Danpei Zhao, Bo Yuan, Wentao Li

Comments: 12 pages, 8 figures, 11 tables

Journal-ref: IEEE Transactions on Geoscience and Remote Sensing 63 (2025) 1-11

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579] arXiv:2512.04397 [pdf, ps, other]: Title: Performance Evaluation of Transfer Learning Based Medical Image Classification Techniques for Disease Detection

Authors: Zeeshan Ahmad, Shudi Bao, Meng Chen

Journal-ref: 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Copenhagen, Denmark, 2025, pp. 1-5

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580] arXiv:2512.04395 [pdf, ps, other]: Title: Fourier-Attentive Representation Learning: A Fourier-Guided Framework for Few-Shot Generalization in Vision-Language Models

Authors: Hieu Dinh Trung Pham, Huy Minh Nhat Nguyen, Cuong Tuan Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581] arXiv:2512.04390 [pdf, ps, other]: Title: FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring

Authors: Geunhyuk Youk, Jihyong Oh, Munchurl Kim

Comments: 20 pages, 15 figures. Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582] arXiv:2512.04358 [pdf, ps, other]: Title: MAFNet: Multi-frequency Adaptive Fusion Network for Real-time Stereo Matching

Authors: Ao Xu, Rujin Zhao, Xiong Xu, Boceng Huang, Yujia Jia, Hongfeng Long, Fuxuan Chen, Zilong Cao, Fangyuan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583] arXiv:2512.04356 [pdf, ps, other]: Title: Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment

Authors: Kai-Po Chang, Wei-Yuan Cheng, Chi-Pin Huang, Fu-En Yang, Yu-Chiang Frank Wang

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2026. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[584] arXiv:2512.04331 [pdf, ps, other]: Title: Open Set Face Forgery Detection via Dual-Level Evidence Collection

Authors: Zhongyi Cai, Bryce Gernon, Wentao Bao, Yifan Li, Matthew Wright, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585] arXiv:2512.04329 [pdf, ps, other]: Title: A Retrieval-Augmented Generation Approach to Extracting Algorithmic Logic from Neural Networks

Authors: Waleed Khalid, Dmitry Ignatov, Radu Timofte

Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[586] arXiv:2512.04323 [pdf, ps, other]: Title: Bayes-DIC Net: Estimating Digital Image Correlation Uncertainty with Bayesian Neural Networks

Authors: Biao Chen, Zhenhua Lei, Yahui Zhang, Tongzhi Niu

Comments: 17 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
[587] arXiv:2512.04315 [pdf, ps, other]: Title: SyncTrack4D: Cross-Video Motion Alignment and Video Synchronization for Multi-Video 4D Gaussian Splatting

Authors: Yonghan Lee, Tsung-Wei Huang, Shiv Gehlot, Jaehoon Choi, Guan-Ming Su, Dinesh Manocha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588] arXiv:2512.04314 [pdf, ps, other]: Title: DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision

Authors: Jiashu Liao, Pietro Liò, Marc de Kamps, Duygu Sarikaya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589] arXiv:2512.04313 [pdf, ps, other]: Title: Mind-to-Face: Neural-Driven Photorealistic Avatar Synthesis via EEG Decoding

Authors: Haolin Xiong, Tianwen Fu, Pratusha Bhuvana Prasad, Yunxuan Cai, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, Yajie Zhao

Comments: 16 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590] arXiv:2512.04311 [pdf, ps, other]: Title: Real-time Cricket Sorting By Sex

Authors: Juan Manuel Cantarero Angulo, Matthew Smith

Comments: 13 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[591] arXiv:2512.04309 [pdf, ps, other]: Title: Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction

Authors: Rui Fonseca, Bruno Martins, Gil Rocha

Comments: Submitted to CVPR 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[592] arXiv:2512.04305 [pdf, ps, other]: Title: How (Mis)calibrated is Your Federated CLIP and What To Do About It?

Authors: Mainak Singha, Masih Aminbeidokhti, Paolo Casari, Elisa Ricci, Subhankar Roy

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[593] arXiv:2512.04303 [pdf, ps, other]: Title: Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications

Authors: Gasser Elazab, Maximilian Jansen, Michael Unterreiner, Olaf Hellwich

Comments: Accepted in 3DV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[594] arXiv:2512.04284 [pdf, ps, other]: Title: Learning Single-Image Super-Resolution in the JPEG Compressed Domain

Authors: Sruthi Srinivasan, Elham Shakibapour, Rajy Rawther, Mehdi Saeedi

Comments: 7 pages, 4 figures, 2 tables, SEEDS Workshop, ICIP 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[595] arXiv:2512.04283 [pdf, ps, other]: Title: Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint

Authors: Fan Jia, Yuhao Huang, Shih-Hsin Wang, Cristina Garcia-Cardona, Andrea L. Bertozzi, Bao Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[596] arXiv:2512.04282 [pdf, ps, other]: Title: Inference-time Stochastic Refinement of GRU-Normalizing Flow for Real-time Video Motion Transfer

Authors: Tasmiah Haque, Srinjoy Das

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[597] arXiv:2512.04267 [pdf, ps, other]: Title: UniLight: A Unified Representation for Lighting

Authors: Zitian Zhang, Iliyan Georgiev, Michael Fischer, Yannick Hold-Geoffroy, Jean-François Lalonde, Valentin Deschaintre

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598] arXiv:2512.04248 [pdf, ps, other]: Title: MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models

Authors: Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[599] arXiv:2512.04238 [pdf, ps, other]: Title: 6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models

Authors: Leon Mayer, Piotr Kalinowski, Caroline Ebersbach, Marcel Knopp, Tim Rädsch, Evangelia Christodoulou, Annika Reinke, Fiona R. Kolbinger, Lena Maier-Hein

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600] arXiv:2512.04222 [pdf, ps, other]: Title: ReasonX: MLLM-Guided Intrinsic Image Decomposition

Authors: Alara Dirik, Tuanfeng Wang, Duygu Ceylan, Stefanos Zafeiriou, Anna Frühstück

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[601] arXiv:2512.04221 [pdf, ps, other]: Title: MoReGen: Multi-Agent Motion-Reasoning Engine for Code-based Text-to-Video Synthesis

Authors: Xiangyu Bai, He Liang, Bishoy Galoaa, Utsav Nandi, Shayda Moezzi, Yuhang He, Sarah Ostadabbas

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602] arXiv:2512.04219 [pdf, ps, other]: Title: Generalized Event Partonomy Inference with Structured Hierarchical Predictive Learning

Authors: Zhou Chen, Joe Lin, Sathyanarayanan N. Aakur

Comments: 16 pages, 7 figures, 3 tables. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[603] arXiv:2512.04187 [pdf, ps, other]: Title: OnSight Pathology: A real-time platform-agnostic computational pathology companion for histopathology

Authors: Jinzhen Hu, Kevin Faust, Parsa Babaei Zadeh, Adrienn Bourkas, Shane Eaton, Andrew Young, Anzar Alvi, Dimitrios George Oreopoulos, Ameesha Paliwal, Assem Saleh Alrumeh, Evelyn Rose Kamski-Hennekam, Phedias Diamandis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[604] arXiv:2512.04175 [pdf, ps, other]: Title: Beyond Flicker: Detecting Kinematic Inconsistencies for Generalizable Deepfake Video Detection

Authors: Alejandro Cobo, Roberto Valle, José Miguel Buenaposada, Luis Baumela

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[605] arXiv:2512.05117 (cross-list from cs.LG) [pdf, ps, other]: Title: The Universal Weight Subspace Hypothesis

Authors: Prakhar Kaushik, Shravan Chaudhari, Ankit Vaidya, Rama Chellappa, Alan Yuille

Comments: 37 pages

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606] arXiv:2512.05116 (cross-list from cs.LG) [pdf, ps, other]: Title: Value Gradient Guidance for Flow Matching Alignment

Authors: Zhen Liu, Tim Z. Xiao, Carles Domingo-Enrich, Weiyang Liu, Dinghuai Zhang

Comments: Accepted at NeurIPS 2025; 26 pages, 20 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[607] arXiv:2512.05114 (cross-list from cs.LG) [pdf, ps, other]: Title: Deep infant brain segmentation from multi-contrast MRI

Authors: Malte Hoffmann, Lilla Zöllei, Adrian V. Dalca

Comments: 8 pages, 8 figures, 1 table, website at this https URL, presented at the 2025 IEEE Asilomar Conference on Signals, Systems, and Computers

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[608] arXiv:2512.05103 (cross-list from cs.LG) [pdf, ps, other]: Title: TV2TV: A Unified Framework for Interleaved Language and Video Generation

Authors: Xiaochuang Han, Youssef Emad, Melissa Hall, John Nguyen, Karthik Padthe, Liam Robbins, Amir Bar, Delong Chen, Michal Drozdzal, Maha Elbayad, Yushi Hu, Shang-Wen Li, Sreya Dutta Roy, Jakob Verbeek, XuDong Wang, Marjan Ghazvininejad, Luke Zettlemoyer, Emily Dinan

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[609] arXiv:2512.05094 (cross-list from cs.RO) [pdf, ps, other]: Title: From Generated Human Videos to Physically Plausible Robot Trajectories

Authors: James Ni, Zekai Wang, Wei Lin, Amir Bar, Yann LeCun, Trevor Darrell, Jitendra Malik, Roei Herzig

Comments: For project website, see this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610] arXiv:2512.04814 (cross-list from cs.SD) [pdf, ps, other]: Title: Shared Multi-modal Embedding Space for Face-Voice Association

Authors: Christopher Simic, Korbinian Riedhammer, Tobias Bocklet

Comments: Ranked 1st in Fame 2026 Challenge, ICASSP

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[611] arXiv:2512.04763 (cross-list from cs.LG) [pdf, ps, other]: Title: MemLoRA: Distilling Expert Adapters for On-Device Memory Systems

Authors: Massimo Bini, Ondrej Bohdal, Umberto Michieli, Zeynep Akata, Mete Ozay, Taha Ceritli

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612] arXiv:2512.04705 (cross-list from cs.CC) [pdf, ps, other]: Title: Hardware-aware Neural Architecture Search of Early Exiting Networks on Edge Accelerators

Authors: Alaa Zniber, Arne Symons, Ouassim Karrakchou, Marian Verhelst, Mounir Ghogho

Comments: Submitted to IEEE Transactions on Emerging Topics in Computing

Subjects: Computational Complexity (cs.CC); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)
[613] arXiv:2512.04625 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Decoupled Knowledge Distillation: A Predictive Distribution Perspective

Authors: Bowen Zheng, Ran Cheng

Comments: Accepted to IEEE TNNLS

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[614] arXiv:2512.04556 (cross-list from cs.GR) [pdf, ps, other]: Title: Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

Authors: Zhizhen Wu, Zhe Cao, Yuchi Huo

Comments: 10 pages, 7 figures

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[615] arXiv:2512.04464 (cross-list from cs.LG) [pdf, ps, other]: Title: Feature Engineering vs. Deep Learning for Automated Coin Grading: A Comparative Study on Saint-Gaudens Double Eagles

Authors: Tanmay Dogra, Eric Ngo, Mohammad Alam, Jean-Paul Talavera, Asim Dahal

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[616] arXiv:2512.04385 (cross-list from cs.LG) [pdf, ps, other]: Title: STeP-Diff: Spatio-Temporal Physics-Informed Diffusion Models for Mobile Fine-Grained Pollution Forecasting

Authors: Nan Zhou, Weijie Hong, Huandong Wang, Jianfeng Zheng, Qiuhua Wang, Yali Song, Xiao-Ping Zhang, Yong Li, Xinlei Chen

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617] arXiv:2512.04264 (cross-list from cs.LG) [pdf, ps, other]: Title: Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness

Authors: Long Dang, Thushari Hapuarachchi, Kaiqi Xiong, Jing Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[618] arXiv:2512.04092 (cross-list from physics.soc-ph) [pdf, ps, other]: Title: The changing surface of the world's roads

Authors: Sukanya Randhawa, Guntaj Randhawa, Clemens Langer, Francis Andorful, Benjamin Herfort, Daniel Kwakye, Omer Olchik, Sven Lautenbach, Alexander Zipf

Subjects: Physics and Society (physics.soc-ph); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[619] arXiv:2512.04087 (cross-list from q-bio.NC) [pdf, ps, other]: Title: Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o

Authors: Sui He, Shenbin Qian

Subjects: Neurons and Cognition (q-bio.NC); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)

Thu, 4 Dec 2025

[620] arXiv:2512.04085 [pdf, ps, other]: Title: Unique Lives, Shared World: Learning from Single-Life Videos

Authors: Tengda Han, Sayna Ebrahimi, Dilara Gokay, Li Yang Ku, Maks Ovsjanikov, Iva Babukova, Daniel Zoran, Viorica Patraucean, Joao Carreira, Andrew Zisserman, Dima Damen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621] arXiv:2512.04084 [pdf, ps, other]: Title: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows

Authors: Qinyu Zhao, Guangting Zheng, Tao Yang, Rui Zhu, Xingjian Leng, Stephen Gould, Liang Zheng

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622] arXiv:2512.04082 [pdf, ps, other]: Title: PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Authors: Jiazhe Wei, Ken Li, Tianyu Lao, Haofan Wang, Liang Wang, Caifeng Shan, Chenyang Si

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[623] arXiv:2512.04069 [pdf, ps, other]: Title: SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

Authors: Siyi Chen, Mikaela Angelina Uy, Chan Hee Song, Faisal Ladhak, Adithyavairavan Murali, Qing Qu, Stan Birchfield, Valts Blukis, Jonathan Tremblay

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[624] arXiv:2512.04048 [pdf, ps, other]: Title: Stable Signer: Hierarchical Sign Language Generative Model

Authors: Sen Fang, Yalin Feng, Hongbin Zhong, Yanxin Zhang, Dimitris N. Metaxas

Comments: 12 pages, 7 figures. More demos at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY)
[625] arXiv:2512.04040 [pdf, ps, other]: Title: RELIC: Interactive Video World Model with Long-Horizon Memory

Authors: Yicong Hong, Yiqun Mei, Chongjian Ge, Yiran Xu, Yang Zhou, Sai Bi, Yannick Hold-Geoffroy, Mike Roberts, Matthew Fisher, Eli Shechtman, Kalyan Sunkavalli, Feng Liu, Zhengqi Li, Hao Tan

Comments: 22 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[626] arXiv:2512.04039 [pdf, ps, other]: Title: Fast & Efficient Normalizing Flows and Applications of Image Generative Models

Authors: Sandeep Nagar

Comments: PhD Thesis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[627] arXiv:2512.04025 [pdf, ps, other]: Title: PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation

Authors: Xiaolong Li, Youping Gu, Xi Lin, Weijie Wang, Bohan Zhuang

Comments: Tech report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[628] arXiv:2512.04021 [pdf, ps, other]: Title: C3G: Learning Compact 3D Representations with 2K Gaussians

Authors: Honggyu An, Jaewoo Jung, Mungyeom Kim, Sunghwan Hong, Chaehyun Kim, Kazumi Fukuda, Minkyeong Jeon, Jisang Han, Takuya Narihira, Hyuna Ko, Junsu Kim, Yuki Mitsufuji, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629] arXiv:2512.04019 [pdf, ps, other]: Title: Ultra-lightweight Neural Video Representation Compression

Authors: Ho Man Kwan, Tianhao Peng, Ge Gao, Fan Zhang, Mike Nilsson, Andrew Gower, David Bull

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[630] arXiv:2512.04015 [pdf, ps, other]: Title: Learning Group Actions In Disentangled Latent Image Representations

Authors: Farhana Hossain Swarnali, Miaomiao Zhang, Tonmoy Hossain

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[631] arXiv:2512.04012 [pdf, ps, other]: Title: Emergent Outlier View Rejection in Visual Geometry Grounded Transformers

Authors: Jisang Han, Sunghwan Hong, Jaewoo Jung, Wooseok Jang, Honggyu An, Qianqian Wang, Seungryong Kim, Chen Feng

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[632] arXiv:2512.04007 [pdf, ps, other]: Title: On the Temporality for Sketch Representation Learning

Authors: Marcelo Isaias de Moraes Junior, Moacir Antonelli Ponti

Comments: Preprint submitted to Pattern Recognition Letters

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[633] arXiv:2512.04000 [pdf, ps, other]: Title: Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding

Authors: Jialuo Li, Bin Li, Jiahao Li, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[634] arXiv:2512.03996 [pdf, ps, other]: Title: Highly Efficient Test-Time Scaling for T2I Diffusion Models with Text Embedding Perturbation

Authors: Hang Xu, Linjiang Huang, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[635] arXiv:2512.03992 [pdf, ps, other]: Title: DIQ-H: Evaluating Hallucination Persistence in VLMs Under Temporal Visual Degradation

Authors: Zexin Lin, Hawen Wan, Yebin Zhong, Xiaoqiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[636] arXiv:2512.03981 [pdf, ps, other]: Title: DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment

Authors: Sheng-Hao Liao, Shang-Fu Chen, Tai-Ming Huang, Wen-Huang Cheng, Kai-Lung Hua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[637] arXiv:2512.03979 [pdf, ps, other]: Title: BlurDM: A Blur Diffusion Model for Image Deblurring

Authors: Jin-Ting He, Fu-Jen Tsai, Yan-Tsung Peng, Min-Hung Chen, Chia-Wen Lin, Yen-Yu Lin

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[638] arXiv:2512.03964 [pdf, ps, other]: Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization

Authors: Lianyu Pang, Ji Zhou, Qiping Wang, Baoquan Zhao, Zhenguo Yang, Qing Li, Xudong Mao

Comments: 17 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[639] arXiv:2512.03963 [pdf, ps, other]: Title: TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning

Authors: Tao Wu, Li Yang, Gen Zhan, Yabin Zhang, Yiting Liao, Junlin Li, Deliang Fu, Li Zhang, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[640] arXiv:2512.03939 [pdf, ps, other]: Title: MUT3R: Motion-aware Updating Transformer for Dynamic 3D Reconstruction

Authors: Guole Shen, Tianchen Deng, Xingrui Qin, Nailin Wang, Jianyu Wang, Yanbo Wang, Yongtao Chen, Hesheng Wang, Jingchuan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[641] arXiv:2512.03932 [pdf, ps, other]: Title: Beyond the Ground Truth: Enhanced Supervision for Image Restoration

Authors: Donghun Ryou, Inju Ha, Sanghyeok Chu, Bohyung Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[642] arXiv:2512.03918 [pdf, ps, other]: Title: UniMo: Unifying 2D Video and 3D Human Motion with an Autoregressive Framework

Authors: Youxin Pang, Yong Zhang, Ruizhi Shao, Xiang Deng, Feng Gao, Xu Xiaoming, Xiaoming Wei, Yebin Liu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[643] arXiv:2512.03905 [pdf, ps, other]: Title: Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence

Authors: Shuai Yang, Junxin Lin, Yifan Zhou, Ziwei Liu, Chen Change Loy

Comments: Code: this https URL, Project: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[644] arXiv:2512.03883 [pdf, ps, other]: Title: Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy

Authors: Jorge Tapias Gomez, Despoina Kanata, Aneesh Rangnekar, Christina Lee, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini Veeraraghavan

Comments: 6 pages, 5 figures, 1 table, submitted to ISBI conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[645] arXiv:2512.03869 [pdf, ps, other]: Title: An Automated Framework for Large-Scale Graph-Based Cerebrovascular Analysis

Authors: Daniele Falcetta, Liane S. Canas, Lorenzo Suppa, Matteo Pentassuglia, Jon Cleary, Marc Modat, Sébastien Ourselin, Maria A. Zuluaga

Comments: Submitted to ISBI 2026. 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[646] arXiv:2512.03862 [pdf, ps, other]: Title: Diminishing Returns in Self-Supervised Learning

Authors: Oli Bridge, Huey Sun, Botond Branyicskai-Nagy, Charles D'Ornano, Shomit Basu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[647] arXiv:2512.03854 [pdf, ps, other]: Title: Prostate biopsy whole slide image dataset from an underrepresented Middle Eastern population

Authors: Peshawa J. Muhammad Ali, Navin Vincent, Saman S. Abdulla, Han N. Mohammed Fadhl, Anders Blilie, Kelvin Szolnoky, Julia Anna Mielcarz, Xiaoyi Ji, Kimmo Kartasalo, Abdulbasit K. Al-Talabani, Nita Mulliqi

Comments: 13 pages, 2 figures and 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[648] arXiv:2512.03852 [pdf, ps, other]: Title: Traffic Image Restoration under Adverse Weather via Frequency-Aware Mamba

Authors: Liwen Pan, Longguang Wang, Guangwei Gao, Jun Wang, Jun Shi, Juncheng Li

Comments: 12 pages, 13 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[649] arXiv:2512.03848 [pdf, ps, other]: Title: PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation

Authors: Hania Ghouse, Maryam Alsharqi, Farhad R. Nezami, Muzammil Behzad

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[650] arXiv:2512.03844 [pdf, ps, other]: Title: CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation

Authors: Letian Zhou, Songhua Liu, Xinchao Wang

Comments: 34 pages, 24 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[651] arXiv:2512.03837 [pdf, ps, other]: Title: Heatmap Pooling Network for Action Recognition from RGB Videos

Authors: Mengyuan Liu, Jinfu Liu, Yongkang Jiang, Bin He

Comments: Final Version of IEEE Transactions on Pattern Analysis and Machine Intelligence

Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[652] arXiv:2512.03834 [pdf, ps, other]: Title: Lean Unet: A Compact Model for Image Segmentation

Authors: Ture Hassler, Ida Åkerholm, Marcus Nordström, Gabriele Balletti, Orcun Goksel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[653] arXiv:2512.03827 [pdf, ps, other]: Title: A Robust Camera-based Method for Breath Rate Measurement

Authors: Alexey Protopopov

Comments: 9 pages, 4 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[654] arXiv:2512.03817 [pdf, ps, other]: Title: HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

Authors: Ahmed Nasser, Marwan Mohamed, Alaa Sherif, Basmala Mahmoud, Shereen Yehia, Asmaa Saad, Mariam S. El-Rahmany, Ensaf H. Mohamed

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[655] arXiv:2512.03796 [pdf, ps, other]: Title: LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling

Authors: Hong-Kai Zheng, Piji Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[656] arXiv:2512.03794 [pdf, ps, other]: Title: AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

Authors: Zichuan Lin, Yicheng Liu, Yang Yang, Lvfang Tao, Deheng Ye

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[657] arXiv:2512.03751 [pdf, ps, other]: Title: Research on Brain Tumor Classification Method Based on Improved ResNet34 Network

Authors: Yufeng Li, Wenchao Zhao, Bo Dang, Weimin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[658] arXiv:2512.03749 [pdf, ps, other]: Title: Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Authors: Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[659] arXiv:2512.03746 [pdf, ps, other]: Title: Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Authors: Zirun Guo, Minjie Hong, Feng Zhang, Kai Jia, Tao Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[660] arXiv:2512.03745 [pdf, ps, other]: Title: Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Jiaze Li, Yan Lu, Bin Liu, Guojun Yin, Mang Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[661] arXiv:2512.03730 [pdf, ps, other]: Title: Out-of-the-box: Black-box Causal Attacks on Object Detectors

Authors: Melane Navaratnarajah, David A. Kelly, Hana Chockler

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[662] arXiv:2512.03724 [pdf, ps, other]: Title: PosA-VLA: Enhancing Action Generation via Pose-Conditioned Anchor Attention

Authors: Ziwen Li, Xin Wang, Hanlue Zhang, Runnan Chen, Runqi Lin, Xiao He, Han Huang, Yandong Guo, Fakhri Karray, Tongliang Liu, Mingming Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[663] arXiv:2512.03715 [pdf, ps, other]: Title: DINO-RotateMatch: A Rotation-Aware Deep Framework for Robust Image Matching in Large-Scale 3D Reconstruction

Authors: Kaichen Zhang, Tianxiang Sheng, Xuanming Shi

Comments: 9 pages, 5 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[664] arXiv:2512.03701 [pdf, ps, other]: Title: Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images

Authors: Paula Seidler, Neill D. F. Campbell, Ivor J A Simpson

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[665] arXiv:2512.03687 [pdf, ps, other]: Title: Active Visual Perception: Opportunities and Challenges

Authors: Yian Li, Xiaoyu Guo, Hao Zhang, Shuiwang Li, Xiaowei Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[666] arXiv:2512.03683 [pdf, ps, other]: Title: GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

Authors: Melis Ocal, Xiaoyan Xing, Yue Li, Ngo Anh Vien, Sezer Karaoglu, Theo Gevers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[667] arXiv:2512.03673 [pdf, ps, other]: Title: ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers

Authors: Feice Huang, Zuliang Han, Xing Zhou, Yihuang Chen, Lifei Zhu, Haoqian Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[668] arXiv:2512.03667 [pdf, ps, other]: Title: Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning

Authors: Ge-Peng Ji, Jingyi Liu, Deng-Ping Fan, Nick Barnes

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[669] arXiv:2512.03666 [pdf, ps, other]: Title: ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos

Authors: Qi'ao Xu, Tianwen Qian, Yuqian Fu, Kailing Li, Yang Jiao, Jiacheng Zhang, Xiaoling Wang, Liang He

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[670] arXiv:2512.03663 [pdf, ps, other]: Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Authors: Salim Khazem

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[671] arXiv:2512.03643 [pdf, ps, other]: Title: Optical Context Compression Is Just (Bad) Autoencoding

Authors: Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[672] arXiv:2512.03640 [pdf, ps, other]: Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms

Authors: Jiahao Zhang, Xiao Zhao, Guangyu Gao

Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[673] arXiv:2512.03625 [pdf, ps, other]: Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Authors: Zhigang Yang, Yuan Liu, Jiawei Zhang, Puning Zhang, Xinqiang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[674] arXiv:2512.03621 [pdf, ps, other]: Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[675] arXiv:2512.03619 [pdf, ps, other]: Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[676] arXiv:2512.03601 [pdf, ps, other]: Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Authors: Haoran Zhou, Gim Hee Lee

Comments: Accepted to NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[677] arXiv:2512.03598 [pdf, ps, other]: Title: Memory-Guided Point Cloud Completion for Dental Reconstruction

Authors: Jianan Sun, Yukang Huang, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[678] arXiv:2512.03597 [pdf, ps, other]: Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation

Authors: Fuchen Zheng, Xinyi Chen, Weixuan Li, Quanjun Li, Junhua Zhou, Xiaojiao Guo, Xuhang Chen, Chi-Man Pun, Shoujun Zhou

Comments: 6 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[679] arXiv:2512.03593 [pdf, ps, other]: Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures

Authors: David Svitov, Pietro Morerio, Lourdes Agapito, Alessio Del Bue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[680] arXiv:2512.03592 [pdf, ps, other]: Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Authors: Guang Yang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[681] arXiv:2512.03590 [pdf, ps, other]: Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation

Authors: Yuchen Deng, Xiuyang Wu, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[682] arXiv:2512.03580 [pdf, ps, other]: Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Authors: Malte Bleeker, Mauro Gotsch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[683] arXiv:2512.03577 [pdf, ps, other]: Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Authors: Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[684] arXiv:2512.03575 [pdf, ps, other]: Title: UniComp: Rethinking Video Compression Through Informational Uniqueness

Authors: Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[685] arXiv:2512.03574 [pdf, ps, other]: Title: Global-Local Aware Scene Text Editing

Authors: Fuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[686] arXiv:2512.03566 [pdf, ps, other]: Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Authors: Hao Sun, Lei Fan, Donglin Di, Shaohui Liu

Comments: Accepted by ACM MM Asia 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[687] arXiv:2512.03558 [pdf, ps, other]: Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding

Authors: Huy Quang Ung, Guillaume Habault, Yasutaka Nishimura, Hao Niu, Roberto Legaspi, Tomoki Oya, Ryoichi Kojima, Masato Taya, Chihiro Ono, Atsunori Minamikawa, Yan Liu

Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[688] arXiv:2512.03553 [pdf, ps, other]: Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Authors: Wei Chee Yew, Hailun Xu, Sanjay Saha, Xiaotian Fan, Hiok Hian Ong, David Yuchen Wang, Kanchan Sarkar, Zhenheng Yang, Danhui Guan

Comments: Accepted at KDD 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[689] arXiv:2512.03542 [pdf, ps, other]: Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention

Authors: Nan Sun, Zhenyu Zhang, Xixun Lin, Kun Wang, Yanmin Shang, Naibin Gu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang, Yanan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[690] arXiv:2512.03540 [pdf, ps, other]: Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Authors: Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng

Comments: Accepted by ACM Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[691] arXiv:2512.03534 [pdf, ps, other]: Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Authors: Subin Kim, Sangwoo Mo, Mamshad Nayeem Rizve, Yiran Xu, Difan Liu, Jinwoo Shin, Tobias Hinz

Comments: Visualizations are available at the website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[692] arXiv:2512.03532 [pdf, ps, other]: Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

Authors: Zhishan Zhou, Siyuan Wei, Zengran Wang, Chunjie Wang, Xiaosheng Yan, Xiao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[693] arXiv:2512.03520 [pdf, ps, other]: Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

Authors: Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[694] arXiv:2512.03510 [pdf, ps, other]: Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving

Authors: Zhijian Qiao, Zehuan Yu, Tong Li, Chih-Chung Chou, Wenchao Ding, Shaojie Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[695] arXiv:2512.03509 [pdf, ps, other]: Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model

Authors: Kwaku Opoku-Ware, Gideon Opoku

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[696] arXiv:2512.03508 [pdf, ps, other]: Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Authors: Seogkyu Jeon, Kibeom Hong, Hyeran Byun

Comments: ICCV 2025 (poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[697] arXiv:2512.03500 [pdf, ps, other]: Title: EEA: Exploration-Exploitation Agent for Long Video Understanding

Authors: Te Yang, Xiangyu Zhu, Bo Wang, Quan Chen, Peng Jiang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[698] arXiv:2512.03499 [pdf, ps, other]: Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

Authors: Renqi Chen, Haoyang Su, Shixiang Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[699] arXiv:2512.03479 [pdf, ps, other]: Title: Towards Object-centric Understanding for Instructional Videos

Authors: Wenliang Guo, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[700] arXiv:2512.03477 [pdf, ps, other]: Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

Authors: Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Comments: 10 pages, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[701] arXiv:2512.03474 [pdf, ps, other]: Title: Procedural Mistake Detection via Action Effect Modeling

Authors: Wenliang Guo, Yujiang Pu, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[702] arXiv:2512.03470 [pdf, ps, other]: Title: Difference Decomposition Networks for Infrared Small Target Detection

Authors: Chen Hu, Mingyu Zhou, Shuai Yuan, Hongbo Hu, Xiangyu Qiu, Junhai Luo, Tian Pu, Xiyin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[703] arXiv:2512.03463 [pdf, ps, other]: Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models

Authors: Shojiro Yamabe, Futa Waseda, Daiki Shiono, Tsubasa Takahashi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[704] arXiv:2512.03454 [pdf, ps, other]: Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

Authors: Haicheng Liao, Huanming Shen, Bonan Wang, Yongkang Li, Yihong Tang, Chengyue Wang, Dingyi Zhuang, Kehua Chen, Hai Yang, Chengzhong Xu, Zhenning Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[705] arXiv:2512.03453 [pdf, ps, other]: Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model

Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[706] arXiv:2512.03451 [pdf, ps, other]: Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Authors: Zhiye Song, Steve Dai, Ben Keller, Brucek Khailany

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[707] arXiv:2512.03450 [pdf, ps, other]: Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[708] arXiv:2512.03449 [src]: Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis

Authors: Tongxu Zhang

Comments: The manuscript represents only a preliminary and substantially incomplete exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore this version is withdrawn and should not be cited.

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[709] arXiv:2512.03445 [pdf, ps, other]: Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation

Authors: Xieji Li, Siyuan Yan, Yingsheng Liu, H. Peter Soyer, Monika Janda, Victoria Mar, Zongyuan Ge

Comments: 10 pages. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[710] arXiv:2512.03430 [pdf, ps, other]: Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features

Authors: Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[711] arXiv:2512.03427 [pdf, ps, other]: Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[712] arXiv:2512.03424 [pdf, ps, other]: Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding

Authors: Bin Liu, Chunyang Wang, Xuelian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[713] arXiv:2512.03418 [pdf, ps, other]: Title: YOLOA: Real-Time Affordance Detection via LLM Adapter

Authors: Yuqi Ji, Junjie Ke, Lihuo He, Jun Liu, Kaifan Zhang, Yu-Kun Lai, Guiguang Ding, Xinbo Gao

Comments: 13 pages, 9 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[714] arXiv:2512.03405 [pdf, ps, other]: Title: ViDiC: Video Difference Captioning

Authors: Jiangtao Wu, Shihao Li, Zhaozhou Bian, Jialu Chen, Runzhe Wen, An Ping, Yiwen He, Jiakai Wang, Yuanxing Zhang, Jiaheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[715] arXiv:2512.03404 [pdf, ps, other]: Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification

Authors: Yujian Zhao, Hankun Liu, Guanglin Niu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[716] arXiv:2512.03370 [pdf, ps, other]: Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

Authors: Lingjun Zhao, Yandong Luo, James Hay, Lu Gan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[717] arXiv:2512.03369 [pdf, ps, other]: Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting

Authors: Nan Zhou, Huandong Wang, Jiahao Li, Han Li, Yali Song, Qiuhua Wang, Yong Li, Xinlei Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[718] arXiv:2512.03359 [pdf, ps, other]: Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM

Authors: Md Rashidul Islam, Bakary Gibba, Altagi Abdallah Bakheit Abdelgadir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[719] arXiv:2512.03350 [pdf, ps, other]: Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Authors: Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[720] arXiv:2512.03346 [pdf, ps, other]: Title: Hierarchical Attention for Sparse Volumetric Anomaly Detection in Subclinical Keratoconus

Authors: Lynn Kandakji, William Woof, Nikolas Pontikos

Comments: 16 pages, 7 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[721] arXiv:2512.03345 [pdf, ps, other]: Title: HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration

Authors: Seunghoi Kim, Henry F. J. Tregidgo, Chen Jin, Matteo Figini, Daniel C. Alexander

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[722] arXiv:2512.03339 [pdf, ps, other]: Title: ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography

Authors: Yeganeh Ghamary, Victoria Wu, Hooman Vaseli, Christina Luong, Teresa Tsang, Siavash Bigdeli, Purang Abolmaesumi

Comments: 11 pages, Accepted in IMIMIC Workshop at MICCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[723] arXiv:2512.03335 [pdf, ps, other]: Title: Step-by-step Layered Design Generation

Authors: Faizan Farooq Khan, K J Joseph, Koustava Goswami, Mohamed Elhoseiny, Balaji Vasan Srinivasan

Journal-ref: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[724] arXiv:2512.03317 [pdf, ps, other]: Title: NavMapFusion: Diffusion-based Fusion of Navigation Maps for Online Vectorized HD Map Construction

Authors: Thomas Monninger, Zihan Zhang, Steffen Staab, Sihao Ding

Comments: Accepted to 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[725] arXiv:2512.03284 [pdf, ps, other]: Title: SpatialReasoner: Active Perception for Large-Scale 3D Scene Understanding

Authors: Hongpei Zheng, Shijie Li, Yanran Li, Hujun Yin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[726] arXiv:2512.03257 [pdf, ps, other]: Title: PyroFocus: A Deep Learning Approach to Real-Time Wildfire Detection in Multispectral Remote Sensing Imagery

Authors: Mark Moussa, Andre Williams, Seth Roffe, Douglas Morton

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[727] arXiv:2512.03247 [pdf, ps, other]: Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement

Authors: Haitian Zheng, Yuan Yao, Yongsheng Yu, Yuqian Zhou, Jiebo Luo, Zhe Lin

Comments: Published in the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[728] arXiv:2512.03245 [pdf, ps, other]: Title: 2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition

Authors: Liying Lu, Raphaël Achddou, Sabine Süsstrunk

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[729] arXiv:2512.03237 [pdf, ps, other]: Title: LLM-Guided Material Inference for 3D Point Clouds

Authors: Nafiseh Izadyar, Teseo Schneider

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[730] arXiv:2512.03233 [pdf, ps, other]: Title: Object Counting with GPT-4o and GPT-5: A Comparative Study

Authors: Richard Füzesséry, Kaziwa Saleh, Sándor Szénási, Zoltán Vámossy

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[731] arXiv:2512.03210 [pdf, ps, other]: Title: Flux4D: Flow-based Unsupervised 4D Reconstruction

Authors: Jingkang Wang, Henry Che, Yun Chen, Ze Yang, Lily Goli, Sivabalan Manivasagam, Raquel Urtasun

Comments: NeurIPS 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[732] arXiv:2512.03199 [pdf, ps, other]: Title: Does Head Pose Correction Improve Biometric Facial Recognition?

Authors: Justin Norman, Hany Farid

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[733] arXiv:2512.03182 [pdf, ps, other]: Title: Drainage: A Unifying Framework for Addressing Class Uncertainty

Authors: Yasser Taha, Grégoire Montavon, Nils Körber

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[734] arXiv:2512.03126 [pdf, ps, other]: Title: Hierarchical Process Reward Models are Symbolic Vision Learners

Authors: Shan Zhang, Aotian Chen, Kai Zou, Jindong Gu, Yuan Xue, Anton van den Hengel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[735] arXiv:2512.04076 (cross-list from cs.GR) [pdf, ps, other]: Title: Radiance Meshes for Volumetric Reconstruction

Authors: Alexander Mai, Trevor Hedstrom, George Kopanas, Janne Kontkanen, Falko Kuester, Jonathan T. Barron

Comments: Website: half-potato.gitlab.io/rm

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[736] arXiv:2512.04032 (cross-list from cs.CL) [pdf, ps, other]: Title: Jina-VLM: Small Multilingual Vision Language Model

Authors: Andreas Koukounas, Georgios Mastrapas, Florian Hönicke, Sedigheh Eslami, Guillaume Roncari, Scott Martens, Han Xiao

Comments: 18 pages, 1-7 main content, 13-18 appendix for tables and dataset

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[737] arXiv:2512.03995 (cross-list from cs.RO) [pdf, ps, other]: Title: Artificial Microsaccade Compensation: Stable Vision for an Ornithopter

Authors: Levi Burner, Guido de Croon, Yiannis Aloimonos

Comments: 29 pages, 5 figures, 2 tables, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[738] arXiv:2512.03962 (cross-list from eess.IV) [pdf, ps, other]: Title: Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction

Authors: Evan Bell, Shijun Liang, Ismail Alkhouri, Saiprasad Ravishankar

Comments: 6 pages, 8 figures, 2025 Asilomar Conference on Signals, Systems, and Computers. Code is available at github.com/evanbell02/Tada-DIP/

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[739] arXiv:2512.03656 (cross-list from cs.LG) [pdf, ps, other]: Title: Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting

Authors: Salim Khazem, Houssam Kanso

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[740] arXiv:2512.03556 (cross-list from cs.RO) [pdf, ps, other]: Title: RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL

Authors: Yinzhou Tang, Yu Shang, Yinuo Chen, Bingwen Wei, Xin Zhang, Shu'ang Yu, Liangzhi Shi, Chao Yu, Chen Gao, Wei Wu, Yong Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[741] arXiv:2512.03522 (cross-list from cs.RO) [pdf, ps, other]: Title: MSG-Loc: Multi-Label Likelihood-based Semantic Graph Matching for Object-Level Global Localization

Authors: Gihyeon Lee, Jungwoo Lee, Juwon Kim, Young-Sik Shin, Younggun Cho

Comments: Accepted in IEEE Robotics and Automation Letters (2025)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[742] arXiv:2512.03514 (cross-list from cs.IR) [pdf, ps, other]: Title: M3DR: Towards Universal Multilingual Multimodal Document Retrieval

Authors: Adithya S Kolavi, Vyoman Jain

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[743] arXiv:2512.03422 (cross-list from cs.RO) [pdf, ps, other]: Title: What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models

Authors: Tianchen Deng, Yue Pan, Shenghai Yuan, Dong Li, Chen Wang, Mingrui Li, Long Chen, Lihua Xie, Danwei Wang, Jingchuan Wang, Javier Civera, Hesheng Wang, Weidong Chen

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[744] arXiv:2512.03216 (cross-list from physics.ins-det) [pdf, ps, other]: Title: Kaleidoscopic Scintillation Event Imaging

Authors: Alex Bocchieri, John Mamish, David Appleyard, Andreas Velten

Subjects: Instrumentation and Detectors (physics.ins-det); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[745] arXiv:2512.03173 (cross-list from cs.CY) [pdf, ps, other]: Title: Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping

Authors: Joan Nwatu, Longju Bai, Oana Ignat, Rada Mihalcea

Journal-ref: AAAI 2026 Social Impact Track

Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[746] arXiv:2512.03166 (cross-list from cs.RO) [pdf, ps, other]: Title: Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments

Authors: Aya Taourirte, Md Sohag Mia

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[747] arXiv:2512.03111 (cross-list from q-bio.GN) [pdf, ps, other]: Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer

Authors: Xiaoshui Huang, Tianlin Zhu, Yifan Zuo, Xue Xia, Zonghan Wu, Jiebin Yan, Dingli Hua, Zongyi Xu, Yuming Fang, Jian Zhang

Comments: Accepted by AAAI 2026

Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[748] arXiv:2512.03054 (cross-list from cs.LG) [pdf, ps, other]: Title: Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research

Authors: Ciro Benito Raggio, Lucia Migliorelli, Nils Skupien, Mathias Krohmer Zabaleta, Oliver Blanck, Francesco Cicone, Giuseppe Lucio Cascini, Paolo Zaffino, Maria Francesca Spadea

Comments: 22 pages, 13 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)
[749] arXiv:2512.03052 (cross-list from cs.GR) [pdf, ps, other]: Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale

Authors: Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Haolin Liu, Qingxiang Lin, Jingwei Huang, Chunchao Guo, Xiangyu Yue

Comments: Technical Report

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

