Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 131


Tue, 9 Dec 2025 (showing first 250 of 259 entries)

[132] arXiv:2512.07834 [pdf, ps, other]: Title: Voxify3D: Pixel Art Meets Volumetric Rendering

Authors: Yi-Chuan Huang, Jiewen Chan, Hao-Jen Chien, Yu-Lun Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2512.07833 [pdf, ps, other]: Title: Relational Visual Similarity

Authors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng Li

Comments: Project page, data, and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[134] arXiv:2512.07831 [pdf, ps, other]: Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Authors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya Jia

Comments: Project Website this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2512.07829 [pdf, ps, other]: Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation

Authors: Yuan Gao, Chen Chen, Tianrong Chen, Jiatao Gu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[136] arXiv:2512.07826 [pdf, ps, other]: Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing

Authors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei Xie

Comments: 38 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2512.07821 [pdf, ps, other]: Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling

Authors: Shaoheng Fang, Hanwen Jiang, Yunpeng Bai, Niloy J. Mitra, Qixing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138] arXiv:2512.07807 [pdf, ps, other]: Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes

Authors: Shai Krakovsky, Gal Fiebelman, Sagie Benaim, Hadar Averbuch-Elor

Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[139] arXiv:2512.07806 [pdf, ps, other]: Title: Multi-view Pyramid Transformer: Look Coarser to See Broader

Authors: Gyeongjin Kang, Seungkwon Yang, Seungtae Nam, Younggeun Lee, Jungwoo Kim, Eunbyung Park

Comments: Project page: see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2512.07802 [pdf, ps, other]: Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory

Authors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian Xie

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2512.07778 [pdf, ps, other]: Title: Distribution Matching Variational AutoEncoder

Authors: Sen Ye, Jianning Pei, Mengde Xu, Shuyang Gu, Chunyu Wang, Liwei Wang, Han Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2512.07776 [pdf, ps, other]: Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring

Authors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de Melo

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143] arXiv:2512.07760 [pdf, ps, other]: Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Menglin Wang, Xiaojin Gong, Jiachen Li, Genlin Ji

Comments: Accepted to AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2512.07756 [pdf, ps, other]: Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction

Authors: Mayank Anand, Ujair Alam, Surya Prakash, Priya Shukla, Gora Chand Nandi, Domenec Puig

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145] arXiv:2512.07747 [pdf, ps, other]: Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation

Authors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2512.07745 [pdf, ps, other]: Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Authors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2512.07738 [pdf, ps, other]: Title: HLTCOE Evaluation Team at TREC 2025: VQA Track

Authors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van Durme

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2512.07733 [pdf, ps, other]: Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery

Authors: Meng Cao, Xingyu Li, Xue Liu, Ian Reid, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2512.07730 [pdf, ps, other]: Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Authors: Sangha Park, Seungryong Yoo, Jisoo Mok, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[150] arXiv:2512.07729 [pdf, ps, other]: Title: Improving action classification with brain-inspired deep networks

Authors: Aidas Aglinskas, Stefano Anzellotti

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151] arXiv:2512.07720 [pdf, ps, other]: Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation

Authors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng Lin

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2512.07712 [pdf, ps, other]: Title: UnCageNet: Tracking and Pose Estimation of Caged Animal

Authors: Sayak Dutta, Harish Katti, Shashikant Verma, Shanmuganathan Raman

Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2512.07703 [pdf, ps, other]: Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation

Authors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios Christodoulidis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[154] arXiv:2512.07702 [pdf, ps, other]: Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment

Authors: Sangha Park, Eunji Kim, Yeongtak Oh, Jooyoung Choi, Sungroh Yoon

Comments: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155] arXiv:2512.07698 [pdf, ps, other]: Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only

Authors: Arslan Artykov, Corentin Sautier, Vincent Lepetit

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156] arXiv:2512.07674 [pdf, ps, other]: Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations

Authors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge Cardoso

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157] arXiv:2512.07668 [pdf, ps, other]: Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset

Authors: Ronan John, Aditya Kesari, Vincenzo DiMatteo, Kristin Dana

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2512.07661 [pdf, ps, other]: Title: Optimization-Guided Diffusion for Interactive Scene Generation

Authors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2512.07652 [pdf, ps, other]: Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research

Authors: Hamad Almazrouei, Mariam Al Nasseri, Maha Alzaabi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[160] arXiv:2512.07651 [pdf, ps, other]: Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method

Authors: Yuanye Liu, Hanxiao Zhang, Nannan Shi, Yuxin Shi, Arif Mahmood, Murtaza Taj, Xiahai Zhuang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2512.07628 [pdf, ps, other]: Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation

Authors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2512.07606 [pdf, ps, other]: Title: Decomposition Sampling for Efficient Region Annotations in Active Learning

Authors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina Breininger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2512.07599 [pdf, ps, other]: Title: Online Segment Any 3D Thing as Instance Tracking

Authors: Hanshi Wang, Zijian Cai, Jin Gao, Yiwei Zhang, Weiming Hu, Ke Wang, Zhipeng Zhang

Comments: NeurIPS 2025, Code is at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2512.07596 [pdf, ps, other]: Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery

Authors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long Bai

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[165] arXiv:2512.07590 [pdf, ps, other]: Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation

Authors: Kaili Qi, Zhongyi Huang, Wenli Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2512.07584 [pdf, ps, other]: Title: LongCat-Image Technical Report

Authors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2512.07580 [pdf, ps, other]: Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs

Authors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2512.07568 [pdf, ps, other]: Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation

Authors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[169] arXiv:2512.07564 [pdf, ps, other]: Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models

Authors: Kassoum Sanogo, Renzo Ardiccioni

Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[170] arXiv:2512.07527 [pdf, ps, other]: Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

Authors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[171] arXiv:2512.07514 [pdf, ps, other]: Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes

Authors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2512.07504 [pdf, ps, other]: Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points

Authors: Ryota Okumura, Kaede Shiohara, Toshihiko Yamasaki

Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173] arXiv:2512.07503 [pdf, ps, other]: Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation

Authors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2512.07500 [pdf, ps, other]: Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer

Authors: Penghui Liu, Jiangshan Wang, Yutong Shen, Shanhui Mo, Chenyang Qi, Yue Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2512.07498 [pdf, ps, other]: Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior

Authors: Chih-Chung Hsu, Shao-Ning Chen, Chia-Ming Lee, Yi-Fang Wang, Yi-Shiuan Chou

Comments: 16 pages (including appendix)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2512.07480 [pdf, ps, other]: Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance

Authors: Naifu Xue, Zhaoyang Jia, Jiahao Li, Bin Li, Zihan Zheng, Yuan Zhang, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2512.07469 [pdf, ps, other]: Title: Unified Video Editing with Temporal Reasoner

Authors: Xiangpeng Yang, Ji Xie, Yiyuan Yang, Yan Huang, Min Xu, Qiang Wu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2512.07426 [pdf, ps, other]: Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing

Authors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa Yousif

Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[179] arXiv:2512.07415 [pdf, ps, other]: Title: Data-driven Exploration of Mobility Interaction Patterns

Authors: Gabriele Galatolo, Mirco Nanni

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180] arXiv:2512.07410 [pdf, ps, other]: Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs

Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya Wang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2512.07394 [pdf, ps, other]: Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video

Authors: Zhifan Zhu, Siddhant Bansal, Shashank Tripathi, Dima Damen

Comments: webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2512.07391 [pdf, ps, other]: Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring

Authors: Đorđe Nedeljković

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2512.07385 [pdf, ps, other]: Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline

Authors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2512.07383 [pdf, ps, other]: Title: LogicCBMs: Logic-Enhanced Concept-Based Learning

Authors: Deepika SN Vemuri, Gautham Bellamkonda, Aditya Pola, Vineeth N Balasubramanian

Comments: 18 pages, 19 figures, WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2512.07381 [pdf, ps, other]: Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects

Authors: Shuohan Tao, Boyao Zhou, Hanzhang Tu, Yuwang Wang, Yebin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186] arXiv:2512.07379 [pdf, ps, other]: Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency

Authors: Mahila Moghadami, Mohammad Ali Keyvanrad, Melika Sabaghian

Comments: 22 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2512.07360 [pdf, ps, other]: Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation

Authors: Qiming Huang, Hao Ai, Jianbo Jiao

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[188] arXiv:2512.07351 [pdf, ps, other]: Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection

Authors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami Azam

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
[189] arXiv:2512.07348 [pdf, ps, other]: Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition

Authors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei Zhang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2512.07345 [pdf, ps, other]: Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting

Authors: Shilong Jin, Haoran Duan, Litao Hua, Wentao Huang, Yuan Zhou

Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2512.07338 [pdf, ps, other]: Title: Generalized Referring Expression Segmentation on Aerial Photos

Authors: Luís Marnoto, Alexandre Bernardino, Bruno Martins

Comments: Submitted to IEEE J-STARS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2512.07331 [pdf, ps, other]: Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers

Authors: Kanishk Awadhiya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2512.07328 [pdf, ps, other]: Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation

Authors: Ziyang Mai, Yu-Wing Tai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[194] arXiv:2512.07305 [pdf, ps, other]: Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset

Authors: Tobias Abraham Haider

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2512.07302 [pdf, ps, other]: Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts

Authors: Mingning Guo, Mengwei Wu, Shaoxian Li, Haifeng Li, Chao Tao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[196] arXiv:2512.07276 [pdf, ps, other]: Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery

Authors: Mai Tsujimoto, Junjue Wang, Weihao Xuan, Naoto Yokoya

Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2512.07275 [pdf, ps, other]: Title: Effective Attention-Guided Multi-Scale Medical Network for Skin Lesion Segmentation

Authors: Siyu Wang, Hua Wang, Huiyu Li, Fan Zhang

Comments: The paper has been accepted by BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198] arXiv:2512.07273 [pdf, ps, other]: Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation

Authors: Zhi Rao, Yucheng Zhou, Benjia Zhou, Yiqing Huang, Sergio Escalera, Jun Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2512.07269 [pdf, ps, other]: Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data

Authors: Mike Diessner, Yannick Tarant

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[200] arXiv:2512.07253 [pdf, ps, other]: Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement

Authors: Handing Xu, Zhenguo Nie, Tairan Peng, Huimin Pan, Xin-Jun Liu

Comments: 18 pages, 8 figures, and 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[201] arXiv:2512.07251 [pdf, ps, other]: Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement

Authors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2512.07247 [pdf, ps, other]: Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Authors: Ziming Hong, Tianyu Huang, Runnan Chen, Shanshan Ye, Mingming Gong, Bo Han, Tongliang Liu

Comments: 40 pages, 34 figures, 18 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[203] arXiv:2512.07245 [pdf, ps, other]: Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features

Authors: Toshinori Yamauchi, Hiroshi Kera, Kazuhiko Kawamoto

Comments: 11+6 pages, 8 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2512.07241 [pdf, ps, other]: Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture

Authors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul Islam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2512.07237 [pdf, ps, other]: Title: Unified Camera Positional Encoding for Controlled Video Generation

Authors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei Cai

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2512.07234 [pdf, ps, other]: Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models

Authors: Biao Chen, Lin Zuo, Mengmeng Jing, Kunbin He, Yuchen Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[207] arXiv:2512.07230 [pdf, ps, other]: Title: STRinGS: Selective Text Refinement in Gaussian Splatting

Authors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand Tapaswi

Comments: Accepted to WACV 2026. Project Page, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2512.07229 [pdf, ps, other]: Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery

Authors: Fang Zhou, Zhiqiang Chen, Martin Pavlovski, Yizhong Zhang

Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209] arXiv:2512.07228 [pdf, ps, other]: Title: Towards Robust Protective Perturbation against DeepFake Face Swapping

Authors: Hengyang Yao, Lin Li, Ke Sun, Jianing Qiu, Huiping Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[210] arXiv:2512.07215 [pdf, ps, other]: Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation

Authors: Md Selim Sarowar, Sungho Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[211] arXiv:2512.07211 [pdf, ps, other]: Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds

Authors: Frederik Hagelskjær, Dimitrios Arapis, Steffen Madsen, Thorbjørn Mosekjær Iversen

Comments: 8 pages, 8 figures, 5 tables, ICCR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2512.07206 [pdf, ps, other]: Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT

Authors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213] arXiv:2512.07203 [pdf, ps, other]: Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning

Authors: Xuhui Zheng, Kang An, Ziliang Wang, Yuhang Wang, Faqiang Qian, Yichao Wu

Comments: 7 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2512.07201 [pdf, ps, other]: Title: Understanding Diffusion Models via Code Execution

Authors: Cheng Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[215] arXiv:2512.07198 [pdf, ps, other]: Title: Generating Storytelling Images with Rich Chains-of-Reasoning

Authors: Xiujie Song, Qi Jia, Shota Watanabe, Xiaoyi Pang, Ruijie Chen, Mengyue Wu, Kenny Q. Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[216] arXiv:2512.07197 [pdf, ps, other]: Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting

Authors: Seokhyun Youn, Soohyun Lee, Geonho Kim, Weeyoung Kwon, Sung-Ho Bae, Jihyong Oh

Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2512.07192 [pdf, ps, other]: Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression

Authors: Niu Yi, Xu Tianyi, Ma Mingming, Wang Xinkun

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2512.07191 [pdf, ps, other]: Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction

Authors: Wenqi Zhao, Jiacheng Sang, Fenghua Cheng, Yonglu Shu, Dong Li, Xiaofeng Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2512.07190 [pdf, ps, other]: Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification

Authors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2512.07186 [pdf, ps, other]: Title: START: Spatial and Textual Learning for Chart Understanding

Authors: Zhuoming Liu, Xiaofeng Gao, Feiyang Niu, Qiaozi Gao, Liu Liu, Robinson Piramuthu

Comments: WACV2026 Camera Ready

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[221] arXiv:2512.07171 [pdf, ps, other]: Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration

Authors: Shravan Venkatraman, Rakesh Raj Madavan, Pavan Kumar S, Muthu Subash Kavitha

Comments: 21 pages, 11 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2512.07170 [pdf, ps, other]: Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach

Authors: Jiayang Li, Chengjie Jiang, Junjun Jiang, Pengwei Liang, Jiayi Ma, Liqiang Nie

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[223] arXiv:2512.07166 [pdf, ps, other]: Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing

Authors: Siyuan Xu, Yibing Liu, Peilin Chen, Yung-Hui Li, Shiqi Wang, Sam Kwong

Comments: 9 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224] arXiv:2512.07165 [pdf, ps, other]: Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation

Authors: Muyu Xu, Fangneng Zhan, Xiaoqin Zhang, Ling Shao, Shijian Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2512.07155 [pdf, ps, other]: Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics

Authors: Dahyeon Kye, Jeahun Sung, Mingyu Jeon, Jihyong Oh

Comments: Please visit our project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2512.07141 [pdf, ps, other]: Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models

Authors: Fenghua Weng, Chaochao Lu, Xia Hu, Wenqi Shao, Wenjie Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[227] arXiv:2512.07136 [pdf, ps, other]: Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning

Authors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang Xing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[228] arXiv:2512.07135 [pdf, ps, other]: Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning

Authors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[229] arXiv:2512.07128 [pdf, ps, other]: Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP

Authors: Chau Truong, Hieu Ta Quang, Dung D. Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230] arXiv:2512.07126 [pdf, ps, other]: Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On

Authors: Shengjie Lu, Zhibin Wan, Jiejie Liu, Quan Zhang, Mingjie Sun

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2512.07110 [pdf, ps, other]: Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection

Authors: Liangwei Jiang, Jinluo Xie, Yecheng Huang, Hua Zhang, Hongyu Yang, Di Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2512.07107 [pdf, ps, other]: Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision

Authors: Jaeyoon Lee, Hojoon Jung, Sungtae Hwang, Jihyong Oh, Jongwon Choi

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2512.07078 [pdf, ps, other]: Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection

Authors: Bo Gao, Jingcheng Tong, Xingsheng Chen, Han Yu, Zichen Li

Comments: 16 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234] arXiv:2512.07076 [pdf, ps, other]: Title: Context-measure: Contextualizing Metric for Camouflage

Authors: Chen-Yang Wang, Gepeng Ji, Song Shao, Ming-Ming Cheng, Deng-Ping Fan

Comments: Technical Report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[235] arXiv:2512.07065 [pdf, ps, other]: Title: Persistent Homology-Guided Frequency Filtering for Image Compression

Authors: Anil Chintapalli, Peter Tenholder, Henry Chen, Arjun Rao

Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2512.07062 [pdf, ps, other]: Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction

Authors: Changliang Xia, Chengyou Jia, Minnan Luo, Zhuohang Dang, Xin Shen, Bowen Ping

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[237] arXiv:2512.07052 [pdf, ps, other]: Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting

Authors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo Tartaglione

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2512.07051 [pdf, ps, other]: Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation

Authors: Adnan Munir, Shujaat Khan

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[239] arXiv:2512.07037 [pdf, ps, other]: Title: Evaluating and Preserving High-level Fidelity in Super-Resolution

Authors: Josep M. Rocafort, Shaolin Su, Alexandra Gomez-Villa, Javier Vazquez-Corral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240] arXiv:2512.07034 [pdf, ps, other]: Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues

Authors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit Yeung

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241] arXiv:2512.06981 [pdf, ps, other]: Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation

Authors: Yuemin Wang, Ian Stavness

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[242] arXiv:2512.06949 [pdf, ps, other]: Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology

Authors: Shravan Venkatraman, Muthu Subash Kavitha, Joe Dhanith P R, V Manikandarajan, Jia Wu

Comments: 19 pages, 5 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2512.06921 [pdf, ps, other]: Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification

Authors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen Lei

Comments: Accepted by IEEE ICIA 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[244] arXiv:2512.06905 [pdf, ps, other]: Title: Scaling Zero-Shot Reference-to-Video Generation

Authors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen He

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2512.06888 [pdf, ps, other]: Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation

Authors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246] arXiv:2512.06886 [pdf, ps, other]: Title: Balanced Learning for Domain Adaptive Semantic Segmentation

Authors: Wangkai Li, Rui Sun, Bohao Liao, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by International Conference on Machine Learning (ICML 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247] arXiv:2512.06885 [pdf, ps, other]: Title: JoPano: Unified Panorama Generation via Joint Modeling

Authors: Wancheng Feng, Chen An, Zhenliang He, Meina Kan, Shiguang Shan, Lukun Wang

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[248] arXiv:2512.06882 [pdf, ps, other]: Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion

Authors: Yu Zhu, Naoya Chiba, Koichi Hashimoto

Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2512.06877 [pdf, ps, other]: Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification

Authors: Mohammed Q. Alkhatib, Ali Jamali, Swalpa Kumar Roy

Comments: Accepted and presented at ICSPIS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250] arXiv:2512.06870 [pdf, ps, other]: Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective

Authors: Wangkai Li, Rui Sun, Zhaoyang Li, Tianzhu Zhang

Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251] arXiv:2512.06866 [pdf, ps, other]: Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior

Authors: Yulin Li, Haokun Gui, Ziyang Fan, Junjie Wang, Bin Kang, Bin Chen, Zhuotao Tian

Comments: Accepted by NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[252] arXiv:2512.06865 [pdf, ps, other]: Title: Spatial Retrieval Augmented Autonomous Driving

Authors: Xiaosong Jia, Chenhe Zhang, Yule Jiang, Songbur Wong, Zhiyuan Zhang, Chen Chen, Shaofeng Zhang, Xuanhe Zhou, Xue Yang, Junchi Yan, Yu-Gang Jiang

Comments: Demo Page: this https URL with open sourced code, dataset, and checkpoints

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2512.06864 [pdf, ps, other]: Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-Training

Authors: Kaixuan Lu, Mehmet Onurcan Kaya, Dim P. Papadopoulos

Comments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254] arXiv:2512.06862 [pdf, ps, other]: Title: Omni-Referring Image Segmentation

Authors: Qiancheng Zheng, Yunhang Shen, Gen Luo, Baiyang Song, Xing Sun, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255] arXiv:2512.06849 [pdf, ps, other]: Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CT

Authors: Matan Atad, Alexander W. Marka, Lisa Steinhelfer, Anna Curto-Vilalta, Yannik Leonhardt, Sarah C. Foreman, Anna-Sophia Walburga Dietrich, Robert Graf, Alexandra S. Gersing, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke, Hendrik Möller

Comments: In submission

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256] arXiv:2512.06845 [pdf, ps, other]: Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly Detection

Authors: Satoshi Hashimoto, Hitoshi Nishimura, Yanan Wang, Mori Kurokawa

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257] arXiv:2512.06840 [pdf, ps, other]: Title: CADE: Continual Weakly-supervised Video Anomaly Detection with Ensembles

Authors: Satoshi Hashimoto, Tatsuya Konishi, Tomoya Kaichi, Kazunori Matsumoto, Mori Kurokawa

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258] arXiv:2512.06838 [pdf, ps, other]: Title: SparseCoop: Cooperative Perception with Kinematic-Grounded Queries

Authors: Jiahao Wang, Zhongwei Jiang, Wenchao Sun, Jiaru Zhong, Haibao Yu, Yuner Zhang, Chenyang Lu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang Wang

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259] arXiv:2512.06818 [pdf, ps, other]: Title: MeshSplatting: Differentiable Rendering with Opaque Meshes

Authors: Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. Lin, Marc Van Droogenbroeck, Andrea Tagliasacchi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260] arXiv:2512.06811 [pdf, ps, other]: Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models

Authors: Xiang Lin, Weixin Li, Shu Guo, Lihong Wang, Di Huang

Comments: Accepted by AAAI 2026 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[261] arXiv:2512.06810 [pdf, ps, other]: Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning

Authors: Yueqian Wang, Songxiang Liu, Disong Wang, Nuo Xu, Guanglu Wan, Huishuai Zhang, Dongyan Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[262] arXiv:2512.06802 [pdf, ps, other]: Title: VDOT: Efficient Unified Video Creation via Optimal Transport Distillation

Authors: Yutong Wang, Haiyu Zhang, Tianfan Xue, Yu Qiao, Yaohui Wang, Chang Xu, Xinyuan Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263] arXiv:2512.06793 [pdf, ps, other]: Title: Generalized Geometry Encoding Volume for Real-time Stereo Matching

Authors: Jiaxin Liu, Gangwei Xu, Xianqi Wang, Chengliang Zhang, Xin Yang

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264] arXiv:2512.06783 [pdf, ps, other]: Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-Videos

Authors: Tobias Leuthold, Michele Xiloyannis, Yves Zimmermann

Comments: 16 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265] arXiv:2512.06774 [pdf, ps, other]: Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian Splatting

Authors: Longjie Zhao, Ziming Hong, Zhenyang Ren, Runnan Chen, Mingming Gong, Tongliang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[266] arXiv:2512.06769 [pdf, ps, other]: Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial Understanding

Authors: Hang Yin, Xiaomin He, PeiWen Yuan, Yiwei Li, Jiayi Shi, Wenxiao Fan, Shaoxiong Feng, Kan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[267] arXiv:2512.06763 [pdf, ps, other]: Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control Algorithms

Authors: Chengyang Yan, Mitch Bryson, Donald G. Dansereau

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268] arXiv:2512.06759 [pdf, ps, other]: Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language Priors

Authors: Wenbo Lyu, Yingjun Du, Jinglin Zhao, Xianton Zhen, Ling Shao

Comments: 12 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[269] arXiv:2512.06750 [pdf, ps, other]: Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and Enhancement

Authors: Weiqi Li, Xuanyu Zhang, Bin Chen, Jingfen Xie, Yan Wang, Kexin Zhang, Junlin Li, Li Zhang, Jian Zhang, Shijie Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270] arXiv:2512.06746 [pdf, ps, other]: Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image Detection

Authors: Ruoxin Chen, Jiahui Gao, Kaiqing Lin, Keyue Zhang, Yandan Zhao, Isabel Guan, Taiping Yao, Shouhong Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[271] arXiv:2512.06738 [pdf, ps, other]: Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain Adaptation

Authors: M Yashwanth, Sampath Koti, Arunabh Singh, Shyam Marjit, Anirban Chakraborty

Comments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272] arXiv:2512.06736 [pdf, ps, other]: Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton Data

Authors: Jiaxing Fan, Jiaojiao Liu, Wenkong Wang, Yang Zhang, Xin Ma, Jichen Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2512.06726 [pdf, ps, other]: Title: The Role of Entropy in Visual Grounding: Analysis and Optimization

Authors: Shuo Li, Jiajun Sun, Zhihao Zhang, Xiaoran Fan, Senjie Jin, Hui Li, Yuming Yang, Junjie Ye, Lixing Shen, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[274] arXiv:2512.06689 [pdf, ps, other]: Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and Separation

Authors: Jisoo Park, Seonghak Lee, Guisik Kim, Taewoo Kim, Junseok Kwon

Comments: Accepted to ASRU 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
[275] arXiv:2512.06684 [pdf, ps, other]: Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy

Authors: Yumeng He, Zanwei Zhou, Yekun Zheng, Chen Liang, Yunbo Wang, Xiaokang Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276] arXiv:2512.06674 [pdf, ps, other]: Title: RunawayEvil: Jailbreaking the Image-to-Video Generative Models

Authors: Songping Wang, Rufan Qian, Yueming Lyu, Qinglong Liu, Linzhuang Zou, Jie Qin, Songhua Liu, Caifeng Shan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277] arXiv:2512.06673 [pdf, ps, other]: Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and Reasoning

Authors: Shida Gao, Feng Xue, Xiangfeng Wang, Anlong Ming, Teng Long, Yihua Shao, Haozhe Wang, Zhaowen Lin, Wei Wang, Nicu Sebe

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278] arXiv:2512.06663 [pdf, ps, other]: Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks

Authors: Yu Qi, Yumeng Zhang, Chenting Gong, Xiao Tan, Weiming Zhang, Wei Zhang, Jingdong Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279] arXiv:2512.06662 [pdf, ps, other]: Title: Personalized Image Descriptions from Attention Sequences

Authors: Ruoyu Xue, Hieu Le, Jingyi Xu, Sounak Mondal, Abe Leite, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

Comments: 10 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280] arXiv:2512.06657 [pdf, ps, other]: Title: TextMamba: Scene Text Detector with Mamba

Authors: Qiyan Zhao, Yue Yan, Da-Han Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[281] arXiv:2512.06642 [pdf, ps, other]: Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution

Authors: Achmad Ardani Prasha, Clavino Ourizqi Rachmadi, Muhamad Fauzan Ibnu Syahlan, Naufal Rahfi Anugerah, Nanda Garin Raditya, Putri Amelia, Sabrina Laila Mutiara, Hilman Syachr Ramadhan

Comments: 21 pages, 7 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[282] arXiv:2512.06613 [pdf, ps, other]: Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic Approach

Authors: Yueying Ke

Comments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course project

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283] arXiv:2512.06612 [pdf, ps, other]: Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics

Authors: Kazuya Nishimura, Haruka Hirose, Ryoma Bise, Kaito Shiku, Yasuhiro Kojima

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284] arXiv:2512.06598 [pdf, ps, other]: Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake Champlain

Authors: Muhammad Adil, Patrick J. Clemins, Andrew W. Schroth, Panagiotis D. Oikonomou, Donna M. Rizzo, Peter D. F. Isles, Xiaohan Zhang, Kareem I. Hannoun, Scott Turnbull, Noah B. Beckage, Asim Zia, Safwan Wshah

Comments: 23 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285] arXiv:2512.06581 [pdf, ps, other]: Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding

Authors: Yuhao Su, Anwesa Choudhuri, Zhongpai Gao, Benjamin Planche, Van Nguyen Nguyen, Meng Zheng, Yuhan Shen, Arun Innanje, Terrence Chen, Ehsan Elhamifar, Ziyan Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286] arXiv:2512.06575 [pdf, ps, other]: Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules

Authors: Fariza Dahes

Comments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LG

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287] arXiv:2512.06565 [pdf, ps, other]: Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose Estimation

Authors: Xiujin Liu

Comments: 1 figure, 2 tables, 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288] arXiv:2512.06562 [pdf, ps, other]: Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities

Authors: Dung Thuy Nguyen, Quang Nguyen, Preston K. Robinette, Eli Jiang, Taylor T. Johnson, Kevin Leach

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[289] arXiv:2512.06560 [pdf, ps, other]: Title: Bridging spatial awareness and global context in medical image segmentation

Authors: Dalia Alzu'bi, A. Ben Hamza

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290] arXiv:2512.06531 [pdf, ps, other]: Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI Images

Authors: Sayan Das (1), Arghadip Biswas (2) ((1) IIIT Delhi, (2) Jadavpur University)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[291] arXiv:2512.06530 [pdf, ps, other]: Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-Generalization

Authors: Mohammed Wattad, Tamir Shor, Alex Bronstein

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292] arXiv:2512.06521 [pdf, ps, other]: Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife Images

Authors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)

Comments: 31 pages + appendix

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293] arXiv:2512.06504 [pdf, ps, other]: Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion

Authors: Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana Zahorodnia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[294] arXiv:2512.06485 [pdf, ps, other]: Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based Interaction

Authors: Kush Revankar, Shreyas Deshpande, Araham Sayeed, Ansh Tandale, Sarika Bobde

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295] arXiv:2512.06447 [pdf, ps, other]: Title: Towards Stable Cross-Domain Depression Recognition under Missing Modalities

Authors: Jiuyi Chen, Mingkui Tan, Haifeng Lu, Qiuna Xu, Zhihua Wang, Runhao Zeng, Xiping Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296] arXiv:2512.06438 [pdf, ps, other]: Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars

Authors: Ramazan Fazylov, Sergey Zagoruyko, Aleksandr Parkin, Stamatis Lefkimmiatis, Ivan Laptev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297] arXiv:2512.06434 [pdf, ps, other]: Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular Screening

Authors: Lucas R. Mareque, Ricardo L. Armentano, Leandro J. Cymberknop

Comments: 8 pages, 2 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298] arXiv:2512.06426 [pdf, ps, other]: Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range Recognition

Authors: Nzakiese Mbongo, Kailash A. Hambarde, Hugo Proença

Comments: 12 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299] arXiv:2512.06424 [pdf, ps, other]: Title: DragMesh: Interactive 3D Generation Made Easy

Authors: Tianshan Zhang, Zeyu Zhang, Hao Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300] arXiv:2512.06422 [pdf, ps, other]: Title: A Perception CNN for Facial Expression Recognition

Authors: Chunwei Tian, Jingyuan Xie, Lingjun Li, Wangmeng Zuo, Yanning Zhang, David Zhang

Comments: in IEEE Transactions on Image Processing (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301] arXiv:2512.06421 [pdf, ps, other]: Title: Rethinking Training Dynamics in Scale-wise Autoregressive Generation

Authors: Gengze Zhou, Chongjian Ge, Hao Tan, Feng Liu, Yicong Hong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[302] arXiv:2512.06400 [pdf, ps, other]: Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene Enhancement

Authors: Jing Tao, Yonghong Zong, Banglei Guana, Pengju Sun, Taihang Lei, Yang Shanga, Qifeng Yu

Comments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303] arXiv:2512.06379 [pdf, ps, other]: Title: OCFER-Net: Recognizing Facial Expression in Online Learning System

Authors: Yi Huo, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304] arXiv:2512.06377 [pdf, ps, other]: Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System

Authors: Yi Huo, Yun Ge

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305] arXiv:2512.06376 [pdf, ps, other]: Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework

Authors: Xinhao Xiang, Abhijeet Rastogi, Jiawei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306] arXiv:2512.06373 [pdf, ps, other]: Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

Authors: Yuji Wang, Wenlong Liu, Jingxuan Niu, Haoji Zhang, Yansong Tang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2512.06368 [pdf, ps, other]: Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos

Authors: Weitao Xiong, Zhiyuan Yuan, Jiahao Lu, Chengfeng Zhao, Peng Li, Yuan Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308] arXiv:2512.06363 [pdf, ps, other]: Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection

Authors: Jiabao Guo, Yadian Wang, Hui Ma, Yuhao Fu, Ju Jia, Hui Liu, Shengeng Tang, Lechao Cheng, Yunfeng Diao, Ajian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309] arXiv:2512.06358 [pdf, ps, other]: Title: Rectifying Latent Space for Generative Single-Image Reflection Removal

Authors: Mingjia Li, Jin Hu, Hainuo Wang, Qiming Hu, Jiarui Wang, Xiaojie Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310] arXiv:2512.06353 [pdf, ps, other]: Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search

Authors: Kaicheng Yang, Kaisen Yang, Baiting Wu, Xun Zhang, Qianrui Yang, Haotong Qin, He Zhang, Yulun Zhang

Comments: Code and supplementary material can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311] arXiv:2512.06345 [pdf, ps, other]: Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes

Authors: Xiangshuai Song, Jun-Jie Huang, Tianrui Liu, Ke Liang, Chang Tang

Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2512.06344 [pdf, ps, other]: Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate

Authors: Kaile Wang, Lijun He, Haisheng Fu, Haixia Bi, Fan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2512.06332 [pdf, ps, other]: Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks

Authors: Jeffrey Gu, Minkyu Jeon, Ambri Ma, Serena Yeung-Levy, Ellen D. Zhong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314] arXiv:2512.06330 [pdf, ps, other]: Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening

Authors: Haoyu Zhang, Junhan Luo, Yugang Cao, Siran Peng, Jie Huang, Liangjian-Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315] arXiv:2512.06328 [pdf, ps, other]: Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models

Authors: Jiahao Li, Yusheng Luo, Yunzhong Lou, Xiangdong Zhou

Comments: Accepted as an Oral presentation at AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316] arXiv:2512.06306 [pdf, ps, other]: Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

Authors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xaoming Chen, Weidong Cai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317] arXiv:2512.06290 [pdf, ps, other]: Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification

Authors: Yiheng Huang, Shuang She, Zewei Wei, Jianmin Lin, Ming Yang, Wenyin Liu

Comments: 17 pages, 5 figures

Journal-ref: ICDAR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318] arXiv:2512.06282 [pdf, ps, other]: Title: A Sleep Monitoring System Based on Audio, Video and Depth Information

Authors: Lyn Chao-ling Chen, Kuan-Wen Chen, Yi-Ping Hung

Comments: Accepted in the Computer Vision, Graphics and Image Processing (CVGIP 2013)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[319] arXiv:2512.06281 [pdf, ps, other]: Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models

Authors: Hengzhuang Li, Xinsong Zhang, Qiming Peng, Bin Luo, Han Hu, Dengyang Jiang, Han-Jia Ye, Teng Zhang, Hai Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320] arXiv:2512.06276 [pdf, ps, other]: Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension

Authors: Tianyi Gao, Hao Li, Han Fang, Xin Wei, Xiaodong Dong, Hongbo Sun, Ye Yuan, Zhongjiang He, Jinglin Xu, Jingmin Xin, Hao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[321] arXiv:2512.06275 [pdf, ps, other]: Title: FacePhys: State of the Heart Learning

Authors: Kegang Wang, Jiankai Tang, Yuntao Wang, Xin Liu, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Daniel McDuff

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2512.06269 [pdf, ps, other]: Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting

Authors: Quan Tran, Tuan Dang

Comments: 10 pages

Journal-ref: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323] arXiv:2512.06258 [pdf, ps, other]: Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs

Authors: Chaoyang Wang, Yangfan He, Yiyang Zhou, Yixuan Wang, Jiaqi Liu, Peng Xia, Zhengzhong Tu, Mohit Bansal, Huaxiu Yao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324] arXiv:2512.06255 [pdf, ps, other]: Title: Language-driven Fine-grained Retrieval

Authors: Shijie Wang, Xin Yu, Yadan Luo, Zijian Wang, Pengfei Zhang, Zi Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325] arXiv:2512.06251 [pdf, ps, other]: Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks

Authors: Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming Zhang

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2512.06232 [pdf, ps, other]: Title: Opinion: Learning Intuitive Physics May Require More than Visual Data

Authors: Ellen Su, Solim Legris, Todd M. Gureckis, Mengye Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[327] arXiv:2512.06230 [pdf, ps, other]: Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking

Authors: Pranav Balakrishnan, Sidisha Barik, Sean M. O'Rourke, Benjamin M. Marlin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328] arXiv:2512.06221 [pdf, ps, other]: Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study

Authors: Alena Makarova

Comments: 15 pages, 13 figures. Reproducibility study

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329] arXiv:2512.06206 [pdf, ps, other]: Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning

Authors: Akis Linardos, Sarthak Pati, Ujjwal Baid, Brandon Edwards, Patrick Foley, Kevin Ta, Verena Chung, Micah Sheller, Muhammad Irfan Khan, Mojtaba Jafaritadi, Elina Kontio, Suleiman Khan, Leon Mächler, Ivan Ezhov, Suprosanna Shit, Johannes C. Paetzold, Gustav Grimberg, Manuel A. Nickel, David Naccache, Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni, Daewoon Kim, Leonard L. Klausmann, Prashant Shah, Bjoern Menze, Dimitrios Makris, Spyridon Bakas

Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL

Journal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[330] arXiv:2512.06190 [pdf, ps, other]: Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying

Authors: Shichen Li, Ahmadreza Eslaminia, Chenhui Shao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[331] arXiv:2512.06185 [pdf, ps, other]: Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling

Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)

Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332] arXiv:2512.06179 [pdf, ps, other]: Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction

Authors: Shilin Hu, Jingyi Xu, Sagnik Das, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333] arXiv:2512.06174 [pdf, ps, other]: Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction

Authors: Shilin Hu, Jingyi Xu, Akshat Dave, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334] arXiv:2512.06171 [pdf, ps, other]: Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection

Authors: Jessica Plassmann, Nicolas Schuler, Michael Schuth, Georg von Freymann

Comments: 11 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335] arXiv:2512.06158 [pdf, ps, other]: Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation

Authors: Su Sun, Cheng Zhao, Himangi Mittal, Gaurav Mittal, Rohith Kukkala, Yingjie Victor Chen, Mei Chen

Comments: 15 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336] arXiv:2512.06105 [pdf, ps, other]: Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation

Authors: Junwen Zheng, Xinran Xu, Li Rong Wang, Chang Cai, Lucinda Siyun Tan, Dingyuan Wang, Hong Liang Tey, Xiuyi Fan

Comments: AAAI-26-AIA

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[337] arXiv:2512.06103 [pdf, ps, other]: Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection

Authors: Raghavendra Ramachandra, Sushma Venkatesh

Comments: Accepted in IEEE T-BIOM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338] arXiv:2512.06096 [pdf, ps, other]: Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving

Authors: Karthik Mohan, Sonam Singh, Amit Arvind Kale

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339] arXiv:2512.06080 [pdf, ps, other]: Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light

Authors: Tzofi Klinghoffer, Siddharth Somasundaram, Xiaoyu Xiang, Yuchen Fan, Christian Richardt, Akshat Dave, Ramesh Raskar, Rakesh Ranjan

Comments: SIGGRAPH Asia 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340] arXiv:2512.06065 [pdf, ps, other]: Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

Authors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi Menapace

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[341] arXiv:2512.06058 [pdf, ps, other]: Title: Representation Learning for Point Cloud Understanding

Authors: Siming Yan

Comments: 181 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[342] arXiv:2512.06032 [pdf, ps, other]: Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation

Authors: Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[343] arXiv:2512.06024 [pdf, ps, other]: Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing

Authors: Jiabin Liu, Zihao Zhou, Jialei Yan, Anxin Guo, Alvise Benetazzo, Hui Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[344] arXiv:2512.06020 [pdf, ps, other]: Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation

Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Ligong Han, Ying Ba, Dimitris N. Metaxas

Comments: Project Page: https://prefgen.github.io/

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[345] arXiv:2512.06014 [pdf, ps, other]: Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets

Authors: Jiho Shin, Dominic Marshall, Matthieu Komorowski

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[346] arXiv:2512.06013 [pdf, ps, other]: Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT

Authors: Wenhao Li, Chengwei Ma, Weixin Mao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[347] arXiv:2512.06012 [pdf, ps, other]: Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing

Authors: Emmanuel Akeweje, Conall Kirk, Chi-Wai Chan, Denis Dowling, Mimi Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[348] arXiv:2512.06010 [pdf, other]: Title: Fast and Flexible Robustness Certificates for Semantic Segmentation

Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349] arXiv:2512.06006 [pdf, ps, other]: Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization

Authors: Xuefei (Julie) Wang, Kai A. Horstmann, Ethan Lin, Jonathan Chen, Alexander R. Farhang, Sophia Stiles, Atharva Sehgal, Jonathan Light, David Van Valen, Yisong Yue, Jennifer J. Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[350] arXiv:2512.06003 [pdf, ps, other]: Title: PrunedCaps: A Case For Primary Capsules Discrimination

Authors: Ramin Sharifi, Pouya Shiri, Amirali Baniasadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351] arXiv:2512.05996 [pdf, ps, other]: Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting

Authors: Yi Liu, Jingyu Song, Vedanth Kallakuri, Katherine A. Skinner

Comments: 18 pages, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[352] arXiv:2512.05993 [pdf, ps, other]: Title: Domain-Specific Foundation Model Improves AI-Based Analysis of Neuropathology

Authors: Ruchika Verma, Shrishtee Kandoi, Robina Afzal, Shengjia Chen, Jannes Jegminat, Michael W. Karlovich, Melissa Umphlett, Timothy E. Richardson, Kevin Clare, Quazi Hossain, Jorge Samanamud, Phyllis L. Faust, Elan D. Louis, Ann C. McKee, Thor D. Stein, Jonathan D. Cherry, Jesse Mez, Anya C. McGoldrick, Dalilah D. Quintana Mora, Melissa J. Nirenberg, Ruth H. Walker, Yolfrankcis Mendez, Susan Morgello, Dennis W. Dickson, Melissa E. Murray, Carlos Cordon-Cardo, Nadejda M. Tsankova, Jamie M. Walker, Diana K. Dangoor, Stephanie McQuillan, Emma L. Thorn, Claudia De Sanctis, Shuying Li, Thomas J. Fuchs, Kurt Farrell, John F. Crary, Gabriele Campanella

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[353] arXiv:2512.05991 [pdf, ps, other]: Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head

Authors: Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[354] arXiv:2512.05988 [pdf, ps, other]: Title: VG3T: Visual Geometry Grounded Gaussian Transformer

Authors: Junho Kim, Seongwon Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[355] arXiv:2512.05987 [pdf, ps, other]: Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning

Authors: Chenyue Yu, Jianyu Yu

Comments: Accepted by ICCPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[356] arXiv:2512.05969 [pdf, ps, other]: Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' Matrices

Authors: Hokin Deng

Comments: See results (this https URL) and code (this https URL)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[357] arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]: Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs

Authors: Sujoy Nath, Arkaprabha Basu, Sharanya Dasgupta, Swagatam Das

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[358] arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]: Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Sharipov Hotam Beknazarovich, Farzona S. Ataeva, Qurbonaliev Alisher, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[359] arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]: Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[360] arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]: Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models

Authors: Shimin Zhang, Xianwei Chen, Yufan Shen, Ziyuan Ye, Jibin Wu

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[361] arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]: Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces

Authors: Nikita Gabdullin

Comments: 9 pages, 5 figures, 1 table, 4 equations

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[362] arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]: Title: Human Geometry Distribution for 3D Animation Generation

Authors: Xiangjun Tang, Biao Zhang, Peter Wonka

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[363] arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]: Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models

Authors: Chenwei Shi, Xueyu Luan

Comments: 23 pages, 8 figures, 3 tables

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[364] arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]: Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Authors: Haidong Kang, Jun Du, Lihong Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[365] arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]: Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood

Authors: Gilhyun Nam, Taewon Kim, Joonhyun Jeong, Eunho Yang

Comments: Accepted to WACV 2026

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[366] arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]: Title: A Geometric Unification of Concept Learning with Concept Cones

Authors: Alexandre Rocchi--Henry, Thomas Fel, Gianni Franchi

Comments: 22 pages

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[367] arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]: Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising

Authors: Tharindu Wickremasinghe, Marco F. Duarte

Comments: Asilomar Conference on Signals, Systems, and Computers 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[368] arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]: Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics

Authors: Tianyi Ren, Daniel Low, Pittra Jaengprajak, Juampablo Heras Rivera, Jacob Ruzevick, Mehmet Kurt

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[369] arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]: Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers

Authors: Jonghyun Park, Jong Chul Ye

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[370] arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]: Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search

Authors: Tanay Arora, Christof Teuscher

Comments: This work plans to be submitted to the IEEE for possible publication

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[371] arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]: Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning

Authors: Nithin Sivakumaran, Justin Chih-Yao Chen, David Wan, Yue Zhang, Jaehong Yoon, Elias Stengel-Eskin, Mohit Bansal

Comments: Code: this https URL

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[372] arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]: Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving

Authors: Zebin Xing, Yupeng Zheng, Qichao Zhang, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin Zhao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[373] arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]: Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis

Authors: Sakib Mostafa, Lei Xing, Md. Tauhidul Islam

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[374] arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]: Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients

Authors: Krishna Arun, Moinak Bhattachrya, Paras Goel

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[375] arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]: Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Authors: Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining Guo

Comments: Project page: this https URL

Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[376] arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]: Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge

Authors: Ilia Larchenko, Gleb Zarin, Akash Karnatak

Comments: 2025 NeurIPS Behavior Challenge 1st place solution

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[377] arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]: Title: Dynamic Visual SLAM using a General 3D Prior

Authors: Xingguang Zhong, Liren Jin, Marija Popović, Jens Behley, Cyrill Stachniss

Comments: 8 pages

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[378] arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]: Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices

Authors: Sepyan Purnama Kristanto, Lutfi Hakim, Hermansyah

Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[379] arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]: Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association

Authors: Zhihua Fang, Shumei Tao, Junxu Wang, Liang He

Comments: FAME 2026 Technical Report

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[380] arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]: Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics

Authors: Nikhil Verma, Joonas Linnosmaa, Espinosa-Leal Leonardo, Napat Vajragupta

Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[381] arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]: Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data

Authors: Lin Yang, Xiang Li, Xin Ma, Xinxin Zhao

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

