Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 663

[ total of 603 entries: 1-50 | ... | 404-453 | 454-503 | 504-553 | 554-603 ]

Wed, 24 Dec 2025 (continued, showing last 50 of 86 entries)

[554] arXiv:2512.20148 [pdf, ps, other]: Title: Enhancing annotations for 5D apple pose estimation through 3D Gaussian Splatting (3DGS)

Authors: Robert van de Ven, Trim Bresilla, Bram Nelissen, Ard Nieuwenhuizen, Eldert J. van Henten, Gert Kootstra

Comments: 33 pages, excluding appendices. 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[555] arXiv:2512.20128 [pdf, ps, other]: Title: milliMamba: Specular-Aware Human Pose Estimation via Dual mmWave Radar with Multi-Frame Mamba Fusion

Authors: Niraj Prakash Kini, Shiau-Rung Tsai, Guan-Hsun Lin, Wen-Hsiao Peng, Ching-Wen Ma, Jenq-Neng Hwang

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556] arXiv:2512.20120 [pdf, ps, other]: Title: HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer

Authors: Mohammad Helal Uddin, Liam Seymour, Sabur Baidya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[557] arXiv:2512.20117 [pdf, ps, other]: Title: DDAVS: Disentangled Audio Semantics and Delayed Bidirectional Alignment for Audio-Visual Segmentation

Authors: Jingqi Tian, Yiheng Du, Haoji Zhang, Yuji Wang, Isaac Ning Lee, Xulong Bai, Tianrui Zhu, Jingxuan Niu, Yansong Tang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[558] arXiv:2512.20113 [src]: Title: Multi Modal Attention Networks with Uncertainty Quantification for Automated Concrete Bridge Deck Delamination Detection

Authors: Alireza Moayedikia, Sattar Dorafshan

Comments: the authors are going to substantially edit the paper

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[559] arXiv:2512.20107 [pdf, ps, other]: Title: UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis

Authors: Thanh-Tung Le, Tuan Pham, Tung Nguyen, Deying Kong, Xiaohui Xie, Stephan Mandt

Comments: Accepted to NeurIPS 2025. The first two authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[560] arXiv:2512.20105 [pdf, ps, other]: Title: LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs

Authors: Haiyun Wei, Fan Lu, Yunwei Zhu, Zehan Zheng, Weiyi Xue, Lin Shao, Xudong Zhang, Ya Wu, Rong Fu, Guang Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561] arXiv:2512.20104 [pdf, ps, other]: Title: Effect of Activation Function and Model Optimizer on the Performance of Human Activity Recognition System Using Various Deep Learning Models

Authors: Subrata Kumer Paul, Dewan Nafiul Islam Noor, Rakhi Rani Paul, Md. Ekramul Hamid, Fahmid Al Farid, Hezerul Abdul Karim, Md. Maruf Al Hossain Prince, Abu Saleh Musa Miah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.20088 [pdf, ps, other]: Title: Item Region-based Style Classification Network (IRSN): A Fashion Style Classifier Based on Domain Knowledge of Fashion Experts

Authors: Jinyoung Choi, Youngchae Kwon, Injung Kim

Comments: This is a pre-print of an article published in Applied Intelligence. The final authenticated version is available online at: this https URL

Journal-ref: Applied Intelligence, Vol. 54, pp. 6197-6209 (2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[563] arXiv:2512.20070 [pdf, ps, other]: Title: Progressive Learned Image Compression for Machine Perception

Authors: Jungwoo Kim, Jun-Hyuk Kim, Jong-Seok Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[564] arXiv:2512.20042 [pdf, ps, other]: Title: Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval

Authors: Nguyen Lam Phu Quy, Pham Phu Hoa, Tran Chi Nguyen, Dao Sy Duy Minh, Nguyen Hoang Minh Ngoc, Huynh Trung Kiet

Comments: 7 pages, 5 figures. System description for the EVENTA Grand Challenge (Track 1) at ACM MM'25

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[565] arXiv:2512.20033 [pdf, ps, other]: Title: FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs

Authors: Andreas Zinonos, Michał Stypułkowski, Antoni Bigata, Stavros Petridis, Maja Pantic, Nikita Drobyshev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566] arXiv:2512.20032 [pdf, ps, other]: Title: VALLR-Pin: Uncertainty-Factorized Visual Speech Recognition for Mandarin with Pinyin Guidance

Authors: Chang Sun, Dongliang Xie, Wanpeng Xie, Bo Qin, Hong Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.20029 [pdf, ps, other]: Title: $\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning

Authors: Lin Li, Jiahui Li, Jiaming Lei, Jun Xiao, Feifei Shao, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568] arXiv:2512.20026 [pdf, ps, other]: Title: MAPI-GNN: Multi-Activation Plane Interaction Graph Neural Network for Multimodal Medical Diagnosis

Authors: Ziwei Qin, Xuhui Song, Deqing Huang, Na Qin, Jun Li

Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence 40 (AAAI-26)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.20025 [pdf, ps, other]: Title: A Contextual Analysis of Driver-Facing and Dual-View Video Inputs for Distraction Detection in Naturalistic Driving Environments

Authors: Anthony Dontoh, Stephanie Ivey, Armstrong Aboah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570] arXiv:2512.20013 [pdf, ps, other]: Title: SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images

Authors: Zepeng Xin, Kaiyu Li, Luodi Chen, Wanchen Li, Yuchen Xiao, Hui Qiao, Weizhan Zhang, Deyu Meng, Xiangyong Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.20011 [pdf, ps, other]: Title: PaveSync: A Unified and Comprehensive Dataset for Pavement Distress Analysis and Classification

Authors: Blessing Agyei Kyem, Joshua Kofi Asamoah, Anthony Dontoh, Andrews Danyo, Eugene Denteh, Armstrong Aboah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.20000 [pdf, ps, other]: Title: Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models

Authors: Zhenhao Li, Shaohan Yi, Zheng Liu, Leonartinus Gao, Minh Ngoc Le, Ambrose Ling, Zhuoran Wang, Md Amirul Islam, Zhixiang Chi, Yuanhao Yu

Comments: GitHub page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[573] arXiv:2512.19990 [pdf, ps, other]: Title: A Dual-Branch Local-Global Framework for Cross-Resolution Land Cover Mapping

Authors: Peng Gao, Ke Li, Di Wang, Yongshan Zhu, Yiming Zhang, Xuemei Luo, Yifeng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574] arXiv:2512.19989 [pdf, ps, other]: Title: A Novel CNN Gradient Boosting Ensemble for Guava Disease Detection

Authors: Tamim Ahasan Rijon, Yeasin Arafath

Comments: Accepted at IEEE ICCIT 2025. This is the author accepted manuscript

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[575] arXiv:2512.19982 [pdf, ps, other]: Title: WSD-MIL: Window Scale Decay Multiple Instance Learning for Whole Slide Image Classification

Authors: Le Feng, Li Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[576] arXiv:2512.19954 [pdf, ps, other]: Title: HistoWAS: A Pathomics Framework for Large-Scale Feature-Wide Association Studies of Tissue Topology and Patient Outcomes

Authors: Yuechen Yang, Junlin Guo, Yanfan Zhu, Jialin Yue, Junchao Zhu, Yu Wang, Shilin Zhao, Haichun Yang, Xingyi Guo, Jovan Tanevski, Laura Barisoni, Avi Z. Rosenberg, Yuankai Huo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[577] arXiv:2512.19949 [pdf, ps, other]: Title: How Much 3D Do Video Foundation Models Encode?

Authors: Zixuan Huang, Xiang Li, Zhaoyang Lv, James M. Rehg

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[578] arXiv:2512.19943 [pdf, ps, other]: Title: SE360: Semantic Edit in 360$^\circ$ Panoramas via Hierarchical Data Construction

Authors: Haoyi Zhong, Fang-Lue Zhang, Andrew Chalmers, Taehyun Rhee

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[579] arXiv:2512.19941 [pdf, ps, other]: Title: Block-Recurrent Dynamics in Vision Transformers

Authors: Mozes Jacobs, Thomas Fel, Richard Hakim, Alessandra Brondetta, Demba Ba, T. Andy Keller

Comments: 25 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[580] arXiv:2512.19934 [pdf, ps, other]: Title: Vehicle-centric Perception via Multimodal Structured Pre-training

Authors: Wentao Wu, Xiao Wang, Chenglong Li, Jin Tang, Bin Luo

Comments: Journal extension of VehicleMAE (AAAI 2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[581] arXiv:2512.19928 [pdf, ps, other]: Title: Unified Brain Surface and Volume Registration

Authors: S. Mazdak Abulnaga, Andrew Hoopes, Malte Hoffmann, Robin Magnet, Maks Ovsjanikov, Lilla Zöllei, John Guttag, Bruce Fischl, Adrian Dalca

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[582] arXiv:2512.19918 [pdf, ps, other]: Title: Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs

Authors: Houston H. Zhang, Tao Zhang, Baoze Lin, Yuanqi Xue, Yincheng Zhu, Huan Liu, Li Gu, Linfeng Ye, Ziqiang Wang, Xinxin Zuo, Yang Wang, Yuanhao Yu, Zhixiang Chi

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[583] arXiv:2512.19871 [pdf, ps, other]: Title: HyGE-Occ: Hybrid View-Transformation with 3D Gaussian and Edge Priors for 3D Panoptic Occupancy Prediction

Authors: Jong Wook Kim, Wonseok Roh, Ha Dam Baek, Pilhyeon Lee, Jonghyun Choi, Sangpil Kim

Comments: 11 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[584] arXiv:2512.19850 [pdf, ps, other]: Title: RANSAC Scoring Functions: Analysis and Reality Check

Authors: A. Shekhovtsov

Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[585] arXiv:2512.19823 [pdf, ps, other]: Title: Learning to Refocus with Video Diffusion Models

Authors: SaiKiran Tedla, Zhoutong Zhang, Xuaner Zhang, Shumian Xin

Comments: Code and data are available at this https URL. SIGGRAPH Asia 2025, Dec. 2025

Journal-ref: Proceedings of the SIGGRAPH Asia 2025, pp. 1-11, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[586] arXiv:2512.19817 [pdf, ps, other]: Title: Generating the Past, Present and Future from a Motion-Blurred Image

Authors: SaiKiran Tedla, Kelly Zhu, Trevor Canham, Felix Taubner, Michael S. Brown, Kiriakos N. Kutulakos, David B. Lindell

Comments: Code and data are available at this https URL

Journal-ref: ACM Trans. Graph. (SIGGRAPH Asia 2025), vol. 44, no. 6, pp. 1-15, Dec. 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[587] arXiv:2512.19711 [pdf, ps, other]: Title: PHANTOM: PHysical ANamorphic Threats Obstructing Connected Vehicle Mobility

Authors: Md Nahid Hasan Shuvo, Moinul Hossain

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[588] arXiv:2512.20618 (cross-list from cs.AI) [pdf, ps, other]: Title: LongVideoAgent: Multi-Agent Reasoning with Long Videos

Authors: Runtao Liu, Ziyi Liu, Jiaqi Tang, Yue Ma, Renjie Pi, Jipeng Zhang, Qifeng Chen

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
[589] arXiv:2512.20595 (cross-list from cs.CL) [pdf, ps, other]: Title: Cube Bench: A Benchmark for Spatial Visual Reasoning in MLLMs

Authors: Dhruv Anand, Ehsan Shareghi

Comments: 27 pages, 5 figures, 9 tables. Cube available at this https URL

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[590] arXiv:2512.20464 (cross-list from physics.optics) [pdf, ps, other]: Title: Snapshot 3D image projection using a diffractive decoder

Authors: Cagatay Isil, Alexander Chen, Yuhang Li, F. Onuralp Ardic, Shiqi Chen, Che-Yung Shen, Aydogan Ozcan

Comments: 22 Pages, 8 Figures

Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Applied Physics (physics.app-ph)
[591] arXiv:2512.20436 (cross-list from eess.IV) [pdf, ps, other]: Title: Dual-Encoder Transformer-Based Multimodal Learning for Ischemic Stroke Lesion Segmentation Using Diffusion MRI

Authors: Muhammad Usman, Azka Rehman, Muhammad Mutti Ur Rehman, Abd Ur Rehman, Muhammad Umar Farooq

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[592] arXiv:2512.20420 (cross-list from cs.LG) [pdf, ps, other]: Title: Simplifying Multi-Task Architectures Through Task-Specific Normalization

Authors: Mihai Suteu, Ovidiu Serban

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[593] arXiv:2512.20387 (cross-list from cs.AI) [pdf, ps, other]: Title: Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems

Authors: YuChe Hsu, AnJui Wang, TsaiChing Ni, YuanFu Yang

Comments: 10 pages, 9 figures

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[594] arXiv:2512.20374 (cross-list from eess.IV) [pdf, ps, other]: Title: CLIP Based Region-Aware Feature Fusion for Automated BBPS Scoring in Colonoscopy Images

Authors: Yujia Fu, Zhiyu Dong, Tianwen Qian, Chenye Zheng, Danian Ji, Linhai Zhuo

Comments: 12 pages, 9 figures, BMVC 2025 submission

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[595] arXiv:2512.20350 (cross-list from cs.LG) [pdf, ps, other]: Title: Field-Space Attention for Structure-Preserving Earth System Transformers

Authors: Maximilian Witte, Johannes Meuer, Étienne Plésiat, Christopher Kadow

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Mathematical Physics (math-ph)
[596] arXiv:2512.20299 (cross-list from cs.RO) [pdf, ps, other]: Title: KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System

Authors: Zhongyu Xia, Wenhao Chen, Yongtao Wang, Ming-Hsuan Yang

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[597] arXiv:2512.20249 (cross-list from cs.LG) [pdf, ps, other]: Title: Unified Multimodal Brain Decoding via Cross-Subject Soft-ROI Fusion

Authors: Xuanyu Hu

Comments: 15 pages, 2 figures, 4 tables. Submitted to ICPR 2026

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[598] arXiv:2512.20233 (cross-list from cs.LG) [pdf, ps, other]: Title: How I Met Your Bias: Investigating Bias Amplification in Diffusion Models

Authors: Nathan Roos, Ekaterina Iakovleva, Ani Gjergji, Vito Paolo Pastore, Enzo Tartaglione

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[599] arXiv:2512.20145 (cross-list from cs.CL) [pdf, ps, other]: Title: Retrieval-augmented Prompt Learning for Pre-trained Foundation Models

Authors: Xiang Chen, Yixin Ou, Quan Feng, Lei Li, Piji Li, Haibo Ye, Sheng-Jun Huang, Shuofei Qiao, Shumin Deng, Huajun Chen, Ningyu Zhang

Comments: IEEE/ACM Transactions on Audio, Speech and Language Processing

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[600] arXiv:2512.20129 (cross-list from cs.HC) [pdf, ps, other]: Title: Dreamcrafter: Immersive Editing of 3D Radiance Fields Through Flexible, Generative Inputs and Outputs

Authors: Cyrus Vachha, Yixiao Kang, Zach Dive, Ashwat Chidambaram, Anik Gupta, Eunice Jun, Bjoern Hartmann

Comments: CHI 2025, Project page: this https URL

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[601] arXiv:2512.20056 (cross-list from cs.AI) [pdf, ps, other]: Title: Towards Generative Location Awareness for Disaster Response: A Probabilistic Cross-view Geolocalization Approach

Authors: Hao Li, Fabian Deuser, Wenping Yin, Steffen Knoblauch, Wufan Zhao, Filip Biljecki, Yong Xue, Wei Huang

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[602] arXiv:2512.19731 (cross-list from cs.LG) [pdf, ps, other]: Title: Exploring Deep-to-Shallow Transformable Neural Networks for Intelligent Embedded Systems

Authors: Xiangzhong Luo, Weichen Liu

Comments: Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[603] arXiv:2512.18099 (cross-list from eess.AS) [pdf, ps, other]: Title: SAM Audio: Segment Anything in Audio

Authors: Bowen Shi, Andros Tjandra, John Hoffman, Helin Wang, Yi-Chiao Wu, Luya Gao, Julius Richter, Matt Le, Apoorv Vyas, Sanyuan Chen, Christoph Feichtenhofer, Piotr Dollár, Wei-Ning Hsu, Ann Lee

Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV)
