Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 524

[ total of 603 entries: 1-50 | ... | 375-424 | 425-474 | 475-524 | 525-574 | 575-603 ]

Wed, 24 Dec 2025 (continued, showing 50 of 86 entries)

[525] arXiv:2512.20557 [pdf, ps, other]: Title: Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Authors: Shengchao Zhou, Yuxin Chen, Yuying Ge, Wei Huang, Jiehong Lin, Ying Shan, Xiaojuan Qi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[526] arXiv:2512.20556 [pdf, ps, other]: Title: Multi-Grained Text-Guided Image Fusion for Multi-Exposure and Multi-Focus Scenarios

Authors: Mingwei Tang, Jiahao Nie, Guang Yang, Ziqing Cui, Jie Li

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527] arXiv:2512.20538 [pdf, ps, other]: Title: AlignPose: Generalizable 6D Pose Estimation via Multi-view Feature-metric Alignment

Authors: Anna Šárová Mikeštíková, Médéric Fourmy, Martin Cífka, Josef Sivic, Vladimir Petrik

Comments: 18 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528] arXiv:2512.20531 [pdf, ps, other]: Title: SirenPose: Dynamic Scene Reconstruction via Geometric Supervision

Authors: Kaitong Cai, Jensen Zhang, Jing Yang, Keze Wang

Comments: Under submission

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[529] arXiv:2512.20501 [pdf, ps, other]: Title: Bridging Modalities and Transferring Knowledge: Enhanced Multimodal Understanding and Recognition

Authors: Gorjan Radevski

Comments: Ph.D. manuscript; Supervisors/Mentors: Marie-Francine Moens and Tinne Tuytelaars

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530] arXiv:2512.20487 [pdf, ps, other]: Title: Multi-temporal Adaptive Red-Green-Blue and Long-Wave Infrared Fusion for You Only Look Once-Based Landmine Detection from Unmanned Aerial Systems

Authors: James E. Gallagher, Edward J. Oughton, Jana Kosecka

Comments: 21 pages with 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[531] arXiv:2512.20479 [pdf, ps, other]: Title: UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images

Authors: Yiming Zhao, Yuanpeng Gao, Yuxuan Luo, Jiwei Duan, Shisong Lin, Longfei Xiong, Zhouhui Lian

Comments: 22 pages, 25 figures, SIGGRAPH Asia 2025, Conference Paper

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532] arXiv:2512.20451 [pdf, ps, other]: Title: Beyond Motion Pattern: An Empirical Study of Physical Forces for Human Motion Understanding

Authors: Anh Dao, Manh Tran, Yufei Zhang, Xiaoming Liu, Zijun Cui

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533] arXiv:2512.20432 [pdf, ps, other]: Title: High Dimensional Data Decomposition for Anomaly Detection of Textured Images

Authors: Ji Song, Xing Wang, Jianguo Wu, Xiaowei Yue

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[534] arXiv:2512.20431 [pdf, ps, other]: Title: Skin Lesion Classification Using a Soft Voting Ensemble of Convolutional Neural Networks

Authors: Abdullah Al Shafi, Abdul Muntakim, Pintu Chandra Shill, Rowzatul Zannat, Abdullah Al-Amin

Comments: Authors' version of the paper published in proceedings of ECCE, DOI: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[535] arXiv:2512.20417 [pdf, ps, other]: Title: Chain-of-Anomaly Thoughts with Large Vision-Language Models

Authors: Pedro Domingos, João Pereira, Vasco Lopes, João Neves, David Semedo

Comments: 2 pages, 3 figures, 1 table. Accepted for RECPAD 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[536] arXiv:2512.20409 [pdf, ps, other]: Title: DETACH: Decomposed Spatio-Temporal Alignment for Exocentric Video and Ambient Sensors with Staged Learning

Authors: Junho Yoon, Jaemo Jung, Hyunju Kim, Dongman Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[537] arXiv:2512.20377 [pdf, ps, other]: Title: SmartSplat: Feature-Smart Gaussians for Scalable Compression of Ultra-High-Resolution Images

Authors: Linfei Li, Lin Zhang, Zhong Wang, Ying Shen

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.20376 [pdf, ps, other]: Title: Linking Faces and Voices Across Languages: Insights from the FAME 2026 Challenge

Authors: Marta Moscati, Ahmed Abdullah, Muhammad Saad Saeed, Shah Nawaz, Rohan Kumar Das, Muhammad Zaigham Zaheer, Junaid Mir, Muhammad Haroon Yousaf, Khalid Mahmood Malik, Markus Schedl

Comments: Accepted at ICASSP 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[539] arXiv:2512.20362 [pdf, ps, other]: Title: CRAFT: Continuous Reasoning and Agentic Feedback Tuning for Multimodal Text-to-Image Generation

Authors: V. Kovalev, A. Kuvshinov, A. Buzovkin, D. Pokidov, D. Timonin

Comments: 37 pages, 42 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.20340 [pdf, ps, other]: Title: The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection

Authors: Qingdong He, Xueqin Chen, Yanjie Pan, Peng Tang, Pengcheng Xu, Zhenye Gan, Chengjie Wang, Xiaobin Hu, Jiangning Zhang, Yabiao Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[541] arXiv:2512.20296 [pdf, ps, other]: Title: TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation

Authors: Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Joon Son Chung, Shinji Watanabe

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[542] arXiv:2512.20288 [pdf, ps, other]: Title: UbiQVision: Quantifying Uncertainty in XAI for Image Recognition

Authors: Akshat Dubey, Aleksandar Anžel, Bahar İlgen, Georges Hattab

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[543] arXiv:2512.20260 [pdf, ps, other]: Title: ${D}^{3}${ETOR}: ${D}$ebate-Enhanced Pseudo Labeling and Frequency-Aware Progressive ${D}$ebiasing for Weakly-Supervised Camouflaged Object ${D}$etection with Scribble Annotations

Authors: Jiawei Ge, Jiuxin Cao, Xinyi Li, Xuelin Zhu, Chang Liu, Bo Liu, Chen Feng, Ioannis Patras

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[544] arXiv:2512.20257 [pdf, ps, other]: Title: LADLE-MM: Limited Annotation based Detector with Learned Ensembles for Multimodal Misinformation

Authors: Daniele Cardullo, Simone Teglia, Irene Amerini

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.20255 [pdf, ps, other]: Title: BiCoR-Seg: Bidirectional Co-Refinement Framework for High-Resolution Remote Sensing Image Segmentation

Authors: Jinghao Shi, Jianing Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.20251 [pdf, ps, other]: Title: Degradation-Aware Metric Prompting for Hyperspectral Image Restoration

Authors: Binfeng Wang, Di Wang, Haonan Guo, Ying Fu, Jing Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[547] arXiv:2512.20236 [pdf, ps, other]: Title: IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing

Authors: Oikantik Nath, Sahithi Kukkala, Mitesh Khapra, Ravi Kiran Sarvadevabhatla

Comments: Accepted in ICDAR 2025 (Oral Presentation) - Best Student Paper Runner-Up Award

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.20217 [pdf, ps, other]: Title: LiteFusion: Taming 3D Object Detectors from Vision-Based to Multi-Modal with Minimal Adaptation

Authors: Xiangxuan Ren, Zhongdao Wang, Pin Tang, Guoqing Wang, Jilai Zheng, Chao Ma

Comments: 13 pages, 9 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.20213 [pdf, ps, other]: Title: JDPNet: A Network Based on Joint Degradation Processing for Underwater Image Enhancement

Authors: Tao Ye, Hongbin Ren, Chongbing Zhang, Haoran Chen, Xiaosong Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.20194 [pdf, ps, other]: Title: Generative Latent Coding for Ultra-Low Bitrate Image Compression

Authors: Zhaoyang Jia, Jiahao Li, Bin Li, Houqiang Li, Yan Lu

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[551] arXiv:2512.20174 [pdf, ps, other]: Title: Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark

Authors: Hao Guo, Xugong Qin, Jun Jie Ou Yang, Peng Zhang, Gangyan Zeng, Yubo Li, Hailun Lin

Comments: CVPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Information Retrieval (cs.IR)
[552] arXiv:2512.20157 [pdf, ps, other]: Title: AMoE: Agglomerative Mixture-of-Experts Vision Foundation Model

Authors: Sofian Chaybouti, Sanath Narayan, Yasser Dahou, Phúc H. Lê Khac, Ankit Singh, Ngoc Dung Huynh, Wamiq Reyaz Para, Hilde Kuehne, Hakim Hacid

Comments: 17 pages, 8 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.20153 [pdf, ps, other]: Title: CoDi -- an exemplar-conditioned diffusion model for low-shot counting

Authors: Grega Šuštar, Jer Pelhan, Alan Lukežič, Matej Kristan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.20148 [pdf, ps, other]: Title: Enhancing annotations for 5D apple pose estimation through 3D Gaussian Splatting (3DGS)

Authors: Robert van de Ven, Trim Bresilla, Bram Nelissen, Ard Nieuwenhuizen, Eldert J. van Henten, Gert Kootstra

Comments: 33 pages, excluding appendices. 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[555] arXiv:2512.20128 [pdf, ps, other]: Title: milliMamba: Specular-Aware Human Pose Estimation via Dual mmWave Radar with Multi-Frame Mamba Fusion

Authors: Niraj Prakash Kini, Shiau-Rung Tsai, Guan-Hsun Lin, Wen-Hsiao Peng, Ching-Wen Ma, Jenq-Neng Hwang

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[556] arXiv:2512.20120 [pdf, ps, other]: Title: HEART-VIT: Hessian-Guided Efficient Dynamic Attention and Token Pruning in Vision Transformer

Authors: Mohammad Helal Uddin, Liam Seymour, Sabur Baidya

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[557] arXiv:2512.20117 [pdf, ps, other]: Title: DDAVS: Disentangled Audio Semantics and Delayed Bidirectional Alignment for Audio-Visual Segmentation

Authors: Jingqi Tian, Yiheng Du, Haoji Zhang, Yuji Wang, Isaac Ning Lee, Xulong Bai, Tianrui Zhu, Jingxuan Niu, Yansong Tang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[558] arXiv:2512.20113 [src]: Title: Multi Modal Attention Networks with Uncertainty Quantification for Automated Concrete Bridge Deck Delamination Detection

Authors: Alireza Moayedikia, Sattar Dorafshan

Comments: the authors are going to substantially edit the paper

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[559] arXiv:2512.20107 [pdf, ps, other]: Title: UMAMI: Unifying Masked Autoregressive Models and Deterministic Rendering for View Synthesis

Authors: Thanh-Tung Le, Tuan Pham, Tung Nguyen, Deying Kong, Xiaohui Xie, Stephan Mandt

Comments: Accepted to NeurIPS 2025. The first two authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[560] arXiv:2512.20105 [pdf, ps, other]: Title: LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs

Authors: Haiyun Wei, Fan Lu, Yunwei Zhu, Zehan Zheng, Weiyi Xue, Lin Shao, Xudong Zhang, Ya Wu, Rong Fu, Guang Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[561] arXiv:2512.20104 [pdf, ps, other]: Title: Effect of Activation Function and Model Optimizer on the Performance of Human Activity Recognition System Using Various Deep Learning Models

Authors: Subrata Kumer Paul, Dewan Nafiul Islam Noor, Rakhi Rani Paul, Md. Ekramul Hamid, Fahmid Al Farid, Hezerul Abdul Karim, Md. Maruf Al Hossain Prince, Abu Saleh Musa Miah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.20088 [pdf, ps, other]: Title: Item Region-based Style Classification Network (IRSN): A Fashion Style Classifier Based on Domain Knowledge of Fashion Experts

Authors: Jinyoung Choi, Youngchae Kwon, Injung Kim

Comments: This is a pre-print of an article published in Applied Intelligence. The final authenticated version is available online at: this https URL

Journal-ref: Applied Intelligence, Vol. 54, pp. 6197-6209 (2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[563] arXiv:2512.20070 [pdf, ps, other]: Title: Progressive Learned Image Compression for Machine Perception

Authors: Jungwoo Kim, Jun-Hyuk Kim, Jong-Seok Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[564] arXiv:2512.20042 [pdf, ps, other]: Title: Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval

Authors: Nguyen Lam Phu Quy, Pham Phu Hoa, Tran Chi Nguyen, Dao Sy Duy Minh, Nguyen Hoang Minh Ngoc, Huynh Trung Kiet

Comments: 7 pages, 5 figures. System description for the EVENTA Grand Challenge (Track 1) at ACM MM'25

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[565] arXiv:2512.20033 [pdf, ps, other]: Title: FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs

Authors: Andreas Zinonos, Michał Stypułkowski, Antoni Bigata, Stavros Petridis, Maja Pantic, Nikita Drobyshev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566] arXiv:2512.20032 [pdf, ps, other]: Title: VALLR-Pin: Uncertainty-Factorized Visual Speech Recognition for Mandarin with Pinyin Guidance

Authors: Chang Sun, Dongliang Xie, Wanpeng Xie, Bo Qin, Hong Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.20029 [pdf, ps, other]: Title: $\text{H}^2$em: Learning Hierarchical Hyperbolic Embeddings for Compositional Zero-Shot Learning

Authors: Lin Li, Jiahui Li, Jiaming Lei, Jun Xiao, Feifei Shao, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[568] arXiv:2512.20026 [pdf, ps, other]: Title: MAPI-GNN: Multi-Activation Plane Interaction Graph Neural Network for Multimodal Medical Diagnosis

Authors: Ziwei Qin, Xuhui Song, Deqing Huang, Na Qin, Jun Li

Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence 40 (AAAI-26)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.20025 [pdf, ps, other]: Title: A Contextual Analysis of Driver-Facing and Dual-View Video Inputs for Distraction Detection in Naturalistic Driving Environments

Authors: Anthony Dontoh, Stephanie Ivey, Armstrong Aboah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[570] arXiv:2512.20013 [pdf, ps, other]: Title: SegEarth-R2: Towards Comprehensive Language-guided Segmentation for Remote Sensing Images

Authors: Zepeng Xin, Kaiyu Li, Luodi Chen, Wanchen Li, Yuchen Xiao, Hui Qiao, Weizhan Zhang, Deyu Meng, Xiangyong Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.20011 [pdf, ps, other]: Title: PaveSync: A Unified and Comprehensive Dataset for Pavement Distress Analysis and Classification

Authors: Blessing Agyei Kyem, Joshua Kofi Asamoah, Anthony Dontoh, Andrews Danyo, Eugene Denteh, Armstrong Aboah

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.20000 [pdf, ps, other]: Title: Few-Shot-Based Modular Image-to-Video Adapter for Diffusion Models

Authors: Zhenhao Li, Shaohan Yi, Zheng Liu, Leonartinus Gao, Minh Ngoc Le, Ambrose Ling, Zhuoran Wang, Md Amirul Islam, Zhixiang Chi, Yuanhao Yu

Comments: GitHub page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[573] arXiv:2512.19990 [pdf, ps, other]: Title: A Dual-Branch Local-Global Framework for Cross-Resolution Land Cover Mapping

Authors: Peng Gao, Ke Li, Di Wang, Yongshan Zhu, Yiming Zhang, Xuemei Luo, Yifeng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[574] arXiv:2512.19989 [pdf, ps, other]: Title: A Novel CNN Gradient Boosting Ensemble for Guava Disease Detection

Authors: Tamim Ahasan Rijon, Yeasin Arafath

Comments: Accepted at IEEE ICCIT 2025. This is the author accepted manuscript

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
