Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 438

[ total of 754 entries, showing 100 entries per page ]

Tue, 9 Dec 2025 (continued, showing last 87 of 259 entries)

[439] arXiv:2512.06377 [pdf, ps, other]: Title: VAD-Net: Multidimensional Facial Expression Recognition in Intelligent Education System

Authors: Yi Huo, Yun Ge

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[440] arXiv:2512.06376 [pdf, ps, other]: Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework

Authors: Xinhao Xiang, Abhijeet Rastogi, Jiawei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[441] arXiv:2512.06373 [pdf, ps, other]: Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement Learning

Authors: Yuji Wang, Wenlong Liu, Jingxuan Niu, Haoji Zhang, Yansong Tang

Comments: The project page is this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[442] arXiv:2512.06368 [pdf, ps, other]: Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos

Authors: Weitao Xiong, Zhiyuan Yuan, Jiahao Lu, Chengfeng Zhao, Peng Li, Yuan Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[443] arXiv:2512.06363 [pdf, ps, other]: Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack Detection

Authors: Jiabao Guo, Yadian Wang, Hui Ma, Yuhao Fu, Ju Jia, Hui Liu, Shengeng Tang, Lechao Cheng, Yunfeng Diao, Ajian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[444] arXiv:2512.06358 [pdf, ps, other]: Title: Rectifying Latent Space for Generative Single-Image Reflection Removal

Authors: Mingjia Li, Jin Hu, Hainuo Wang, Qiming Hu, Jiarui Wang, Xiaojie Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[445] arXiv:2512.06353 [pdf, ps, other]: Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search

Authors: Kaicheng Yang, Kaisen Yang, Baiting Wu, Xun Zhang, Qianrui Yang, Haotong Qin, He Zhang, Yulun Zhang

Comments: Code and supplementary material can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[446] arXiv:2512.06345 [pdf, ps, other]: Title: CLUENet: Cluster Attention Makes Neural Networks Have Eyes

Authors: Xiangshuai Song, Jun-Jie Huang, Tianrui Liu, Ke Liang, Chang Tang

Comments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial Intelligence

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[447] arXiv:2512.06344 [pdf, ps, other]: Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low Bitrate

Authors: Kaile Wang, Lijun He, Haisheng Fu, Haixia Bi, Fan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[448] arXiv:2512.06332 [pdf, ps, other]: Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworks

Authors: Jeffrey Gu, Minkyu Jeon, Ambri Ma, Serena Yeung-Levy, Ellen D. Zhong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[449] arXiv:2512.06330 [pdf, ps, other]: Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for Pansharpening

Authors: Haoyu Zhang, Junhan Luo, Yugang Cao, Siran Peng, Jie Huang, Liangjian Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[450] arXiv:2512.06328 [pdf, ps, other]: Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models

Authors: Jiahao Li, Yusheng Luo, Yunzhong Lou, Xiangdong Zhou

Comments: Accepted as an Oral presentation at AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[451] arXiv:2512.06306 [pdf, ps, other]: Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

Authors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xaoming Chen, Weidong Cai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[452] arXiv:2512.06290 [pdf, ps, other]: Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke Classification

Authors: Yiheng Huang, Shuang She, Zewei Wei, Jianmin Lin, Ming Yang, Wenyin Liu

Comments: 17 pages, 5 figures

Journal-ref: ICDAR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[453] arXiv:2512.06282 [pdf, ps, other]: Title: A Sleep Monitoring System Based on Audio, Video and Depth Information

Authors: Lyn Chao-ling Chen, Kuan-Wen Chen, Yi-Ping Hung

Comments: Accepted at the Computer Vision, Graphics and Image Processing conference (CVGIP 2013)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[454] arXiv:2512.06281 [pdf, ps, other]: Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language Models

Authors: Hengzhuang Li, Xinsong Zhang, Qiming Peng, Bin Luo, Han Hu, Dengyang Jiang, Han-Jia Ye, Teng Zhang, Hai Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[455] arXiv:2512.06276 [pdf, ps, other]: Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension

Authors: Tianyi Gao, Hao Li, Han Fang, Xin Wei, Xiaodong Dong, Hongbo Sun, Ye Yuan, Zhongjiang He, Jinglin Xu, Jingmin Xin, Hao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[456] arXiv:2512.06275 [pdf, ps, other]: Title: FacePhys: State of the Heart Learning

Authors: Kegang Wang, Jiankai Tang, Yuntao Wang, Xin Liu, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Daniel McDuff

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[457] arXiv:2512.06269 [pdf, ps, other]: Title: TriaGS: Differentiable Triangulation-Guided Geometric Consistency for 3D Gaussian Splatting

Authors: Quan Tran, Tuan Dang

Comments: 10 pages

Journal-ref: WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[458] arXiv:2512.06258 [pdf, ps, other]: Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMs

Authors: Chaoyang Wang, Yangfan He, Yiyang Zhou, Yixuan Wang, Jiaqi Liu, Peng Xia, Zhengzhong Tu, Mohit Bansal, Huaxiu Yao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[459] arXiv:2512.06255 [pdf, ps, other]: Title: Language-driven Fine-grained Retrieval

Authors: Shijie Wang, Xin Yu, Yadan Luo, Zijian Wang, Pengfei Zhang, Zi Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[460] arXiv:2512.06251 [pdf, ps, other]: Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow Networks

Authors: Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming Zhang

Comments: 12 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[461] arXiv:2512.06232 [pdf, ps, other]: Title: Opinion: Learning Intuitive Physics May Require More than Visual Data

Authors: Ellen Su, Solim Legris, Todd M. Gureckis, Mengye Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[462] arXiv:2512.06230 [pdf, ps, other]: Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis Tracking

Authors: Pranav Balakrishnan, Sidisha Barik, Sean M. O'Rourke, Benjamin M. Marlin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[463] arXiv:2512.06221 [pdf, ps, other]: Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility Study

Authors: Alena Makarova

Comments: 15 pages, 13 figures. Reproducibility study

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[464] arXiv:2512.06206 [pdf, ps, other]: Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated Learning

Authors: Akis Linardos, Sarthak Pati, Ujjwal Baid, Brandon Edwards, Patrick Foley, Kevin Ta, Verena Chung, Micah Sheller, Muhammad Irfan Khan, Mojtaba Jafaritadi, Elina Kontio, Suleiman Khan, Leon Mächler, Ivan Ezhov, Suprosanna Shit, Johannes C. Paetzold, Gustav Grimberg, Manuel A. Nickel, David Naccache, Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni, Daewoon Kim, Leonard L. Klausmann, Prashant Shah, Bjoern Menze, Dimitrios Makris, Spyridon Bakas

Comments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URL

Journal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[465] arXiv:2512.06190 [pdf, ps, other]: Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food Drying

Authors: Shichen Li, Ahmadreza Eslaminia, Chenhui Shao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[466] arXiv:2512.06185 [pdf, ps, other]: Title: SPOOF: Simple Pixel Operations for Out-of-Distribution Fooling

Authors: Ankit Gupta, Christoph Adami, Emily Dolson (Michigan State University)

Comments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[467] arXiv:2512.06179 [pdf, ps, other]: Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light Direction

Authors: Shilin Hu, Jingyi Xu, Sagnik Das, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[468] arXiv:2512.06174 [pdf, ps, other]: Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light Direction

Authors: Shilin Hu, Jingyi Xu, Akshat Dave, Dimitris Samaras, Hieu Le

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[469] arXiv:2512.06171 [pdf, ps, other]: Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect Detection

Authors: Jessica Plassmann, Nicolas Schuler, Michael Schuth, Georg von Freymann

Comments: 11 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[470] arXiv:2512.06158 [pdf, ps, other]: Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model Animation

Authors: Su Sun, Cheng Zhao, Himangi Mittal, Gaurav Mittal, Rohith Kukkala, Yingjie Victor Chen, Mei Chen

Comments: 15 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[471] arXiv:2512.06105 [pdf, ps, other]: Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report Generation

Authors: Junwen Zheng, Xinran Xu, Li Rong Wang, Chang Cai, Lucinda Siyun Tan, Dingyuan Wang, Hong Liang Tey, Xiuyi Fan

Comments: AAAI-26-AIA

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[472] arXiv:2512.06103 [pdf, ps, other]: Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack Detection

Authors: Raghavendra Ramachandra, Sushma Venkatesh

Comments: Accepted in IEEE T-BIOM

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[473] arXiv:2512.06096 [pdf, ps, other]: Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous Driving

Authors: Karthik Mohan, Sonam Singh, Amit Arvind Kale

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[474] arXiv:2512.06080 [pdf, ps, other]: Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce Light

Authors: Tzofi Klinghoffer, Siddharth Somasundaram, Xiaoyu Xiang, Yuchen Fan, Christian Richardt, Akshat Dave, Ramesh Raskar, Rakesh Ranjan

Comments: SIGGRAPH Asia 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[475] arXiv:2512.06065 [pdf, ps, other]: Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

Authors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi Menapace

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[476] arXiv:2512.06058 [pdf, ps, other]: Title: Representation Learning for Point Cloud Understanding

Authors: Siming Yan

Comments: 181 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[477] arXiv:2512.06032 [pdf, ps, other]: Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image Segmentation

Authors: Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[478] arXiv:2512.06024 [pdf, ps, other]: Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensing

Authors: Jiabin Liu, Zihao Zhou, Jialei Yan, Anxin Guo, Alvise Benetazzo, Hui Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
[479] arXiv:2512.06020 [pdf, ps, other]: Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation

Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Ligong Han, Ying Ba, Dimitris N. Metaxas

Comments: Project Page: https://prefgen.github.io/

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[480] arXiv:2512.06014 [pdf, ps, other]: Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets

Authors: Jiho Shin, Dominic Marshall, Matthieu Komorowski

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[481] arXiv:2512.06013 [pdf, ps, other]: Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViT

Authors: Wenhao Li, Chengwei Ma, Weixin Mao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[482] arXiv:2512.06012 [pdf, ps, other]: Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive Manufacturing

Authors: Emmanuel Akeweje, Conall Kirk, Chi-Wai Chan, Denis Dowling, Mimi Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[483] arXiv:2512.06010 [pdf, other]: Title: Fast and Flexible Robustness Certificates for Semantic Segmentation

Authors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[484] arXiv:2512.06006 [pdf, ps, other]: Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow Optimization

Authors: Xuefei (Julie) Wang, Kai A. Horstmann, Ethan Lin, Jonathan Chen, Alexander R. Farhang, Sophia Stiles, Atharva Sehgal, Jonathan Light, David Van Valen, Yisong Yue, Jennifer J. Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[485] arXiv:2512.06003 [pdf, ps, other]: Title: PrunedCaps: A Case For Primary Capsules Discrimination

Authors: Ramin Sharifi, Pouya Shiri, Amirali Baniasadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[486] arXiv:2512.05996 [pdf, ps, other]: Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and Counting

Authors: Yi Liu, Jingyu Song, Vedanth Kallakuri, Katherine A. Skinner

Comments: 18 pages, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
[487] arXiv:2512.05993 [pdf, ps, other]: Title: Domain-Specific Foundation Model Improves AI-Based Analysis of Neuropathology

Authors: Ruchika Verma, Shrishtee Kandoi, Robina Afzal, Shengjia Chen, Jannes Jegminat, Michael W. Karlovich, Melissa Umphlett, Timothy E. Richardson, Kevin Clare, Quazi Hossain, Jorge Samanamud, Phyllis L. Faust, Elan D. Louis, Ann C. McKee, Thor D. Stein, Jonathan D. Cherry, Jesse Mez, Anya C. McGoldrick, Dalilah D. Quintana Mora, Melissa J. Nirenberg, Ruth H. Walker, Yolfrankcis Mendez, Susan Morgello, Dennis W. Dickson, Melissa E. Murray, Carlos Cordon-Cardo, Nadejda M. Tsankova, Jamie M. Walker, Diana K. Dangoor, Stephanie McQuillan, Emma L. Thorn, Claudia De Sanctis, Shuying Li, Thomas J. Fuchs, Kurt Farrell, John F. Crary, Gabriele Campanella

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[488] arXiv:2512.05991 [pdf, ps, other]: Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head

Authors: Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[489] arXiv:2512.05988 [pdf, ps, other]: Title: VG3T: Visual Geometry Grounded Gaussian Transformer

Authors: Junho Kim, Seongwon Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[490] arXiv:2512.05987 [pdf, ps, other]: Title: Adaptive Dataset Quantization: A New Direction for Dataset Pruning

Authors: Chenyue Yu, Jianyu Yu

Comments: Accepted by ICCPR 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[491] arXiv:2512.05969 [pdf, ps, other]: Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's Matrices

Authors: Hokin Deng

Comments: See results (this https URL) and code (this https URL)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[492] arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]: Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs

Authors: Sujoy Nath, Arkaprabha Basu, Sharanya Dasgupta, Swagatam Das

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[493] arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]: Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray Segmentation

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Sharipov Hotam Beknazarovich, Farzona S. Ataeva, Qurbonaliev Alisher, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[494] arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]: Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework

Authors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie Zheng

Subjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[495] arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]: Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning Models

Authors: Shimin Zhang, Xianwei Chen, Yufan Shen, Ziyuan Ye, Jibin Wu

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[496] arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]: Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spaces

Authors: Nikita Gabdullin

Comments: 9 pages, 5 figures, 1 table, 4 equations

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[497] arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]: Title: Human Geometry Distribution for 3D Animation Generation

Authors: Xiangjun Tang, Biao Zhang, Peter Wonka

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[498] arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]: Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World Models

Authors: Chenwei Shi, Xueyu Luan

Comments: 23 pages, 8 figures, 3 tables

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
[499] arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]: Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Authors: Haidong Kang, Jun Du, Lihong Lin

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[500] arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]: Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness Likelihood

Authors: Gilhyun Nam, Taewon Kim, Joonhyun Jeong, Eunho Yang

Comments: Accepted to WACV 2026

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[501] arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]: Title: A Geometric Unification of Concept Learning with Concept Cones

Authors: Alexandre Rocchi--Henry, Thomas Fel, Gianni Franchi

Comments: 22 pages

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[502] arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]: Title: Affine Subspace Models and Clustering for Patch-Based Image Denoising

Authors: Tharindu Wickremasinghe, Marco F. Duarte

Comments: Asilomar Conference on Signals, Systems, and Computers 2025

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[503] arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]: Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty Metrics

Authors: Tianyi Ren, Daniel Low, Pittra Jaengprajak, Juampablo Heras Rivera, Jacob Ruzevick, Mehmet Kurt

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[504] arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]: Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem Solvers

Authors: Jonghyun Park, Jong Chul Ye

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[505] arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]: Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search

Authors: Tanay Arora, Christof Teuscher

Comments: This work is planned for submission to the IEEE for possible publication

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[506] arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]: Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning

Authors: Nithin Sivakumaran, Justin Chih-Yao Chen, David Wan, Yue Zhang, Jaehong Yoon, Elias Stengel-Eskin, Mohit Bansal

Comments: Code: this https URL

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[507] arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]: Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous Driving

Authors: Zebin Xing, Yupeng Zheng, Qichao Zhang, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin Zhao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[508] arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]: Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep Analysis

Authors: Sakib Mostafa, Lei Xing, Md. Tauhidul Islam

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[509] arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]: Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients

Authors: Krishna Arun, Moinak Bhattachrya, Paras Goel

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[510] arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]: Title: VideoVLA: Video Generators Can Be Generalizable Robot Manipulators

Authors: Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining Guo

Comments: Project page: this https URL

Journal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[511] arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]: Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge

Authors: Ilia Larchenko, Gleb Zarin, Akash Karnatak

Comments: 2025 NeurIPS Behavior Challenge 1st place solution

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[512] arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]: Title: Dynamic Visual SLAM using a General 3D Prior

Authors: Xingguang Zhong, Liren Jin, Marija Popović, Jens Behley, Cyrill Stachniss

Comments: 8 pages

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[513] arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]: Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge Devices

Authors: Sepyan Purnama Kristanto, Lutfi Hakim, Hermansyah

Comments: 9 pages, 3 figures, Politeknik Negeri Banyuwangi

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[514] arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]: Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association

Authors: Zhihua Fang, Shumei Tao, Junxu Wang, Liang He

Comments: FAME 2026 Technical Report

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
[515] arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]: Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics

Authors: Nikhil Verma, Joonas Linnosmaa, Espinosa-Leal Leonardo, Napat Vajragupta

Comments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-concept

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[516] arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]: Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG Data

Authors: Lin Yang, Xiang Li, Xin Ma, Xinxin Zhao

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[517] arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]: Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

Authors: Panagiota Kiourti, Anu Singh, Preeti Duraipandian, Weichao Zhou, Wenchao Li

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[518] arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]: Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine Learning

Authors: Camellia Zakaria, Aryan Sadeghi, Weaam Jaafar, Junshi Xu, Alex Mariakakis, Marianne Hatzopoulou

Comments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
[519] arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]: Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network

Authors: Xiao Li

Comments: in Chinese language

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[520] arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]: Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment

Authors: Ruicheng Zhang, Mingyang Zhang, Jun Zhou, Zhangrui Guo, Xiaofan Liu, Zunnan Xu, Zhizhou Zhong, Puxin Yan, Haocheng Luo, Xiu Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[521] arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]: Title: Vector Quantization using Gaussian Variational Autoencoder

Authors: Tongda Xu, Wendi Zheng, Jiajun He, Jose Miguel Hernandez-Lobato, Yan Wang, Ya-Qin Zhang, Jie Tang

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[522] arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]: Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Authors: Xiaojun Jia, Jie Liao, Qi Guo, Teng Ma, Simeng Qin, Ranjie Duan, Tianlin Li, Yihao Huang, Zhitao Zeng, Dongxian Wu, Yiming Li, Wenqi Ren, Xiaochun Cao, Yang Liu

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[523] arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]: Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind Travelers

Authors: Hochul Hwang, Soowan Yang, Jahir Sadik Monon, Nicholas A Giudice, Sunghoon Ivan Lee, Joydeep Biswas, Donghyun Kim

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[524] arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]: Title: Semantic Temporal Single-photon LiDAR

Authors: Fang Li, Tonglin Mu, Shuling Li, Junran Guo, Keyuan Li, Jianing Li, Ziyang Luo, Xiaodong Fan, Ye Chen, Yunfeng Liu, Hong Cai, Lip Ket Chin, Jinbei Zhang, Shihai Sun

Comments: 14 pages, 5 figures. Comments are welcome

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
[525] arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]: Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation

Authors: Azeez Idris, Abdurahman Ali Mohammed, Samuel Fanijo

Comments: NeurIPS Black in AI workshop - 2022

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Mon, 8 Dec 2025 (showing first 13 of 94 entries)

[526] arXiv:2512.05965 [pdf, ps, other]: Title: EditThinker: Unlocking Iterative Reasoning for Any Image Editor

Authors: Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, Xunliang Cai, Linjiang Huang, Hongsheng Li, Si Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[527] arXiv:2512.05960 [pdf, ps, other]: Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image Enhancement

Authors: Munsif Ali, Najmul Hassan, Lucia Ventura, Davide Di Bari, Simonepietro Canese

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[528] arXiv:2512.05941 [pdf, ps, other]: Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI Grounding

Authors: Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong Liu

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[529] arXiv:2512.05937 [pdf, ps, other]: Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV Perception

Authors: Anne Sielemann, Valentin Barner, Stefan Wolf, Masoud Roschani, Jens Ziehn, Juergen Beyerer

Comments: 8 pages, 2 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[530] arXiv:2512.05936 [pdf, ps, other]: Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign Recognition

Authors: Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens Ziehn

Comments: 8 pages, 8 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[531] arXiv:2512.05928 [pdf, ps, other]: Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face Recognition

Authors: Pedro Vidal, Bernardo Biesseck, Luiz E. L. Coelho, Roger Granada, David Menotti

Comments: 18 pages, 17 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[532] arXiv:2512.05927 [pdf, ps, other]: Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Authors: Zhiting Mei, Tenny Yin, Micah Baker, Ola Shorinwa, Anirudha Majumdar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[533] arXiv:2512.05922 [pdf, ps, other]: Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology Segmentation

Authors: Khang Le, Anh Mai Vu, Thi Kim Trang Vo, Ha Thach, Ngoc Bui Lam Quang, Thanh-Huy Nguyen, Minh H. N. Le, Zhu Han, Chandra Mohan, Hien Van Nguyen

Comments: Note: Khang Le and Anh Mai Vu contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[534] arXiv:2512.05920 [pdf, ps, other]: Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery Prediction

Authors: Jiawen Yang, Yihui Cao, Xuanyu Tian, Yuyao Zhang, Hongjiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[535] arXiv:2512.05905 [pdf, ps, other]: Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Authors: Wenhao Yan, Sheng Ye, Zhuoyi Yang, Jiayan Teng, ZhenHui Dong, Kairui Wen, Xiaotao Gu, Yong-Jin Liu, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536] arXiv:2512.05866 [pdf, ps, other]: Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator

Authors: Md. Mahbub Hasan Akash, Aria Tasnim Mridula, Sheekar Banerjee, Ishtiak Al Mamoon

Comments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537] arXiv:2512.05859 [pdf, ps, other]: Title: Edit-aware RAW Reconstruction

Authors: Abhijith Punnappurath, Luxi Zhao, Ke Zhao, Hue Nguyen, Radek Grzeszczuk, Michael S. Brown

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.05853 [pdf, ps, other]: Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack

Authors: Shiji Zhao, Shukun Xiong, Yao Huang, Yan Jin, Zhenyu Wu, Jiyang Guan, Ranjie Duan, Jialing Tao, Hui Xue, Xingxing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)


