Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 28

[ total of 747 entries: 1-100 | 29-128 | 129-228 | 229-328 | 329-428 | ... | 729-747 ]

Mon, 15 Dec 2025 (continued, showing last 76 of 104 entries)

[29] arXiv:2512.11534 [pdf, ps, other]: Title: HFS: Holistic Query-Aware Frame Selection for Efficient Video Reasoning

Authors: Yiqing Yang, Kin-Man Lam

Comments: 18 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[30] arXiv:2512.11524 [pdf, ps, other]: Title: Super-Resolved Canopy Height Mapping from Sentinel-2 Time Series Using LiDAR HD Reference Data across Metropolitan France

Authors: Ekaterina Kalinicheva, Florian Helen, Stéphane Mermoz, Florian Mouret, Milena Planells

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[31] arXiv:2512.11510 [pdf, ps, other]: Title: Reconstruction as a Bridge for Event-Based Visual Question Answering

Authors: Hanyue Lou, Jiayi Zhou, Yang Zhang, Boyu Li, Yi Wang, Guangnan Ye, Boxin Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2512.11508 [pdf, ps, other]: Title: On Geometric Understanding and Learned Data Priors in VGGT

Authors: Jelena Bratulić, Sudhanshu Mittal, Thomas Brox, Christian Rupprecht

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2512.11507 [pdf, ps, other]: Title: SSA3D: Text-Conditioned Assisted Self-Supervised Framework for Automatic Dental Abutment Design

Authors: Mianjie Zheng, Xinquan Yang, Along He, Xuguang Li, Feilie Zhong, Xuefen Liu, Kun Tang, Zhicheng Zhang, Linlin Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2512.11503 [pdf, ps, other]: Title: TSkel-Mamba: Temporal Dynamic Modeling via State Space Model for Human Skeleton-based Action Recognition

Authors: Yanan Liu, Jun Liu, Hao Zhang, Dan Xu, Hossein Rahmani, Mohammed Bennamoun, Qiuhong Ke

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2512.11490 [pdf, ps, other]: Title: VLM2GeoVec: Toward Universal Multimodal Embeddings for Remote Sensing

Authors: Emanuel Sánchez Aimar, Gulnaz Zhambulova, Fahad Shahbaz Khan, Yonghao Xu, Michael Felsberg

Comments: 21 pages, 7 figures, under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[36] arXiv:2512.11480 [pdf, ps, other]: Title: CADMorph: Geometry-Driven Parametric CAD Editing via a Plan-Generate-Verify Loop

Authors: Weijian Ma, Shizhao Sun, Ruiyu Wang, Jiang Bian

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2512.11465 [pdf, ps, other]: Title: DOS: Distilling Observable Softmaps of Zipfian Prototypes for Self-Supervised Point Representation

Authors: Mohamed Abdelsamad, Michael Ulrich, Bin Yang, Miao Zhang, Yakov Miron, Abhinav Valada

Comments: AAAI-26

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[38] arXiv:2512.11464 [pdf, ps, other]: Title: Exploring MLLM-Diffusion Information Transfer with MetaCanvas

Authors: Han Lin, Xichen Pan, Ziqi Huang, Ji Hou, Jialiang Wang, Weifeng Chen, Zecheng He, Felix Juefei-Xu, Junzhe Sun, Zhipeng Fan, Ali Thabet, Mohit Bansal, Chu Wang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[39] arXiv:2512.11458 [pdf, ps, other]: Title: Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation

Authors: Jingmin Zhu, Anqi Zhu, Hossein Rahmani, Jun Liu, Mohammed Bennamoun, Qiuhong Ke

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[40] arXiv:2512.11446 [pdf, ps, other]: Title: YawDD+: Frame-level Annotations for Accurate Yawn Prediction

Authors: Ahmed Mujtaba, Gleb Radchenko, Marc Masana, Radu Prodan

Comments: This paper is submitted at European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2512.11438 [pdf, ps, other]: Title: Flowception: Temporally Expansive Flow Matching for Video Generation

Authors: Tariq Berrada Ifriqi, John Nguyen, Karteek Alahari, Jakob Verbeek, Ricky T. Q. Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[42] arXiv:2512.11423 [pdf, ps, other]: Title: JoyAvatar: Real-time and Infinite Audio-Driven Avatar Generation with Autoregressive Diffusion

Authors: Chaochao Li, Ruikui Wang, Liangbo Zhou, Jinheng Feng, Huaishao Luo, Huan Zhang, Youzheng Wu, Xiaodong He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2512.11401 [pdf, ps, other]: Title: Collaborative Reconstruction and Repair for Multi-class Industrial Anomaly Detection

Authors: Qishan Wang, Haofeng Wang, Shuyong Gao, Jia Guo, Li Xiong, Jiaqi Li, Dengxuan Bai, Wenqiang Zhang

Comments: Accepted to Data Intelligence 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2512.11395 [pdf, ps, other]: Title: FlowDC: Flow-Based Decoupling-Decay for Complex Image Editing

Authors: Yilei Jiang, Zhen Wang, Yanghao Wang, Jun Yu, Yueting Zhuang, Jun Xiao, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2512.11393 [pdf, ps, other]: Title: The N-Body Problem: Parallel Execution from Single-Person Egocentric Video

Authors: Zhifan Zhu, Yifei Huang, Yoichi Sato, Dima Damen

Comments: project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2512.11373 [pdf, ps, other]: Title: Out-of-Distribution Segmentation via Wasserstein-Based Evidential Uncertainty

Authors: Arnold Brosch, Abdelrahman Eldesokey, Michael Felsberg, Kira Maag

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[47] arXiv:2512.11369 [pdf, ps, other]: Title: Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection

Authors: Kuan Wang, Yanjun Qin, Mengge Lu, Liejun Wang, Xiaoming Tao

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2512.11360 [pdf, ps, other]: Title: Reliable Detection of Minute Targets in High-Resolution Aerial Imagery across Temporal Shifts

Authors: Mohammad Sadegh Gholizadeh, Amir Arsalan Rezapour, Hamidreza Shayegh, Ehsan Pazouki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[49] arXiv:2512.11356 [pdf, ps, other]: Title: Prior-Enhanced Gaussian Splatting for Dynamic Scene Reconstruction from Casual Video

Authors: Meng-Li Shih, Ying-Huan Chen, Yu-Lun Liu, Brian Curless

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2512.11354 [pdf, ps, other]: Title: A Multi-Mode Structured Light 3D Imaging System with Multi-Source Information Fusion for Underwater Pipeline Detection

Authors: Qinghan Hu, Haijiang Zhu, Na Sun, Lei Chen, Zhengqiang Fan, Zhiqing Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2512.11350 [pdf, ps, other]: Title: Surveillance Video-Based Traffic Accident Detection Using Transformer Architecture

Authors: Tanu Singh, Pranamesh Chakraborty, Long T. Truong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[52] arXiv:2512.11340 [pdf, ps, other]: Title: Task-Specific Distance Correlation Matching for Few-Shot Action Recognition

Authors: Fei Long, Yao Zhang, Jiaming Lv, Jiangtao Xie, Peihua Li

Comments: 9 pages, 4 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53] arXiv:2512.11336 [pdf, ps, other]: Title: UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models

Authors: Hewen Pan, Cong Wei, Dashuang Liang, Zepeng Huang, Pengfei Gao, Ziqi Zhou, Lulu Xue, Pengfei Yan, Xiaoming Wei, Minghui Li, Shengshan Hu

Comments: 22 pages, 13 figures, technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54] arXiv:2512.11335 [pdf, ps, other]: Title: FreqDINO: Frequency-Guided Adaptation for Generalized Boundary-Aware Ultrasound Image Segmentation

Authors: Yixuan Zhang, Qing Xu, Yue Li, Xiangjian He, Qian Zhang, Mainul Haque, Rong Qu, Wenting Duan, Zhen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55] arXiv:2512.11327 [pdf, ps, other]: Title: Physics-Informed Video Flare Synthesis and Removal Leveraging Motion Independence between Flare and Scene

Authors: Junqiao Wang, Yuanfei Huang, Hua Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56] arXiv:2512.11325 [pdf, ps, other]: Title: MLLM Machine Unlearning via Visual Knowledge Distillation

Authors: Yuhang Wang, Zhenxing Niu, Haoxuan Ji, Guangyu He, Haichang Gao, Gang Hua

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[57] arXiv:2512.11321 [pdf, ps, other]: Title: KeyframeFace: From Text to Expressive Facial Keyframes

Authors: Jingchao Wu, Zejian Kang, Haibo Liu, Yuanchen Fei, Xiangru Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58] arXiv:2512.11319 [pdf, ps, other]: Title: SATMapTR: Satellite Image Enhanced Online HD Map Construction

Authors: Bingyuan Huang, Guanyi Zhao, Qian Xu, Yang Lou, Yung-Hui Li, Jianping Wang

Comments: 9 pages (+ 3 pages of Appendix)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[59] arXiv:2512.11301 [pdf, ps, other]: Title: MultiEgo: A Multi-View Egocentric Video Dataset for 4D Scene Reconstruction

Authors: Bate Li, Houqiang Zhong, Zhengxue Cheng, Qiang Hu, Qiang Wang, Li Song, Wenjun Zhang

Comments: ACM MM 2025 Dataset Track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2512.11296 [pdf, ps, other]: Title: Few-Shot VLM-Based G-Code and HMI Verification in CNC Machining

Authors: Yasaman Hashem Pour, Nazanin Mahjourian, Vinh Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[61] arXiv:2512.11293 [pdf, ps, other]: Title: Autoregressive Video Autoencoder with Decoupled Temporal and Spatial Context

Authors: Cuifeng Shen, Lumin Xu, Xingguo Zhu, Gengdai Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62] arXiv:2512.11284 [pdf, ps, other]: Title: RcAE: Recursive Reconstruction Framework for Unsupervised Industrial Anomaly Detection

Authors: Rongcheng Wu, Hao Zhu, Shiying Zhang, Mingzhe Wang, Zhidong Li, Hui Li, Jianlong Zhou, Jiangtao Cui, Fang Chen, Pingyang Sun, Qiyu Liao, Ye Lin

Comments: 19 pages, 7 figures, to be published in AAAI-26

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2512.11274 [pdf, ps, other]: Title: FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion

Authors: Xiangyang Luo, Qingyu Li, Xiaokun Liu, Wenyu Qin, Miao Yang, Meng Wang, Pengfei Wan, Di Zhang, Kun Gai, Shao-Lun Huang

Comments: AAAI-2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2512.11267 [pdf, ps, other]: Title: Evaluating the Efficacy of Sentinel-2 versus Aerial Imagery in Serrated Tussock Classification

Authors: Rezwana Sultana, Manzur Murshed, Kathryn Sheffield, Singarayer Florentine, Tsz-Kwan Lee, Shyh Wei Teng

Comments: Accepted in Earthsense 2025 (IEEE INTERNATIONAL CONFERENCE ON NEXT-GEN TECHNOLOGIES OF ARTIFICIAL INTELLIGENCE AND GEOSCIENCE REMOTE SENSING)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65] arXiv:2512.11260 [pdf, ps, other]: Title: Do We Need Reformer for Vision? An Experimental Comparison with Vision Transformers

Authors: Ali El Bellaj, Mohammed-Amine Cheddadi, Rhassan Berber

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2512.11253 [pdf, ps, other]: Title: PersonaLive! Expressive Portrait Image Animation for Live Streaming

Authors: Zhiyuan Li, Chi-Man Pun, Chen Fang, Jue Wang, Xiaodong Cun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67] arXiv:2512.11239 [pdf, ps, other]: Title: Cross-modal Prompting for Balanced Incomplete Multi-modal Emotion Recognition

Authors: Wen-Jue He, Xiaofeng Zhu, Zheng Zhang

Comments: Accepted by AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68] arXiv:2512.11237 [pdf, ps, other]: Title: WildCap: Facial Appearance Capture in the Wild via Hybrid Inverse Rendering

Authors: Yuxuan Han, Xin Ming, Tianxiao Li, Zhuofan Shen, Qixuan Zhang, Lan Xu, Feng Xu

Comments: Technical report. project page: this https URL; code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[69] arXiv:2512.11234 [pdf, ps, other]: Title: RoomPilot: Controllable Synthesis of Interactive Indoor Environments via Multimodal Semantic Parsing

Authors: Wentang Chen, Shougao Zhang, Yiman Zhang, Tianhao Zhou, Ruihui Li

Comments: 20 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2512.11229 [pdf, ps, other]: Title: REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation

Authors: Haotian Wang, Yuzhe Weng, Xinyi Yu, Jun Du, Haoran Xu, Xiaoyan Wu, Shan He, Bing Yin, Cong Liu, Qingfeng Liu

Comments: 10 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[71] arXiv:2512.11226 [pdf, ps, other]: Title: FutureX: Enhance End-to-End Autonomous Driving via Latent Chain-of-Thought World Model

Authors: Hongbin Lin, Yiming Yang, Yifan Zhang, Chaoda Zheng, Jie Feng, Sheng Wang, Zhennan Wang, Shijia Chen, Boyang Wang, Yu Zhang, Xianming Liu, Shuguang Cui, Zhen Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2512.11225 [pdf, ps, other]: Title: VFMF: World Modeling by Forecasting Vision Foundation Model Features

Authors: Gabrijel Boduljak, Yushi Lan, Christian Rupprecht, Andrea Vedaldi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[73] arXiv:2512.11215 [pdf, ps, other]: Title: SmokeBench: Evaluating Multimodal Large Language Models for Wildfire Smoke Detection

Authors: Tianye Qi, Weihao Li, Nick Barnes

Comments: Accepted to WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2512.11203 [pdf, ps, other]: Title: AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path

Authors: Zhengyang Yu, Akio Hayakawa, Masato Ishii, Qingtao Yu, Takashi Shibuya, Jing Zhang, Yuki Mitsufuji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[75] arXiv:2512.11199 [pdf, ps, other]: Title: CADKnitter: Compositional CAD Generation from Text and Geometry Guidance

Authors: Tri Le, Khang Nguyen, Baoru Huang, Tung D. Ta, Anh Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[76] arXiv:2512.11189 [pdf, ps, other]: Title: Multi-task Learning with Extended Temporal Shift Module for Temporal Action Localization

Authors: Anh-Kiet Duong, Petra Gomez-Krämer

Comments: BinEgo360@ICCV25

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2512.11186 [pdf, ps, other]: Title: Lightweight 3D Gaussian Splatting Compression via Video Codec

Authors: Qi Yang, Geert Van Der Auwera, Zhu Li

Comments: Accepted by DCC2026 Oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2512.11167 [pdf, ps, other]: Title: Image Tiling for High-Resolution Reasoning: Balancing Local Detail with Global Context

Authors: Anatole Jacquin de Margerie, Alexis Roger, Irina Rish

Comments: Accepted in AAAI 2025 Workshop on Reproducible AI

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[79] arXiv:2512.11141 [pdf, ps, other]: Title: Learning complete and explainable visual representations from itemized text supervision

Authors: Yiwei Lyu, Chenhui Zhao, Soumyanil Banerjee, Shixuan Liu, Akshay Rao, Akhil Kondepudi, Honglak Lee, Todd C. Hollon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[80] arXiv:2512.11130 [pdf, ps, other]: Title: Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching

Authors: Bowen Wen, Shaurya Dewan, Stan Birchfield

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[81] arXiv:2512.11121 [pdf, ps, other]: Title: Learning from a Generative Oracle: Domain Adaptation for Restoration

Authors: Yuyang Hu, Mojtaba Sahraee-Ardakan, Arpit Bansal, Kangfu Mei, Christian Qi, Peyman Milanfar, Mauricio Delbracio

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[82] arXiv:2512.11104 [pdf, ps, other]: Title: Information-driven Fusion of Pathology Foundation Models for Enhanced Disease Characterization

Authors: Brennan Flannery, Thomas DeSilvio, Jane Nguyen, Satish E. Viswanath

Comments: 29 Pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2512.11099 [pdf, ps, other]: Title: VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction

Authors: Weitai Kang, Jason Kuen, Mengwei Ren, Zijun Wei, Yan Yan, Kangning Liu

Comments: 8 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84] arXiv:2512.11098 [pdf, ps, other]: Title: Vision-Language Models for Infrared Industrial Sensing in Additive Manufacturing Scene Description

Authors: Nazanin Mahjourian, Vinh Nguyen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[85] arXiv:2512.11076 [pdf, ps, other]: Title: E-CHUM: Event-based Cameras for Human Detection and Urban Monitoring

Authors: Jack Brady, Andrew Dailey, Kristen Schang, Zo Vic Shong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[86] arXiv:2512.11061 [pdf, ps, other]: Title: VDAWorld: World Modelling via VLM-Directed Abstraction and Simulation

Authors: Felix O'Mahony, Roberto Cipolla, Ayush Tewari

Comments: Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[87] arXiv:2512.11060 [pdf, ps, other]: Title: Synthetic Vasculature and Pathology Enhance Vision-Language Model Reasoning

Authors: Chenjun Li, Cheng Wan, Laurin Lux, Alexander Berger, Richard B. Rosen, Martin J. Menten, Johannes C. Paetzold

Comments: 23 pages, 8 figures, 6 tables. Full paper under review for MIDL 2026 (Medical Imaging with Deep Learning)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88] arXiv:2512.11057 [pdf, ps, other]: Title: Weakly Supervised Tuberculosis Localization in Chest X-rays through Knowledge Distillation

Authors: Marshal Ashif Shawkat, Moidul Hasan, Taufiq Hasan

Comments: 18 pages, 9 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[89] arXiv:2512.11016 [pdf, ps, other]: Title: SoccerMaster: A Vision Foundation Model for Soccer Understanding

Authors: Haolin Yang, Jiayuan Rao, Haoning Wu, Weidi Xie

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[90] arXiv:2512.11015 [pdf, ps, other]: Title: Leveraging Text Guidance for Enhancing Demographic Fairness in Gender Classification

Authors: Anoop Krishnan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[91] arXiv:2512.11797 (cross-list from cs.RO) [pdf, ps, other]: Title: AnchorDream: Repurposing Video Diffusion for Embodiment-Aware Robot Data Synthesis

Authors: Junjie Ye, Rong Xue, Basile Van Hoorick, Pavel Tokmakov, Muhammad Zubair Irshad, Yue Wang, Vitor Guizilini

Comments: Project page: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[92] arXiv:2512.11745 (cross-list from eess.IV) [pdf, ps, other]: Title: mViSE: A Visual Search Engine for Analyzing Multiplex IHC Brain Tissue Images

Authors: Liqiang Huang, Rachel W. Mills, Saikiran Mandula, Lin Bai, Mahtab Jeyhani, John Redell, Hien Van Nguyen, Saurabh Prasad, Dragan Maric, Badrinath Roysam

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2512.11695 (cross-list from physics.flu-dyn) [pdf, ps, other]: Title: Particle Image Velocimetry Refinement via Consensus ADMM

Authors: Alan Bonomi, Francesco Banelli, Antonio Terpin

Comments: Code: this https URL

Subjects: Fluid Dynamics (physics.flu-dyn); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Optimization and Control (math.OC)
[94] arXiv:2512.11676 (cross-list from math.PR) [pdf, ps, other]: Title: Stochastics of shapes and Kunita flows

Authors: Stefan Sommer, Gefan Yang, Elizabeth Louise Baker

Subjects: Probability (math.PR); Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2512.11582 (cross-list from cs.LG) [pdf, ps, other]: Title: Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model

Authors: Sam Gijsen, Marc-Andre Schulz, Kerstin Ritter

Comments: Code and pretrained models available at this https URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
[96] arXiv:2512.11532 (cross-list from cs.DC) [pdf, ps, other]: Title: Parallax: Runtime Parallelization for Operator Fallbacks in Heterogeneous Edge Systems

Authors: Chong Tang, Hao Dai, Jagmohan Chauhan

Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2512.11433 (cross-list from cs.AI) [pdf, other]: Title: Back to the Baseline: Examining Baseline Effects on Explainability Metrics

Authors: Agustin Martin Picard (ANITI), Thibaut Boissin (ANITI), Varshini Subhash, Rémi Cadène (SU), Thomas Fel (ANITI)

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2512.11399 (cross-list from cs.CL) [pdf, ps, other]: Title: Minimal Clips, Maximum Salience: Long Video Summarization via Key Moment Extraction

Authors: Galann Pennec, Zhengyuan Liu, Nicholas Asher, Philippe Muller, Nancy F. Chen

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2512.11243 (cross-list from cs.LG) [pdf, ps, other]: Title: Task-Aware Multi-Expert Architecture For Lifelong Deep Learning

Authors: Jianyu Wang, Jacob Nean-Hua Sheikh, Cat P. Le, Hoda Bidkhori

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2512.11218 (cross-list from cs.RO) [pdf, ps, other]: Title: Seeing to Act, Prompting to Specify: A Bayesian Factorization of Vision Language Action Policy

Authors: Kechun Xu, Zhenjie Zhu, Anzhe Chen, Shuqi Zhao, Qing Huang, Yifei Yang, Haojian Lu, Rong Xiong, Masayoshi Tomizuka, Yue Wang

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[101] arXiv:2512.11194 (cross-list from cs.LG) [pdf, ps, other]: Title: Beyond Memorization: Gradient Projection Enables Selective Learning in Diffusion Models

Authors: Divya Kothandaraman, Jaclyn Pytlarz

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2512.11145 (cross-list from cs.LG) [pdf, ps, other]: Title: Autoencoder-based Semi-Supervised Dimensionality Reduction and Clustering for Scientific Ensembles

Authors: Lennard Manuel, Hamid Gadirov, Steffen Frey

Comments: Research Internship Project

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2512.11047 (cross-list from cs.RO) [pdf, ps, other]: Title: WholeBodyVLA: Towards Unified Latent VLA for Whole-Body Loco-Manipulation Control

Authors: Haoran Jiang, Jin Chen, Qingwen Bu, Li Chen, Modi Shi, Yanjie Zhang, Delong Li, Chuanzhe Suo, Chuang Wang, Zhihui Peng, Hongyang Li

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2512.10966 (cross-list from cs.LG) [pdf, ps, other]: Title: Multimodal Fusion of Regional Brain Experts for Interpretable Alzheimer's Disease Diagnosis

Authors: Farica Zhuang, Dinara Aliyeva, Shu Yang, Zixuan Wen, Duy Duong-Tran, Christos Davatzikos, Tianlong Chen, Song Wang, Li Shen

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Fri, 12 Dec 2025 (showing first 24 of 118 entries)

[105] arXiv:2512.10959 [pdf, ps, other]: Title: StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space

Authors: Tjark Behrens, Anton Obukhov, Bingxin Ke, Fabio Tosi, Matteo Poggi, Konrad Schindler

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2512.10958 [pdf, ps, other]: Title: WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Authors: Ao Liang, Lingdong Kong, Tianyi Yan, Hongsi Liu, Wesley Yang, Ziqi Huang, Wei Yin, Jialong Zuo, Yixuan Hu, Dekai Zhu, Dongyue Lu, Youquan Liu, Guangfeng Jiang, Linfeng Li, Xiangtai Li, Long Zhuo, Lai Xing Ng, Benoit R. Cottereau, Changxin Gao, Liang Pan, Wei Tsang Ooi, Ziwei Liu

Comments: Preprint; 80 pages, 37 figures, 29 tables; Project Page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[107] arXiv:2512.10957 [pdf, ps, other]: Title: SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model

Authors: Yukai Shi, Weiyu Li, Zihao Wang, Hongyang Li, Xingyu Chen, Ping Tan, Lei Zhang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108] arXiv:2512.10956 [pdf, ps, other]: Title: Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision

Authors: Wentao Zhou, Xuweiyi Chen, Vignesh Rajagopal, Jeffrey Chen, Rohan Chandra, Zezhou Cheng

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2512.10955 [pdf, ps, other]: Title: Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

Authors: Tsai-Shien Chen, Aliaksandr Siarohin, Guocheng Gordon Qian, Kuan-Chieh Jackson Wang, Egor Nemchinov, Moayed Haji-Ali, Riza Alp Guler, Willi Menapace, Ivan Skorokhodov, Anil Kag, Jun-Yan Zhu, Sergey Tulyakov

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[110] arXiv:2512.10954 [pdf, ps, other]: Title: Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration

Authors: Sicheng Mo, Thao Nguyen, Richard Zhang, Nick Kolkin, Siddharth Srinivasan Iyer, Eli Shechtman, Krishna Kumar Singh, Yong Jae Lee, Bolei Zhou, Yuheng Li

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[111] arXiv:2512.10950 [pdf, ps, other]: Title: E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training

Authors: Qitao Zhao, Hao Tan, Qianqian Wang, Sai Bi, Kai Zhang, Kalyan Sunkavalli, Shubham Tulsiani, Hanwen Jiang

Comments: Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112] arXiv:2512.10949 [pdf, ps, other]: Title: Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation

Authors: Yiwen Tang, Zoey Guo, Kaixin Zhu, Ray Zhang, Qizhi Chen, Dongzhi Jiang, Junli Liu, Bohan Zeng, Haoming Song, Delin Qu, Tianyi Bai, Dan Xu, Wentao Zhang, Bin Zhao

Comments: Code is released at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[113] arXiv:2512.10948 [pdf, ps, other]: Title: ClusIR: Towards Cluster-Guided All-in-One Image Restoration

Authors: Shengkai Hu, Jiaqi Ma, Jun Wan, Wenwen Min, Yongcheng Jing, Lefei Zhang, Dacheng Tao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2512.10947 [pdf, ps, other]: Title: Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving

Authors: Jiawei Yang, Ziyu Chen, Yurong You, Yan Wang, Yiming Li, Yuxiao Chen, Boyi Li, Boris Ivanovic, Marco Pavone, Yue Wang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2512.10945 [pdf, ps, other]: Title: MeViS: A Multi-Modal Dataset for Referring Motion Expression Video Segmentation

Authors: Henghui Ding, Chang Liu, Shuting He, Kaining Ying, Xudong Jiang, Chen Change Loy, Yu-Gang Jiang

Comments: IEEE TPAMI, Project Page: this https URL

Journal-ref: in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 12, pp. 11400-11416, 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2512.10943 [pdf, ps, other]: Title: AlcheMinT: Fine-grained Temporal Control for Multi-Reference Consistent Video Generation

Authors: Sharath Girish, Viacheslav Ivanov, Tsai-Shien Chen, Hao Chen, Aliaksandr Siarohin, Sergey Tulyakov

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[117] arXiv:2512.10942 [pdf, ps, other]: Title: VL-JEPA: Joint Embedding Predictive Architecture for Vision-language

Authors: Delong Chen, Mustafa Shukor, Theo Moutakanni, Willy Chung, Jade Yu, Tejaswi Kasarla, Allen Bolourchi, Yann LeCun, Pascale Fung

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2512.10941 [pdf, ps, other]: Title: Mull-Tokens: Modality-Agnostic Latent Thinking

Authors: Arijit Ray, Ahmed Abdelkader, Chengzhi Mao, Bryan A. Plummer, Kate Saenko, Ranjay Krishna, Leonidas Guibas, Wen-Sheng Chu

Comments: Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[119] arXiv:2512.10940 [pdf, ps, other]: Title: OmniView: An All-Seeing Diffusion Model for 3D and 4D View Synthesis

Authors: Xiang Fan, Sharath Girish, Vivek Ramanujan, Chaoyang Wang, Ashkan Mirzaei, Petr Sushko, Aliaksandr Siarohin, Sergey Tulyakov, Ranjay Krishna

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[120] arXiv:2512.10939 [pdf, ps, other]: Title: GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting

Authors: Madhav Agarwal, Mingtian Zhang, Laura Sevilla-Lara, Steven McDonagh

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2512.10935 [pdf, ps, other]: Title: Any4D: Unified Feed-Forward Metric 4D Reconstruction

Authors: Jay Karhade, Nikhil Keetha, Yuchen Zhang, Tanisha Gupta, Akash Sharma, Sebastian Scherer, Deva Ramanan

Comments: Project Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[122] arXiv:2512.10932 [pdf, ps, other]: Title: BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models

Authors: Shengao Wang, Wenqi Wang, Zecheng Wang, Max Whitton, Michael Wakeham, Arjun Chandra, Joey Huang, Pengyue Zhu, Helen Chen, David Li, Jeffrey Li, Shawn Li, Andrew Zagula, Amy Zhao, Andrew Zhu, Sayaka Nakamura, Yuki Yamamoto, Jerry Jun Yokono, Aaron Mueller, Bryan A. Plummer, Kate Saenko, Venkatesh Saligrama, Boqing Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[123] arXiv:2512.10927 [pdf, ps, other]: Title: FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Authors: Yulu Gan, Ligeng Zhu, Dandan Shan, Baifeng Shi, Hongxu Yin, Boris Ivanovic, Song Han, Trevor Darrell, Jitendra Malik, Marco Pavone, Boyi Li

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2512.10894 [pdf, ps, other]: Title: DuetSVG: Unified Multimodal SVG Generation with Internal Visual Guidance

Authors: Peiying Zhang, Nanxuan Zhao, Matthew Fisher, Yiran Xu, Jing Liao, Difan Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2512.10888 [pdf, ps, other]: Title: PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction

Authors: Brandon Smock, Valerie Faucon-Morin, Max Sokolov, Libin Liang, Tayyibah Khanam, Maury Courtland

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2512.10881 [pdf, ps, other]: Title: MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos

Authors: Kehong Gong, Zhengyu Wen, Weixia He, Mingxi Xu, Qi Wang, Ning Zhang, Zhengyu Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2512.10867 [pdf, ps, other]: Title: From Macro to Micro: Benchmarking Microscopic Spatial Intelligence on Molecules via Vision-Language Models

Authors: Zongzhao Li, Xiangzhe Kong, Jiahui Su, Zongyang Ma, Mingze Li, Songyou Li, Yuelin Zhang, Yu Rong, Tingyang Xu, Deli Zhao, Wenbing Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2512.10863 [pdf, ps, other]: Title: MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

Authors: Jingli Lin, Runsen Xu, Shaohao Zhu, Sihan Yang, Peizhou Cao, Yunlong Ran, Miao Hu, Chenming Zhu, Yiman Xie, Yilin Long, Wenbo Hu, Dahua Lin, Tai Wang, Jiangmiao Pang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
