Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 536

[ total of 759 entries: 1-100 | ... | 237-336 | 337-436 | 437-536 | 537-636 | 637-736 | 737-759 ]

Thu, 4 Dec 2025 (continued, showing last 82 of 130 entries)

[537] arXiv:2512.03667 [pdf, ps, other]: Title: Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning

Authors: Ge-Peng Ji, Jingyi Liu, Deng-Ping Fan, Nick Barnes

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.03666 [pdf, ps, other]: Title: ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos

Authors: Qi'ao Xu, Tianwen Qian, Yuqian Fu, Kailing Li, Yang Jiao, Jiacheng Zhang, Xiaoling Wang, Liang He

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[539] arXiv:2512.03663 [pdf, ps, other]: Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Authors: Salim Khazem

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.03643 [pdf, ps, other]: Title: Optical Context Compression Is Just (Bad) Autoencoding

Authors: Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541] arXiv:2512.03640 [pdf, ps, other]: Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms

Authors: Jiahao Zhang, Xiao Zhao, Guangyu Gao

Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[542] arXiv:2512.03625 [pdf, ps, other]: Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Authors: Zhigang Yang, Yuan Liu, Jiawei Zhang, Puning Zhang, Xinqiang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543] arXiv:2512.03621 [pdf, ps, other]: Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[544] arXiv:2512.03619 [pdf, ps, other]: Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.03601 [pdf, ps, other]: Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Authors: Haoran Zhou, Gim Hee Lee

Comments: Accepted to NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.03598 [pdf, ps, other]: Title: Memory-Guided Point Cloud Completion for Dental Reconstruction

Authors: Jianan Sun, Yukang Huang, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547] arXiv:2512.03597 [pdf, ps, other]: Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation

Authors: Fuchen Zheng, Xinyi Chen, Weixuan Li, Quanjun Li, Junhua Zhou, Xiaojiao Guo, Xuhang Chen, Chi-Man Pun, Shoujun Zhou

Comments: 6 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.03593 [pdf, ps, other]: Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures

Authors: David Svitov, Pietro Morerio, Lourdes Agapito, Alessio Del Bue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.03592 [pdf, ps, other]: Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Authors: Guang Yang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.03590 [pdf, ps, other]: Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation

Authors: Yuchen Deng, Xiuyang Wu, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551] arXiv:2512.03580 [pdf, ps, other]: Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Authors: Malte Bleeker, Mauro Gotsch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[552] arXiv:2512.03577 [pdf, ps, other]: Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Authors: Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.03575 [pdf, ps, other]: Title: UniComp: Rethinking Video Compression Through Informational Uniqueness

Authors: Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.03574 [pdf, ps, other]: Title: Global-Local Aware Scene Text Editing

Authors: Fuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[555] arXiv:2512.03566 [pdf, ps, other]: Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Authors: Hao Sun, Lei Fan, Donglin Di, Shaohui Liu

Comments: Accepted by ACM MM Asia 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[556] arXiv:2512.03558 [pdf, ps, other]: Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding

Authors: Huy Quang Ung, Guillaume Habault, Yasutaka Nishimura, Hao Niu, Roberto Legaspi, Tomoki Oya, Ryoichi Kojima, Masato Taya, Chihiro Ono, Atsunori Minamikawa, Yan Liu

Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[557] arXiv:2512.03553 [pdf, ps, other]: Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Authors: Wei Chee Yew, Hailun Xu, Sanjay Saha, Xiaotian Fan, Hiok Hian Ong, David Yuchen Wang, Kanchan Sarkar, Zhenheng Yang, Danhui Guan

Comments: Accepted at KDD 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[558] arXiv:2512.03542 [pdf, ps, other]: Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention

Authors: Nan Sun, Zhenyu Zhang, Xixun Lin, Kun Wang, Yanmin Shang, Naibin Gu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang, Yanan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[559] arXiv:2512.03540 [pdf, ps, other]: Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Authors: Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng

Comments: Accepted by ACM Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[560] arXiv:2512.03534 [pdf, ps, other]: Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Authors: Subin Kim, Sangwoo Mo, Mamshad Nayeem Rizve, Yiran Xu, Difan Liu, Jinwoo Shin, Tobias Hinz

Comments: Visualizations are available at the website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[561] arXiv:2512.03532 [pdf, ps, other]: Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

Authors: Zhishan Zhou, Siyuan Wei, Zengran Wang, Chunjie Wang, Xiaosheng Yan, Xiao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.03520 [pdf, ps, other]: Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

Authors: Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563] arXiv:2512.03510 [pdf, ps, other]: Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving

Authors: Zhijian Qiao, Zehuan Yu, Tong Li, Chih-Chung Chou, Wenchao Ding, Shaojie Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[564] arXiv:2512.03509 [pdf, ps, other]: Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model

Authors: Kwaku Opoku-Ware, Gideon Opoku

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565] arXiv:2512.03508 [pdf, ps, other]: Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Authors: Seogkyu Jeon, Kibeom Hong, Hyeran Byun

Comments: ICCV 2025 (poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566] arXiv:2512.03500 [pdf, ps, other]: Title: EEA: Exploration-Exploitation Agent for Long Video Understanding

Authors: Te Yang, Xiangyu Zhu, Bo Wang, Quan Chen, Peng Jiang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.03499 [pdf, ps, other]: Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

Authors: Renqi Chen, Haoyang Su, Shixiang Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[568] arXiv:2512.03479 [pdf, ps, other]: Title: Towards Object-centric Understanding for Instructional Videos

Authors: Wenliang Guo, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.03477 [pdf, ps, other]: Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

Authors: Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Comments: 10 pages, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[570] arXiv:2512.03474 [pdf, ps, other]: Title: Procedural Mistake Detection via Action Effect Modeling

Authors: Wenliang Guo, Yujiang Pu, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.03470 [pdf, ps, other]: Title: Difference Decomposition Networks for Infrared Small Target Detection

Authors: Chen Hu, Mingyu Zhou, Shuai Yuan, Hongbo Hu, Xiangyu Qiu, Junhai Luo, Tian Pu, Xiyin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.03463 [pdf, ps, other]: Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models

Authors: Shojiro Yamabe, Futa Waseda, Daiki Shiono, Tsubasa Takahashi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[573] arXiv:2512.03454 [pdf, ps, other]: Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

Authors: Haicheng Liao, Huanming Shen, Bonan Wang, Yongkang Li, Yihong Tang, Chengyue Wang, Dingyi Zhuang, Kehua Chen, Hai Yang, Chengzhong Xu, Zhenning Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[574] arXiv:2512.03453 [pdf, ps, other]: Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model

Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575] arXiv:2512.03451 [pdf, ps, other]: Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Authors: Zhiye Song, Steve Dai, Ben Keller, Brucek Khailany

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[576] arXiv:2512.03450 [pdf, ps, other]: Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[577] arXiv:2512.03449 [src]: Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis

Authors: Tongxu Zhang

Comments: The manuscript represents only a preliminary and substantially incompleted exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore this version is withdrawn and should not be cited

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[578] arXiv:2512.03445 [pdf, ps, other]: Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation

Authors: Xieji Li, Siyuan Yan, Yingsheng Liu, H. Peter Soyer, Monika Janda, Victoria Mar, Zongyuan Ge

Comments: 10 pages. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579] arXiv:2512.03430 [pdf, ps, other]: Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features

Authors: Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580] arXiv:2512.03427 [pdf, ps, other]: Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581] arXiv:2512.03424 [pdf, ps, other]: Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding

Authors: Bin Liu, Chunyang Wang, Xuelian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[582] arXiv:2512.03418 [pdf, ps, other]: Title: YOLOA: Real-Time Affordance Detection via LLM Adapter

Authors: Yuqi Ji, Junjie Ke, Lihuo He, Jun Liu, Kaifan Zhang, Yu-Kun Lai, Guiguang Ding, Xinbo Gao

Comments: 13 pages, 9 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[583] arXiv:2512.03405 [pdf, ps, other]: Title: ViDiC: Video Difference Captioning

Authors: Jiangtao Wu, Shihao Li, Zhaozhou Bian, Jialu Chen, Runzhe Wen, An Ping, Yiwen He, Jiakai Wang, Yuanxing Zhang, Jiaheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[584] arXiv:2512.03404 [pdf, ps, other]: Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification

Authors: Yujian Zhao, Hankun Liu, Guanglin Niu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585] arXiv:2512.03370 [pdf, ps, other]: Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

Authors: Lingjun Zhao, Yandong Luo, James Hay, Lu Gan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[586] arXiv:2512.03369 [pdf, ps, other]: Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting

Authors: Nan Zhou, Huandong Wang, Jiahao Li, Han Li, Yali Song, Qiuhua Wang, Yong Li, Xinlei Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[587] arXiv:2512.03359 [pdf, ps, other]: Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM

Authors: Md Rashidul Islam, Bakary Gibba, Altagi Abdallah Bakheit Abdelgadir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588] arXiv:2512.03350 [pdf, ps, other]: Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Authors: Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[589] arXiv:2512.03346 [pdf, ps, other]: Title: Hierarchical Attention for Sparse Volumetric Anomaly Detection in Subclinical Keratoconus

Authors: Lynn Kandakji, William Woof, Nikolas Pontikos

Comments: 16 pages, 7 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[590] arXiv:2512.03345 [pdf, ps, other]: Title: HalluGen: Synthesizing Realistic and Controllable Hallucinations for Evaluating Image Restoration

Authors: Seunghoi Kim, Henry F. J. Tregidgo, Chen Jin, Matteo Figini, Daniel C. Alexander

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[591] arXiv:2512.03339 [pdf, ps, other]: Title: ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography

Authors: Yeganeh Ghamary, Victoria Wu, Hooman Vaseli, Christina Luong, Teresa Tsang, Siavash Bigdeli, Purang Abolmaesumi

Comments: 11 pages, Accepted in IMIMIC Workshop at MICCAI 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[592] arXiv:2512.03335 [pdf, ps, other]: Title: Step-by-step Layered Design Generation

Authors: Faizan Farooq Khan, K J Joseph, Koustava Goswami, Mohamed Elhoseiny, Balaji Vasan Srinivasan

Journal-ref: AAAI 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[593] arXiv:2512.03317 [pdf, ps, other]: Title: NavMapFusion: Diffusion-based Fusion of Navigation Maps for Online Vectorized HD Map Construction

Authors: Thomas Monninger, Zihan Zhang, Steffen Staab, Sihao Ding

Comments: Accepted to 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[594] arXiv:2512.03284 [pdf, ps, other]: Title: SpatialReasoner: Active Perception for Large-Scale 3D Scene Understanding

Authors: Hongpei Zheng, Shijie Li, Yanran Li, Hujun Yin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[595] arXiv:2512.03257 [pdf, ps, other]: Title: PyroFocus: A Deep Learning Approach to Real-Time Wildfire Detection in Multispectral Remote Sensing Imagery

Authors: Mark Moussa, Andre Williams, Seth Roffe, Douglas Morton

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[596] arXiv:2512.03247 [pdf, ps, other]: Title: PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement

Authors: Haitian Zheng, Yuan Yao, Yongsheng Yu, Yuqian Zhou, Jiebo Luo, Zhe Lin

Comments: Published in the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[597] arXiv:2512.03245 [pdf, ps, other]: Title: 2-Shots in the Dark: Low-Light Denoising with Minimal Data Acquisition

Authors: Liying Lu, Raphaël Achddou, Sabine Süsstrunk

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[598] arXiv:2512.03237 [pdf, ps, other]: Title: LLM-Guided Material Inference for 3D Point Clouds

Authors: Nafiseh Izadyar, Teseo Schneider

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[599] arXiv:2512.03233 [pdf, ps, other]: Title: Object Counting with GPT-4o and GPT-5: A Comparative Study

Authors: Richard Füzesséry, Kaziwa Saleh, Sándor Szénási, Zoltán Vámossy

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[600] arXiv:2512.03210 [pdf, ps, other]: Title: Flux4D: Flow-based Unsupervised 4D Reconstruction

Authors: Jingkang Wang, Henry Che, Yun Chen, Ze Yang, Lily Goli, Sivabalan Manivasagam, Raquel Urtasun

Comments: NeurIPS 2025. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[601] arXiv:2512.03199 [pdf, ps, other]: Title: Does Head Pose Correction Improve Biometric Facial Recognition?

Authors: Justin Norman, Hany Farid

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[602] arXiv:2512.03182 [pdf, ps, other]: Title: Drainage: A Unifying Framework for Addressing Class Uncertainty

Authors: Yasser Taha, Grégoire Montavon, Nils Körber

Comments: 16 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[603] arXiv:2512.03126 [pdf, ps, other]: Title: Hierarchical Process Reward Models are Symbolic Vision Learners

Authors: Shan Zhang, Aotian Chen, Kai Zou, Jindong Gu, Yuan Xue, Anton van den Hengel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[604] arXiv:2512.04076 (cross-list from cs.GR) [pdf, ps, other]: Title: Radiance Meshes for Volumetric Reconstruction

Authors: Alexander Mai, Trevor Hedstrom, George Kopanas, Janne Kontkanen, Falko Kuester, Jonathan T. Barron

Comments: Website: half-potato.gitlab.io/rm

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[605] arXiv:2512.04032 (cross-list from cs.CL) [pdf, ps, other]: Title: Jina-VLM: Small Multilingual Vision Language Model

Authors: Andreas Koukounas, Georgios Mastrapas, Florian Hönicke, Sedigheh Eslami, Guillaume Roncari, Scott Martens, Han Xiao

Comments: 18 pages, 1-7 main content, 13-18 appendix for tables and dataset

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[606] arXiv:2512.03995 (cross-list from cs.RO) [pdf, ps, other]: Title: Artificial Microsaccade Compensation: Stable Vision for an Ornithopter

Authors: Levi Burner, Guido de Croon, Yiannis Aloimonos

Comments: 29 pages, 5 figures, 2 tables, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[607] arXiv:2512.03962 (cross-list from eess.IV) [pdf, ps, other]: Title: Tada-DIP: Input-adaptive Deep Image Prior for One-shot 3D Image Reconstruction

Authors: Evan Bell, Shijun Liang, Ismail Alkhouri, Saiprasad Ravishankar

Comments: 6 pages, 8 figures, 2025 Asilomar Conference on Signals, Systems, and Computers. Code is available at github.com/evanbell02/Tada-DIP/

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[608] arXiv:2512.03656 (cross-list from cs.LG) [pdf, ps, other]: Title: Cyclical Temporal Encoding and Hybrid Deep Ensembles for Multistep Energy Forecasting

Authors: Salim Khazem, Houssam Kanso

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[609] arXiv:2512.03556 (cross-list from cs.RO) [pdf, ps, other]: Title: RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL

Authors: Yinzhou Tang, Yu Shang, Yinuo Chen, Bingwen Wei, Xin Zhang, Shu'ang Yu, Liangzhi Shi, Chao Yu, Chen Gao, Wei Wu, Yong Li

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[610] arXiv:2512.03522 (cross-list from cs.RO) [pdf, ps, other]: Title: MSG-Loc: Multi-Label Likelihood-based Semantic Graph Matching for Object-Level Global Localization

Authors: Gihyeon Lee, Jungwoo Lee, Juwon Kim, Young-Sik Shin, Younggun Cho

Comments: Accepted in IEEE Robotics and Automation Letters (2025)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[611] arXiv:2512.03514 (cross-list from cs.IR) [pdf, ps, other]: Title: M3DR: Towards Universal Multilingual Multimodal Document Retrieval

Authors: Adithya S Kolavi, Vyoman Jain

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[612] arXiv:2512.03422 (cross-list from cs.RO) [pdf, ps, other]: Title: What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models

Authors: Tianchen Deng, Yue Pan, Shenghai Yuan, Dong Li, Chen Wang, Mingrui Li, Long Chen, Lihua Xie, Danwei Wang, Jingchuan Wang, Javier Civera, Hesheng Wang, Weidong Chen

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[613] arXiv:2512.03216 (cross-list from physics.ins-det) [pdf, ps, other]: Title: Kaleidoscopic Scintillation Event Imaging

Authors: Alex Bocchieri, John Mamish, David Appleyard, Andreas Velten

Subjects: Instrumentation and Detectors (physics.ins-det); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[614] arXiv:2512.03173 (cross-list from cs.CY) [pdf, ps, other]: Title: Culture Affordance Atlas: Reconciling Object Diversity Through Functional Mapping

Authors: Joan Nwatu, Longju Bai, Oana Ignat, Rada Mihalcea

Journal-ref: AAAI 2026 Social Impact Track

Subjects: Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[615] arXiv:2512.03166 (cross-list from cs.RO) [pdf, ps, other]: Title: Multi-Agent Reinforcement Learning and Real-Time Decision-Making in Robotic Soccer for Virtual Environments

Authors: Aya Taourirte, Md Sohag Mia

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[616] arXiv:2512.03111 (cross-list from q-bio.GN) [pdf, ps, other]: Title: PanFoMa: A Lightweight Foundation Model and Benchmark for Pan-Cancer

Authors: Xiaoshui Huang, Tianlin Zhu, Yifan Zuo, Xue Xia, Zonghan Wu, Jiebin Yan, Dingli Hua, Zongyi Xu, Yuming Fang, Jian Zhang

Comments: Accepted by AAAI 2026

Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[617] arXiv:2512.03054 (cross-list from cs.LG) [pdf, ps, other]: Title: Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research

Authors: Ciro Benito Raggio, Lucia Migliorelli, Nils Skupien, Mathias Krohmer Zabaleta, Oliver Blanck, Francesco Cicone, Giuseppe Lucio Cascini, Paolo Zaffino, Maria Francesca Spadea

Comments: 22 pages, 13 figures

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Medical Physics (physics.med-ph)
[618] arXiv:2512.03052 (cross-list from cs.GR) [pdf, ps, other]: Title: LATTICE: Democratize High-Fidelity 3D Generation at Scale

Authors: Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Haolin Liu, Qingxiang Lin, Jingwei Huang, Chunchao Guo, Xiangyu Yue

Comments: Technical Report

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

Wed, 3 Dec 2025 (showing first 18 of 141 entries)

[619] arXiv:2512.03046 [pdf, ps, other]: Title: MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

Authors: Zichen Liu, Yue Yu, Hao Ouyang, Qiuyu Wang, Shuailei Ma, Ka Leong Cheng, Wen Wang, Qingyan Bai, Yuxuan Zhang, Yanhong Zeng, Yixuan Li, Xing Zhu, Yujun Shen, Qifeng Chen

Comments: Code and demo available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[620] arXiv:2512.03045 [pdf, ps, other]: Title: CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models

Authors: Minkyung Kwon, Jinhyeok Choi, Jiho Park, Seonghu Jeon, Jinhyuk Jang, Junyoung Seo, Minseop Kwak, Jin-Hwa Kim, Seungryong Kim

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[621] arXiv:2512.03043 [pdf, ps, other]: Title: OneThinker: All-in-one Reasoning Model for Image and Video

Authors: Kaituo Feng, Manyuan Zhang, Hongyu Li, Kaixuan Fan, Shuang Chen, Yilei Jiang, Dian Zheng, Peiwen Sun, Yiyuan Zhang, Haoze Sun, Yan Feng, Peng Pei, Xunliang Cai, Xiangyu Yue

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[622] arXiv:2512.03042 [pdf, ps, other]: Title: PPTArena: A Benchmark for Agentic PowerPoint Editing

Authors: Michael Ofengenden, Yunze Man, Ziqi Pang, Yu-Xiong Wang

Comments: Project webpage: this https URL GitHub: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[623] arXiv:2512.03041 [pdf, ps, other]: Title: MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Authors: Qinghe Wang, Xiaoyu Shi, Baolu Li, Weikang Bian, Quande Liu, Huchuan Lu, Xintao Wang, Pengfei Wan, Kun Gai, Xu Jia

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[624] arXiv:2512.03040 [pdf, ps, other]: Title: Video4Spatial: Towards Visuospatial Intelligence with Context-Guided Video Generation

Authors: Zeqi Xiao, Yiwei Zhao, Lingxiao Li, Yushi Lan, Yu Ning, Rahul Garg, Roshni Cooper, Mohammad H. Taghavi, Xingang Pan

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[625] arXiv:2512.03036 [pdf, ps, other]: Title: ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

Authors: Mengchen Zhang, Qi Chen, Tong Wu, Zihan Liu, Dahua Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[626] arXiv:2512.03034 [pdf, ps, other]: Title: MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation

Authors: Youxin Pang, Jiajun Liu, Lingfeng Tan, Yong Zhang, Feng Gao, Xiang Deng, Zhuoliang Kang, Xiaoming Wei, Yebin Liu

Comments: Our project website is this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[627] arXiv:2512.03020 [pdf, ps, other]: Title: Unrolled Networks are Conditional Probability Flows in MRI Reconstruction

Authors: Kehan Qi, Saumya Gupta, Qingqiao Hu, Weimin Lyu, Chao Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[628] arXiv:2512.03018 [pdf, ps, other]: Title: AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry

Authors: Xiang Xu, Pradeep Kumar Jayaraman, Joseph G. Lambourne, Yilin Liu, Durvesh Malpure, Pete Meltzer

Comments: Accepted to Siggraph Asia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[629] arXiv:2512.03014 [pdf, ps, other]: Title: Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks

Authors: Matthew Dutson, Nathan Labiosa, Yin Li, Mohit Gupta

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[630] arXiv:2512.03013 [pdf, ps, other]: Title: In-Context Sync-LoRA for Portrait Video Editing

Authors: Sagi Polaczek, Or Patashnik, Ali Mahdavi-Amiri, Daniel Cohen-Or

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[631] arXiv:2512.03010 [pdf, ps, other]: Title: SurfFill: Completion of LiDAR Point Clouds via Gaussian Surfel Splatting

Authors: Svenja Strobel, Matthias Innmann, Bernhard Egger, Marc Stamminger, Linus Franke

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)
[632] arXiv:2512.03004 [pdf, ps, other]: Title: DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images

Authors: Xiaoxue Chen, Ziyi Xiong, Yuantao Chen, Gen Li, Nan Wang, Hongcheng Luo, Long Chen, Haiyang Sun, Bing Wang, Guang Chen, Hangjun Ye, Hongyang Li, Ya-Qin Zhang, Hao Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[633] arXiv:2512.03000 [pdf, ps, other]: Title: DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling

Authors: Kairun Wen, Yuzhi Huang, Runyu Chen, Hui Zheng, Yunlong Lin, Panwang Pan, Chenxin Li, Wenyan Cong, Jian Zhang, Junbin Lu, Chenguo Lin, Dilin Wang, Zhicheng Yan, Hongyu Xu, Justin Theiss, Yue Huang, Xinghao Ding, Rakesh Ranjan, Zhiwen Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[634] arXiv:2512.02993 [pdf, ps, other]: Title: TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond

Authors: Yifei Zeng, Yajie Bao, Jiachen Qian, Shuang Wu, Youtian Lin, Hao Zhu, Buyu Li, Feihu Zhang, Xun Cao, Yao Yao

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[635] arXiv:2512.02991 [pdf, ps, other]: Title: GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection

Authors: Md Sohag Mia, Md Nahid Hasan, Tawhid Ahmed, Muhammad Abdullah Adnan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[636] arXiv:2512.02982 [pdf, ps, other]: Title: U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences

Authors: Xiang Xu, Ao Liang, Youquan Liu, Linfeng Li, Lingdong Kong, Ziwei Liu, Qingshan Liu

Comments: Preprint; 19 pages, 7 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
