Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 279

[ total of 778 entries: 1-50 | ... | 130-179 | 180-229 | 230-279 | 280-329 | 330-379 | 380-429 | 430-479 | ... | 730-778 ]
[ showing 50 entries per page: fewer | more | all ]

Thu, 4 Dec 2025 (continued, showing 50 of 130 entries)

[280] arXiv:2512.03663 [pdf, ps, other]: Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Authors: Salim Khazem

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[281] arXiv:2512.03643 [pdf, ps, other]: Title: Optical Context Compression Is Just (Bad) Autoencoding

Authors: Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[282] arXiv:2512.03640 [pdf, ps, other]: Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms

Authors: Jiahao Zhang, Xiao Zhao, Guangyu Gao

Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[283] arXiv:2512.03625 [pdf, ps, other]: Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Authors: Zhigang Yang, Yuan Liu, Jiawei Zhang, Puning Zhang, Xinqiang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284] arXiv:2512.03621 [pdf, ps, other]: Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285] arXiv:2512.03619 [pdf, ps, other]: Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[286] arXiv:2512.03601 [pdf, ps, other]: Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Authors: Haoran Zhou, Gim Hee Lee

Comments: Accepted to NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[287] arXiv:2512.03598 [pdf, ps, other]: Title: Memory-Guided Point Cloud Completion for Dental Reconstruction

Authors: Jianan Sun, Yukang Huang, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288] arXiv:2512.03597 [pdf, ps, other]: Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation

Authors: Fuchen Zheng, Xinyi Chen, Weixuan Li, Quanjun Li, Junhua Zhou, Xiaojiao Guo, Xuhang Chen, Chi-Man Pun, Shoujun Zhou

Comments: 6 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289] arXiv:2512.03593 [pdf, ps, other]: Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures

Authors: David Svitov, Pietro Morerio, Lourdes Agapito, Alessio Del Bue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290] arXiv:2512.03592 [pdf, ps, other]: Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Authors: Guang Yang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291] arXiv:2512.03590 [pdf, ps, other]: Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation

Authors: Yuchen Deng, Xiuyang Wu, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292] arXiv:2512.03580 [pdf, ps, other]: Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Authors: Malte Bleeker, Mauro Gotsch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[293] arXiv:2512.03577 [pdf, ps, other]: Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Authors: Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294] arXiv:2512.03575 [pdf, ps, other]: Title: UniComp: Rethinking Video Compression Through Informational Uniqueness

Authors: Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295] arXiv:2512.03574 [pdf, ps, other]: Title: Global-Local Aware Scene Text Editing

Authors: Fuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296] arXiv:2512.03566 [pdf, ps, other]: Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Authors: Hao Sun, Lei Fan, Donglin Di, Shaohui Liu

Comments: Accepted by ACM MM Asia2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[297] arXiv:2512.03558 [pdf, ps, other]: Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding

Authors: Huy Quang Ung, Guillaume Habault, Yasutaka Nishimura, Hao Niu, Roberto Legaspi, Tomoki Oya, Ryoichi Kojima, Masato Taya, Chihiro Ono, Atsunori Minamikawa, Yan Liu

Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[298] arXiv:2512.03553 [pdf, ps, other]: Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Authors: Wei Chee Yew, Hailun Xu, Sanjay Saha, Xiaotian Fan, Hiok Hian Ong, David Yuchen Wang, Kanchan Sarkar, Zhenheng Yang, Danhui Guan

Comments: Accepted at KDD 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[299] arXiv:2512.03542 [pdf, ps, other]: Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention

Authors: Nan Sun, Zhenyu Zhang, Xixun Lin, Kun Wang, Yanmin Shang, Naibin Gu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang, Yanan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[300] arXiv:2512.03540 [pdf, ps, other]: Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Authors: Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng

Comments: Accepted by ACM Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[301] arXiv:2512.03534 [pdf, ps, other]: Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Authors: Subin Kim, Sangwoo Mo, Mamshad Nayeem Rizve, Yiran Xu, Difan Liu, Jinwoo Shin, Tobias Hinz

Comments: Visualizations are available at the website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[302] arXiv:2512.03532 [pdf, ps, other]: Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

Authors: Zhishan Zhou, Siyuan Wei, Zengran Wang, Chunjie Wang, Xiaosheng Yan, Xiao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303] arXiv:2512.03520 [pdf, ps, other]: Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

Authors: Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304] arXiv:2512.03510 [pdf, ps, other]: Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving

Authors: Zhijian Qiao, Zehuan Yu, Tong Li, Chih-Chung Chou, Wenchao Ding, Shaojie Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[305] arXiv:2512.03509 [pdf, ps, other]: Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model

Authors: Kwaku Opoku-Ware, Gideon Opoku

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306] arXiv:2512.03508 [pdf, ps, other]: Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Authors: Seogkyu Jeon, Kibeom Hong, Hyeran Byun

Comments: ICCV 2025 (poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2512.03500 [pdf, ps, other]: Title: EEA: Exploration-Exploitation Agent for Long Video Understanding

Authors: Te Yang, Xiangyu Zhu, Bo Wang, Quan Chen, Peng Jiang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308] arXiv:2512.03499 [pdf, ps, other]: Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

Authors: Renqi Chen, Haoyang Su, Shixiang Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[309] arXiv:2512.03479 [pdf, ps, other]: Title: Towards Object-centric Understanding for Instructional Videos

Authors: Wenliang Guo, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310] arXiv:2512.03477 [pdf, ps, other]: Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

Authors: Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Comments: 10 pages, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[311] arXiv:2512.03474 [pdf, ps, other]: Title: Procedural Mistake Detection via Action Effect Modeling

Authors: Wenliang Guo, Yujiang Pu, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2512.03470 [pdf, ps, other]: Title: Difference Decomposition Networks for Infrared Small Target Detection

Authors: Chen Hu, Mingyu Zhou, Shuai Yuan, Hongbo Hu, Xiangyu Qiu, Junhai Luo, Tian Pu, Xiyin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2512.03463 [pdf, ps, other]: Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models

Authors: Shojiro Yamabe, Futa Waseda, Daiki Shiono, Tsubasa Takahashi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[314] arXiv:2512.03454 [pdf, ps, other]: Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

Authors: Haicheng Liao, Huanming Shen, Bonan Wang, Yongkang Li, Yihong Tang, Chengyue Wang, Dingyi Zhuang, Kehua Chen, Hai Yang, Chengzhong Xu, Zhenning Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[315] arXiv:2512.03453 [pdf, ps, other]: Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model

Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316] arXiv:2512.03451 [pdf, ps, other]: Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Authors: Zhiye Song, Steve Dai, Ben Keller, Brucek Khailany

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[317] arXiv:2512.03450 [pdf, ps, other]: Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[318] arXiv:2512.03449 [src]: Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis

Authors: Tongxu Zhang

Comments: The manuscript represents only a preliminary and substantially incompleted exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore this version is withdrawn and should not be cited

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[319] arXiv:2512.03445 [pdf, ps, other]: Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation

Authors: Xieji Li, Siyuan Yan, Yingsheng Liu, H. Peter Soyer, Monika Janda, Victoria Mar, Zongyuan Ge

Comments: 10 pages. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320] arXiv:2512.03430 [pdf, ps, other]: Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features

Authors: Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321] arXiv:2512.03427 [pdf, ps, other]: Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2512.03424 [pdf, ps, other]: Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding

Authors: Bin Liu, Chunyang Wang, Xuelian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323] arXiv:2512.03418 [pdf, ps, other]: Title: YOLOA: Real-Time Affordance Detection via LLM Adapter

Authors: Yuqi Ji, Junjie Ke, Lihuo He, Jun Liu, Kaifan Zhang, Yu-Kun Lai, Guiguang Ding, Xinbo Gao

Comments: 13 pages, 9 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[324] arXiv:2512.03405 [pdf, ps, other]: Title: ViDiC: Video Difference Captioning

Authors: Jiangtao Wu, Shihao Li, Zhaozhou Bian, Jialu Chen, Runzhe Wen, An Ping, Yiwen He, Jiakai Wang, Yuanxing Zhang, Jiaheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325] arXiv:2512.03404 [pdf, ps, other]: Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification

Authors: Yujian Zhao, Hankun Liu, Guanglin Niu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2512.03370 [pdf, ps, other]: Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

Authors: Lingjun Zhao, Yandong Luo, James Hay, Lu Gan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[327] arXiv:2512.03369 [pdf, ps, other]: Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting

Authors: Nan Zhou, Huandong Wang, Jiahao Li, Han Li, Yali Song, Qiuhua Wang, Yong Li, Xinlei Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[328] arXiv:2512.03359 [pdf, ps, other]: Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM

Authors: Md Rashidul Islam, Bakary Gibba, Altagi Abdallah Bakheit Abdelgadir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329] arXiv:2512.03350 [pdf, ps, other]: Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Authors: Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)

[ total of 778 entries: 1-50 | ... | 130-179 | 180-229 | 230-279 | 280-329 | 330-379 | 380-429 | 430-479 | ... | 730-778 ]
[ showing 50 entries per page: fewer | more | all ]

Disable MathJax (What is MathJax?)

Links to: arXiv, form interface, find, cs, new, 2512, contact, help (Access key information)

> cs > cs.CV

Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 279

Thu, 4 Dec 2025 (continued, showing 50 of 130 entries)