Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 488

[ total of 759 entries: 1-100 | ... | 189-288 | 289-388 | 389-488 | 489-588 | 589-688 | 689-759 ]

Thu, 4 Dec 2025 (showing first 100 of 130 entries)

[489] arXiv:2512.04085 [pdf, ps, other]: Title: Unique Lives, Shared World: Learning from Single-Life Videos

Authors: Tengda Han, Sayna Ebrahimi, Dilara Gokay, Li Yang Ku, Maks Ovsjanikov, Iva Babukova, Daniel Zoran, Viorica Patraucean, Joao Carreira, Andrew Zisserman, Dima Damen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[490] arXiv:2512.04084 [pdf, ps, other]: Title: SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows

Authors: Qinyu Zhao, Guangting Zheng, Tao Yang, Rui Zhu, Xingjian Leng, Stephen Gould, Liang Zheng

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[491] arXiv:2512.04082 [pdf, ps, other]: Title: PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

Authors: Jiazhe Wei, Ken Li, Tianyu Lao, Haofan Wang, Liang Wang, Caifeng Shan, Chenyang Si

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[492] arXiv:2512.04069 [pdf, ps, other]: Title: SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL

Authors: Siyi Chen, Mikaela Angelina Uy, Chan Hee Song, Faisal Ladhak, Adithyavairavan Murali, Qing Qu, Stan Birchfield, Valts Blukis, Jonathan Tremblay

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[493] arXiv:2512.04048 [pdf, ps, other]: Title: Stable Signer: Hierarchical Sign Language Generative Model

Authors: Sen Fang, Yalin Feng, Hongbin Zhong, Yanxin Zhang, Dimitris N. Metaxas

Comments: 12 pages, 7 figures. More Demo at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY)
[494] arXiv:2512.04040 [pdf, ps, other]: Title: RELIC: Interactive Video World Model with Long-Horizon Memory

Authors: Yicong Hong, Yiqun Mei, Chongjian Ge, Yiran Xu, Yang Zhou, Sai Bi, Yannick Hold-Geoffroy, Mike Roberts, Matthew Fisher, Eli Shechtman, Kalyan Sunkavalli, Feng Liu, Zhengqi Li, Hao Tan

Comments: 22 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[495] arXiv:2512.04039 [pdf, ps, other]: Title: Fast & Efficient Normalizing Flows and Applications of Image Generative Models

Authors: Sandeep Nagar

Comments: PhD Thesis

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[496] arXiv:2512.04025 [pdf, ps, other]: Title: PSA: Pyramid Sparse Attention for Efficient Video Understanding and Generation

Authors: Xiaolong Li, Youping Gu, Xi Lin, Weijie Wang, Bohan Zhuang

Comments: Tech report

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[497] arXiv:2512.04021 [pdf, ps, other]: Title: C3G: Learning Compact 3D Representations with 2K Gaussians

Authors: Honggyu An, Jaewoo Jung, Mungyeom Kim, Sunghwan Hong, Chaehyun Kim, Kazumi Fukuda, Minkyeong Jeon, Jisang Han, Takuya Narihira, Hyuna Ko, Junsu Kim, Yuki Mitsufuji, Seungryong Kim

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[498] arXiv:2512.04019 [pdf, ps, other]: Title: Ultra-lightweight Neural Video Representation Compression

Authors: Ho Man Kwan, Tianhao Peng, Ge Gao, Fan Zhang, Mike Nilsson, Andrew Gower, David Bull

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[499] arXiv:2512.04015 [pdf, ps, other]: Title: Learning Group Actions In Disentangled Latent Image Representations

Authors: Farhana Hossain Swarnali, Miaomiao Zhang, Tonmoy Hossain

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[500] arXiv:2512.04012 [pdf, ps, other]: Title: Emergent Outlier View Rejection in Visual Geometry Grounded Transformers

Authors: Jisang Han, Sunghwan Hong, Jaewoo Jung, Wooseok Jang, Honggyu An, Qianqian Wang, Seungryong Kim, Chen Feng

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[501] arXiv:2512.04007 [pdf, ps, other]: Title: On the Temporality for Sketch Representation Learning

Authors: Marcelo Isaias de Moraes Junior, Moacir Antonelli Ponti

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[502] arXiv:2512.04000 [pdf, ps, other]: Title: Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding

Authors: Jialuo Li, Bin Li, Jiahao Li, Yan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[503] arXiv:2512.03996 [pdf, ps, other]: Title: Highly Efficient Test-Time Scaling for T2I Diffusion Models with Text Embedding Perturbation

Authors: Hang Xu, Linjiang Huang, Feng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[504] arXiv:2512.03992 [pdf, ps, other]: Title: DIQ-H: Evaluating Hallucination Persistence in VLMs Under Temporal Visual Degradation

Authors: Zexin Lin, Hawen Wan, Yebin Zhong, Xiaoqiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[505] arXiv:2512.03981 [pdf, ps, other]: Title: DirectDrag: High-Fidelity, Mask-Free, Prompt-Free Drag-based Image Editing via Readout-Guided Feature Alignment

Authors: Sheng-Hao Liao, Shang-Fu Chen, Tai-Ming Huang, Wen-Huang Cheng, Kai-Lung Hua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[506] arXiv:2512.03979 [pdf, ps, other]: Title: BlurDM: A Blur Diffusion Model for Image Deblurring

Authors: Jin-Ting He, Fu-Jen Tsai, Yan-Tsung Peng, Min-Hung Chen, Chia-Wen Lin, Yen-Yu Lin

Comments: NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[507] arXiv:2512.03964 [pdf, ps, other]: Title: Training for Identity, Inference for Controllability: A Unified Approach to Tuning-Free Face Personalization

Authors: Lianyu Pang, Ji Zhou, Qiping Wang, Baoquan Zhao, Zhenguo Yang, Qing Li, Xudong Mao

Comments: 17 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[508] arXiv:2512.03963 [pdf, ps, other]: Title: TempR1: Improving Temporal Understanding of MLLMs via Temporal-Aware Multi-Task Reinforcement Learning

Authors: Tao Wu, Li Yang, Gen Zhan, Yabin Zhang, Yiting Liao, Junlin Li, Deliang Fu, Li Zhang, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[509] arXiv:2512.03939 [pdf, ps, other]: Title: MUT3R: Motion-aware Updating Transformer for Dynamic 3D Reconstruction

Authors: Guole Shen, Tianchen Deng, Xingrui Qin, Nailin Wang, Jianyu Wang, Yanbo Wang, Yongtao Chen, Hesheng Wang, Jingchuan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[510] arXiv:2512.03932 [pdf, ps, other]: Title: Beyond the Ground Truth: Enhanced Supervision for Image Restoration

Authors: Donghun Ryou, Inju Ha, Sanghyeok Chu, Bohyung Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[511] arXiv:2512.03918 [pdf, ps, other]: Title: UniMo: Unifying 2D Video and 3D Human Motion with an Autoregressive Framework

Authors: Youxin Pang, Yong Zhang, Ruizhi Shao, Xiang Deng, Feng Gao, Xu Xiaoming, Xiaoming Wei, Yebin Liu

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[512] arXiv:2512.03905 [pdf, ps, other]: Title: Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence

Authors: Shuai Yang, Junxin Lin, Yifan Zhou, Ziwei Liu, Chen Change Loy

Comments: Code: this https URL, Project: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[513] arXiv:2512.03883 [pdf, ps, other]: Title: Dual Cross-Attention Siamese Transformer for Rectal Tumor Regrowth Assessment in Watch-and-Wait Endoscopy

Authors: Jorge Tapias Gomez, Despoina Kanata, Aneesh Rangnekar, Christina Lee, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini Veeraraghavan

Comments: 6 pages, 5 figures, 1 table, submitted to ISBI conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[514] arXiv:2512.03869 [pdf, ps, other]: Title: An Automated Framework for Large-Scale Graph-Based Cerebrovascular Analysis

Authors: Daniele Falcetta, Liane S. Canas, Lorenzo Suppa, Matteo Pentassuglia, Jon Cleary, Marc Modat, Sébastien Ourselin, Maria A. Zuluaga

Comments: Submitted to ISBI 2026. 6 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[515] arXiv:2512.03862 [pdf, ps, other]: Title: Diminishing Returns in Self-Supervised Learning

Authors: Oli Bridge, Huey Sun, Botond Branyicskai-Nagy, Charles D'Ornano, Shomit Basu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[516] arXiv:2512.03854 [pdf, ps, other]: Title: Prostate biopsy whole slide image dataset from an underrepresented Middle Eastern population

Authors: Peshawa J. Muhammad Ali, Navin Vincent, Saman S. Abdulla, Han N. Mohammed Fadhl, Anders Blilie, Kelvin Szolnoky, Julia Anna Mielcarz, Xiaoyi Ji, Kimmo Kartasalo, Abdulbasit K. Al-Talabani, Nita Mulliqi

Comments: 13 pages, 2 figures and 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[517] arXiv:2512.03852 [pdf, ps, other]: Title: Traffic Image Restoration under Adverse Weather via Frequency-Aware Mamba

Authors: Liwen Pan, Longguang Wang, Guangwei Gao, Jun Wang, Jun Shi, Juncheng Li

Comments: 12 pages, 13 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[518] arXiv:2512.03848 [pdf, ps, other]: Title: PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation

Authors: Hania Ghouse, Maryam Alsharqi, Farhad R. Nezami, Muzammil Behzad

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[519] arXiv:2512.03844 [pdf, ps, other]: Title: CoDA: From Text-to-Image Diffusion Models to Training-Free Dataset Distillation

Authors: Letian Zhou, Songhua Liu, Xinchao Wang

Comments: 34 pages, 24 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[520] arXiv:2512.03837 [pdf, ps, other]: Title: Heatmap Pooling Network for Action Recognition from RGB Videos

Authors: Mengyuan Liu, Jinfu Liu, Yongkang Jiang, Bin He

Comments: Final Version of IEEE Transactions on Pattern Analysis and Machine Intelligence

Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[521] arXiv:2512.03834 [pdf, ps, other]: Title: Lean Unet: A Compact Model for Image Segmentation

Authors: Ture Hassler, Ida Åkerholm, Marcus Nordström, Gabriele Balletti, Orcun Goksel

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[522] arXiv:2512.03827 [pdf, ps, other]: Title: A Robust Camera-based Method for Breath Rate Measurement

Authors: Alexey Protopopov

Comments: 9 pages, 4 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[523] arXiv:2512.03817 [pdf, ps, other]: Title: HieroGlyphTranslator: Automatic Recognition and Translation of Egyptian Hieroglyphs to English

Authors: Ahmed Nasser, Marwan Mohamed, Alaa Sherif, Basmala Mahmoud, Shereen Yehia, Asmaa Saad, Mariam S. El-Rahmany, Ensaf H. Mohamed

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[524] arXiv:2512.03796 [pdf, ps, other]: Title: LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling

Authors: Hong-Kai Zheng, Piji Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[525] arXiv:2512.03794 [pdf, ps, other]: Title: AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

Authors: Zichuan Lin, Yicheng Liu, Yang Yang, Lvfang Tao, Deheng Ye

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[526] arXiv:2512.03751 [pdf, ps, other]: Title: Research on Brain Tumor Classification Method Based on Improved ResNet34 Network

Authors: Yufeng Li, Wenchao Zhao, Bo Dang, Weimin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[527] arXiv:2512.03749 [pdf, ps, other]: Title: Fully Unsupervised Self-debiasing of Text-to-Image Diffusion Models

Authors: Korada Sri Vardhana, Shrikrishna Lolla, Soma Biswas

Comments: Accepted at WACV 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[528] arXiv:2512.03746 [pdf, ps, other]: Title: Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Authors: Zirun Guo, Minjie Hong, Feng Zhang, Kai Jia, Tao Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[529] arXiv:2512.03745 [pdf, ps, other]: Title: Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification

Authors: Jiaze Li, Yan Lu, Bin Liu, Guojun Yin, Mang Ye

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[530] arXiv:2512.03730 [pdf, ps, other]: Title: Out-of-the-box: Black-box Causal Attacks on Object Detectors

Authors: Melane Navaratnarajah, David A. Kelly, Hana Chockler

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[531] arXiv:2512.03724 [pdf, ps, other]: Title: PosA-VLA: Enhancing Action Generation via Pose-Conditioned Anchor Attention

Authors: Ziwen Li, Xin Wang, Hanlue Zhang, Runnan Chen, Runqi Lin, Xiao He, Han Huang, Yandong Guo, Fakhri Karray, Tongliang Liu, Mingming Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[532] arXiv:2512.03715 [pdf, ps, other]: Title: DINO-RotateMatch: A Rotation-Aware Deep Framework for Robust Image Matching in Large-Scale 3D Reconstruction

Authors: Kaichen Zhang, Tianxiang Sheng, Xuanming Shi

Comments: 9 pages, 5 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[533] arXiv:2512.03701 [pdf, ps, other]: Title: Structured Uncertainty Similarity Score (SUSS): Learning a Probabilistic, Interpretable, Perceptual Metric Between Images

Authors: Paula Seidler, Neill D. F. Campbell, Ivor J A Simpson

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[534] arXiv:2512.03687 [pdf, ps, other]: Title: Active Visual Perception: Opportunities and Challenges

Authors: Yian Li, Xiaoyu Guo, Hao Zhang, Shuiwang Li, Xiaowei Dai

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[535] arXiv:2512.03683 [pdf, ps, other]: Title: GaussianBlender: Instant Stylization of 3D Gaussians with Disentangled Latent Spaces

Authors: Melis Ocal, Xiaoyan Xing, Yue Li, Ngo Anh Vien, Sezer Karaoglu, Theo Gevers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[536] arXiv:2512.03673 [pdf, ps, other]: Title: ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers

Authors: Feice Huang, Zuliang Han, Xing Zhou, Yihuang Chen, Lifei Zhu, Haoqian Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[537] arXiv:2512.03667 [pdf, ps, other]: Title: Colon-X: Advancing Intelligent Colonoscopy from Multimodal Understanding to Clinical Reasoning

Authors: Ge-Peng Ji, Jingyi Liu, Deng-Ping Fan, Nick Barnes

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[538] arXiv:2512.03666 [pdf, ps, other]: Title: ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos

Authors: Qi'ao Xu, Tianwen Qian, Yuqian Fu, Kailing Li, Yang Jiao, Jiacheng Zhang, Xiaoling Wang, Liang He

Comments: 26 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[539] arXiv:2512.03663 [pdf, ps, other]: Title: Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Authors: Salim Khazem

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[540] arXiv:2512.03643 [pdf, ps, other]: Title: Optical Context Compression Is Just (Bad) Autoencoding

Authors: Ivan Yee Lee, Cheng Yang, Taylor Berg-Kirkpatrick

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[541] arXiv:2512.03640 [pdf, ps, other]: Title: MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms

Authors: Jiahao Zhang, Xiao Zhao, Guangyu Gao

Journal-ref: MultiMedia Modeling. MMM 2025. Lecture Notes in Computer Science, vol 15521. Springer, Singapore

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[542] arXiv:2512.03625 [pdf, ps, other]: Title: FeatureLens: A Highly Generalizable and Interpretable Framework for Detecting Adversarial Examples Based on Image Features

Authors: Zhigang Yang, Yuan Liu, Jiawei Zhang, Puning Zhang, Xinqiang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[543] arXiv:2512.03621 [pdf, ps, other]: Title: ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[544] arXiv:2512.03619 [pdf, ps, other]: Title: LAMP: Language-Assisted Motion Planning for Controllable Video Generation

Authors: Muhammed Burak Kizil, Enes Sanli, Niloy J. Mitra, Erkut Erdem, Aykut Erdem, Duygu Ceylan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[545] arXiv:2512.03601 [pdf, ps, other]: Title: Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Authors: Haoran Zhou, Gim Hee Lee

Comments: Accepted to NeurIPS 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[546] arXiv:2512.03598 [pdf, ps, other]: Title: Memory-Guided Point Cloud Completion for Dental Reconstruction

Authors: Jianan Sun, Yukang Huang, Dongzhihan Wang, Mingyu Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[547] arXiv:2512.03597 [pdf, ps, other]: Title: HBFormer: A Hybrid-Bridge Transformer for Microtumor and Miniature Organ Segmentation

Authors: Fuchen Zheng, Xinyi Chen, Weixuan Li, Quanjun Li, Junhua Zhou, Xiaojiao Guo, Xuhang Chen, Chi-Man Pun, Shoujun Zhou

Comments: 6 pages, 4 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[548] arXiv:2512.03593 [pdf, ps, other]: Title: CloseUpAvatar: High-Fidelity Animatable Full-Body Avatars with Mixture of Multi-Scale Textures

Authors: David Svitov, Pietro Morerio, Lourdes Agapito, Alessio Del Bue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[549] arXiv:2512.03592 [pdf, ps, other]: Title: Harnessing Hypergraphs in Geometric Deep Learning for 3D RNA Inverse Folding

Authors: Guang Yang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[550] arXiv:2512.03590 [pdf, ps, other]: Title: Beyond Boundary Frames: Audio-Visual Semantic Guidance for Context-Aware Video Interpolation

Authors: Yuchen Deng, Xiuyang Wu, Hai-Tao Zheng, Jie Wang, Feidiao Yang, Yuxing Han

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[551] arXiv:2512.03580 [pdf, ps, other]: Title: Dynamic Optical Test for Bot Identification (DOT-BI): A simple check to identify bots in surveys and online processes

Authors: Malte Bleeker, Mauro Gotsch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[552] arXiv:2512.03577 [pdf, ps, other]: Title: Cross-Stain Contrastive Learning for Paired Immunohistochemistry and Histopathology Slide Representation Learning

Authors: Yizhi Zhang, Lei Fan, Zhulin Tao, Donglin Di, Yang Song, Sidong Liu, Cong Cong

Comments: 6 pages, 2 figures. Camera-ready version accepted for IEEE BIBM 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[553] arXiv:2512.03575 [pdf, ps, other]: Title: UniComp: Rethinking Video Compression Through Informational Uniqueness

Authors: Chao Yuan, Shimin Chen, Minliang Lin, Limeng Qiao, Guanglu Wan, Lin Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[554] arXiv:2512.03574 [pdf, ps, other]: Title: Global-Local Aware Scene Text Editing

Authors: Fuxiang Yang, Tonghua Su, Donglin Di, Yin Chen, Xiangqian Wu, Zhongjie Wang, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[555] arXiv:2512.03566 [pdf, ps, other]: Title: GAOT: Generating Articulated Objects Through Text-Guided Diffusion Models

Authors: Hao Sun, Lei Fan, Donglin Di, Shaohui Liu

Comments: Accepted by ACM MM Asia 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[556] arXiv:2512.03558 [pdf, ps, other]: Title: CartoMapQA: A Fundamental Benchmark Dataset Evaluating Vision-Language Models on Cartographic Map Understanding

Authors: Huy Quang Ung, Guillaume Habault, Yasutaka Nishimura, Hao Niu, Roberto Legaspi, Tomoki Oya, Ryoichi Kojima, Masato Taya, Chihiro Ono, Atsunori Minamikawa, Yan Liu

Comments: Accepted at SIGSPATIAL 2025 (Best paper candidates), 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[557] arXiv:2512.03553 [pdf, ps, other]: Title: Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

Authors: Wei Chee Yew, Hailun Xu, Sanjay Saha, Xiaotian Fan, Hiok Hian Ong, David Yuchen Wang, Kanchan Sarkar, Zhenheng Yang, Danhui Guan

Comments: Accepted at KDD 2026

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[558] arXiv:2512.03542 [pdf, ps, other]: Title: V-ITI: Mitigating Hallucinations in Multimodal Large Language Models via Visual Inference-Time Intervention

Authors: Nan Sun, Zhenyu Zhang, Xixun Lin, Kun Wang, Yanmin Shang, Naibin Gu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang, Yanan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[559] arXiv:2512.03540 [pdf, ps, other]: Title: CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Authors: Ruoxuan Zhang, Bin Wen, Hongxia Xie, Yi Yao, Songhan Zuo, Jian-Yu Jiang-Lin, Hong-Han Shuai, Wen-Huang Cheng

Comments: Accepted by ACM Multimedia 2025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[560] arXiv:2512.03534 [pdf, ps, other]: Title: Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Authors: Subin Kim, Sangwoo Mo, Mamshad Nayeem Rizve, Yiran Xu, Difan Liu, Jinwoo Shin, Tobias Hinz

Comments: Visualizations are available at the website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[561] arXiv:2512.03532 [pdf, ps, other]: Title: OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

Authors: Zhishan Zhou, Siyuan Wei, Zengran Wang, Chunjie Wang, Xiaosheng Yan, Xiao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[562] arXiv:2512.03520 [pdf, ps, other]: Title: FloodDiffusion: Tailored Diffusion Forcing for Streaming Motion Generation

Authors: Yiyi Cai, Yuhan Wu, Kunhang Li, You Zhou, Bo Zheng, Haiyang Liu

Comments: 15 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[563] arXiv:2512.03510 [pdf, ps, other]: Title: CSMapping: Scalable Crowdsourced Semantic Mapping and Topology Inference for Autonomous Driving

Authors: Zhijian Qiao, Zehuan Yu, Tong Li, Chih-Chung Chou, Wenchao Ding, Shaojie Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[564] arXiv:2512.03509 [pdf, ps, other]: Title: AfroBeats Dance Movement Analysis Using Computer Vision: A Proof-of-Concept Framework Combining YOLO and Segment Anything Model

Authors: Kwaku Opoku-Ware, Gideon Opoku

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[565] arXiv:2512.03508 [pdf, ps, other]: Title: Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Authors: Seogkyu Jeon, Kibeom Hong, Hyeran Byun

Comments: ICCV 2025 (poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[566] arXiv:2512.03500 [pdf, ps, other]: Title: EEA: Exploration-Exploitation Agent for Long Video Understanding

Authors: Te Yang, Xiangyu Zhu, Bo Wang, Quan Chen, Peng Jiang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[567] arXiv:2512.03499 [pdf, ps, other]: Title: NAS-LoRA: Empowering Parameter-Efficient Fine-Tuning for Visual Foundation Models with Searchable Adaptation

Authors: Renqi Chen, Haoyang Su, Shixiang Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[568] arXiv:2512.03479 [pdf, ps, other]: Title: Towards Object-centric Understanding for Instructional Videos

Authors: Wenliang Guo, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[569] arXiv:2512.03477 [pdf, ps, other]: Title: Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis

Authors: Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang

Comments: 10 pages, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[570] arXiv:2512.03474 [pdf, ps, other]: Title: Procedural Mistake Detection via Action Effect Modeling

Authors: Wenliang Guo, Yujiang Pu, Yu Kong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[571] arXiv:2512.03470 [pdf, ps, other]: Title: Difference Decomposition Networks for Infrared Small Target Detection

Authors: Chen Hu, Mingyu Zhou, Shuai Yuan, Hongbo Hu, Xiangyu Qiu, Junhai Luo, Tian Pu, Xiyin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[572] arXiv:2512.03463 [pdf, ps, other]: Title: Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models

Authors: Shojiro Yamabe, Futa Waseda, Daiki Shiono, Tsubasa Takahashi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[573] arXiv:2512.03454 [pdf, ps, other]: Title: Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles

Authors: Haicheng Liao, Huanming Shen, Bonan Wang, Yongkang Li, Yihong Tang, Chengyue Wang, Dingyi Zhuang, Kehua Chen, Hai Yang, Chengzhong Xu, Zhenning Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[574] arXiv:2512.03453 [pdf, ps, other]: Title: GeoVideo: Introducing Geometric Regularization into Video Generation Model

Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[575] arXiv:2512.03451 [pdf, ps, other]: Title: GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers

Authors: Zhiye Song, Steve Dai, Ben Keller, Brucek Khailany

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[576] arXiv:2512.03450 [pdf, ps, other]: Title: KeyPointDiffuser: Unsupervised 3D Keypoint Learning via Latent Diffusion Models

Authors: Rhys Newbury, Juyan Zhang, Tin Tran, Hanna Kurniawati, Dana Kulić

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[577] arXiv:2512.03449 [src]: Title: LM-CartSeg: Automated Segmentation of Lateral and Medial Cartilage and Subchondral Bone for Radiomics Analysis

Authors: Tongxu Zhang

Comments: The manuscript represents only a preliminary and substantially incomplete exploration. The author has decided not to stand by these results, and a thoroughly revised and significantly different version will be developed separately. Therefore, this version is withdrawn and should not be cited.

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[578] arXiv:2512.03445 [pdf, ps, other]: Title: Multi-Aspect Knowledge-Enhanced Medical Vision-Language Pretraining with Multi-Agent Data Generation

Authors: Xieji Li, Siyuan Yan, Yingsheng Liu, H. Peter Soyer, Monika Janda, Victoria Mar, Zongyuan Ge

Comments: 10 pages. Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[579] arXiv:2512.03430 [pdf, ps, other]: Title: Label-Efficient Hyperspectral Image Classification via Spectral FiLM Modulation of Low-Level Pretrained Diffusion Features

Authors: Yuzhen Hu, Biplab Banerjee, Saurabh Prasad

Comments: Accepted to the ICML 2025 TerraBytes Workshop (June 9, 2025)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[580] arXiv:2512.03427 [pdf, ps, other]: Title: Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications

Authors: Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[581] arXiv:2512.03424 [pdf, ps, other]: Title: DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding

Authors: Bin Liu, Chunyang Wang, Xuelian Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[582] arXiv:2512.03418 [pdf, ps, other]: Title: YOLOA: Real-Time Affordance Detection via LLM Adapter

Authors: Yuqi Ji, Junjie Ke, Lihuo He, Jun Liu, Kaifan Zhang, Yu-Kun Lai, Guiguang Ding, Xinbo Gao

Comments: 13 pages, 9 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[583] arXiv:2512.03405 [pdf, ps, other]: Title: ViDiC: Video Difference Captioning

Authors: Jiangtao Wu, Shihao Li, Zhaozhou Bian, Jialu Chen, Runzhe Wen, An Ping, Yiwen He, Jiakai Wang, Yuanxing Zhang, Jiaheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[584] arXiv:2512.03404 [pdf, ps, other]: Title: MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification

Authors: Yujian Zhao, Hankun Liu, Guanglin Niu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[585] arXiv:2512.03370 [pdf, ps, other]: Title: ShelfGaussian: Shelf-Supervised Open-Vocabulary Gaussian-based 3D Scene Understanding

Authors: Lingjun Zhao, Yandong Luo, James Hay, Lu Gan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[586] arXiv:2512.03369 [pdf, ps, other]: Title: FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting

Authors: Nan Zhou, Huandong Wang, Jiahao Li, Han Li, Yali Song, Qiuhua Wang, Yong Li, Xinlei Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[587] arXiv:2512.03359 [pdf, ps, other]: Title: A Hybrid Deep Learning Framework with Explainable AI for Lung Cancer Classification with DenseNet169 and SVM

Authors: Md Rashidul Islam, Bakary Gibba, Altagi Abdallah Bakheit Abdelgadir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[588] arXiv:2512.03350 [pdf, ps, other]: Title: SeeU: Seeing the Unseen World via 4D Dynamics-aware Generation

Authors: Yu Yuan, Tharindu Wickremasinghe, Zeeshan Nadir, Xijun Wang, Yiheng Chi, Stanley H. Chan

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
