Computer Vision and Pattern Recognition
Authors and titles for recent submissions, skipping first 131
Tue, 9 Dec 2025 (showing first 250 of 259 entries)
- [132] arXiv:2512.07834 [pdf, ps, other]
-
Title: Voxify3D: Pixel Art Meets Volumetric Rendering
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [133] arXiv:2512.07833 [pdf, ps, other]
-
Title: Relational Visual Similarity
Authors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng Li
Comments: Project page, data, and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [134] arXiv:2512.07831 [pdf, ps, other]
-
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Authors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya Jia
Comments: Project Website this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [135] arXiv:2512.07829 [pdf, ps, other]
-
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [136] arXiv:2512.07826 [pdf, ps, other]
-
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
Authors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei Xie
Comments: 38 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [137] arXiv:2512.07821 [pdf, ps, other]
-
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion Modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [138] arXiv:2512.07807 [pdf, ps, other]
-
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale Scenes
Comments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [139] arXiv:2512.07806 [pdf, ps, other]
-
Title: Multi-view Pyramid Transformer: Look Coarser to See Broader
Comments: Project page: see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [140] arXiv:2512.07802 [pdf, ps, other]
-
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Authors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian Xie
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [141] arXiv:2512.07778 [pdf, ps, other]
-
Title: Distribution Matching Variational AutoEncoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [142] arXiv:2512.07776 [pdf, ps, other]
-
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring
Authors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de Melo
Comments: Accepted at WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [143] arXiv:2512.07760 [pdf, ps, other]
-
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-Identification
Comments: Accepted to AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [144] arXiv:2512.07756 [pdf, ps, other]
-
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [145] arXiv:2512.07747 [pdf, ps, other]
-
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and Generation
Authors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. Wong
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [146] arXiv:2512.07745 [pdf, ps, other]
-
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving
Authors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [147] arXiv:2512.07738 [pdf, ps, other]
-
Title: HLTCOE Evaluation Team at TREC 2025: VQA Track
Authors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van Durme
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [148] arXiv:2512.07733 [pdf, ps, other]
-
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [149] arXiv:2512.07730 [pdf, ps, other]
-
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [150] arXiv:2512.07729 [pdf, ps, other]
-
Title: Improving action classification with brain-inspired deep networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [151] arXiv:2512.07720 [pdf, ps, other]
-
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation
Authors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng Lin
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [152] arXiv:2512.07712 [pdf, ps, other]
-
Title: UnCageNet: Tracking and Pose Estimation of Caged Animal
Comments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, India
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [153] arXiv:2512.07703 [pdf, ps, other]
-
Title: PVeRA: Probabilistic Vector-Based Random Matrix Adaptation
Authors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios Christodoulidis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [154] arXiv:2512.07702 [pdf, ps, other]
-
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment
Comments: WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [155] arXiv:2512.07698 [pdf, ps, other]
-
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data Only
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [156] arXiv:2512.07674 [pdf, ps, other]
-
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations
Authors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge Cardoso
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [157] arXiv:2512.07668 [pdf, ps, other]
-
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [158] arXiv:2512.07661 [pdf, ps, other]
-
Title: Optimization-Guided Diffusion for Interactive Scene Generation
Authors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [159] arXiv:2512.07652 [pdf, ps, other]
-
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [160] arXiv:2512.07651 [pdf, ps, other]
-
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline Method
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [161] arXiv:2512.07628 [pdf, ps, other]
-
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
Authors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao Guo
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [162] arXiv:2512.07606 [pdf, ps, other]
-
Title: Decomposition Sampling for Efficient Region Annotations in Active Learning
Authors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina Breininger
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [163] arXiv:2512.07599 [pdf, ps, other]
-
Title: Online Segment Any 3D Thing as Instance Tracking
Comments: NeurIPS 2025, Code is at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [164] arXiv:2512.07596 [pdf, ps, other]
-
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery
Authors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long Bai
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [165] arXiv:2512.07590 [pdf, ps, other]
-
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [166] arXiv:2512.07584 [pdf, ps, other]
-
Title: LongCat-Image Technical Report
Authors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie Hu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [167] arXiv:2512.07580 [pdf, ps, other]
-
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs
Authors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua He
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [168] arXiv:2512.07568 [pdf, ps, other]
-
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
Authors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie Zheng
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [169] arXiv:2512.07564 [pdf, ps, other]
-
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
Comments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [170] arXiv:2512.07527 [pdf, ps, other]
-
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images
Authors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [171] arXiv:2512.07514 [pdf, ps, other]
-
Title: MeshRipple: Structured Autoregressive Generation of Artist-Meshes
Authors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [172] arXiv:2512.07504 [pdf, ps, other]
-
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
Comments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [173] arXiv:2512.07503 [pdf, ps, other]
-
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
Authors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [174] arXiv:2512.07500 [pdf, ps, other]
-
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [175] arXiv:2512.07498 [pdf, ps, other]
-
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior
Comments: 16 pages (including appendix)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [176] arXiv:2512.07480 [pdf, ps, other]
-
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [177] arXiv:2512.07469 [pdf, ps, other]
-
Title: Unified Video Editing with Temporal Reasoner
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [178] arXiv:2512.07426 [pdf, ps, other]
-
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processing
Authors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa Yousif
Comments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [179] arXiv:2512.07415 [pdf, ps, other]
-
Title: Data-driven Exploration of Mobility Interaction Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [180] arXiv:2512.07410 [pdf, ps, other]
-
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction Graphs
Authors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya Wang
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [181] arXiv:2512.07394 [pdf, ps, other]
-
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric Video
Comments: webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [182] arXiv:2512.07391 [pdf, ps, other]
-
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency Monitoring
Authors: Đorđe Nedeljković
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [183] arXiv:2512.07385 [pdf, ps, other]
-
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
Authors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang
Comments: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [184] arXiv:2512.07383 [pdf, ps, other]
-
Title: LogicCBMs: Logic-Enhanced Concept-Based Learning
Comments: 18 pages, 19 figures, WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [185] arXiv:2512.07381 [pdf, ps, other]
-
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic Objects
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [186] arXiv:2512.07379 [pdf, ps, other]
-
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency
Comments: 22 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [187] arXiv:2512.07360 [pdf, ps, other]
-
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic Segmentation
Comments: Accepted to WACV2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [188] arXiv:2512.07351 [pdf, ps, other]
-
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake Detection
Authors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami Azam
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
- [189] arXiv:2512.07348 [pdf, ps, other]
-
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition
Authors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei Zhang
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [190] arXiv:2512.07345 [pdf, ps, other]
-
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian Splatting
Comments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [191] arXiv:2512.07338 [pdf, ps, other]
-
Title: Generalized Referring Expression Segmentation on Aerial Photos
Comments: Submitted to IEEE J-STARS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [192] arXiv:2512.07331 [pdf, ps, other]
-
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision Transformers
Authors: Kanishk Awadhiya
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [193] arXiv:2512.07328 [pdf, ps, other]
-
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [194] arXiv:2512.07305 [pdf, ps, other]
-
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image Dataset
Authors: Tobias Abraham Haider
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [195] arXiv:2512.07302 [pdf, ps, other]
-
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [196] arXiv:2512.07276 [pdf, ps, other]
-
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
Comments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [197] arXiv:2512.07275 [pdf, ps, other]
- [198] arXiv:2512.07273 [pdf, ps, other]
-
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [199] arXiv:2512.07269 [pdf, ps, other]
-
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [200] arXiv:2512.07253 [pdf, ps, other]
-
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement
Comments: 18 pages, 8 figures, and 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [201] arXiv:2512.07251 [pdf, ps, other]
-
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast Enhancement
Authors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei Zhou
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [202] arXiv:2512.07247 [pdf, ps, other]
-
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing
Comments: 40 pages, 34 figures, 18 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [203] arXiv:2512.07245 [pdf, ps, other]
-
Title: Zero-Shot Textual Explanations via Translating Decision-Critical Features
Comments: 11+6 pages, 8 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [204] arXiv:2512.07241 [pdf, ps, other]
-
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network Architecture
Authors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul Islam
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [205] arXiv:2512.07237 [pdf, ps, other]
-
Title: Unified Camera Positional Encoding for Controlled Video Generation
Authors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei Cai
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [206] arXiv:2512.07234 [pdf, ps, other]
-
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [207] arXiv:2512.07230 [pdf, ps, other]
-
Title: STRinGS: Selective Text Refinement in Gaussian Splatting
Authors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand Tapaswi
Comments: Accepted to WACV 2026. Project Page, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [208] arXiv:2512.07229 [pdf, ps, other]
-
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category Discovery
Comments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [209] arXiv:2512.07228 [pdf, ps, other]
-
Title: Towards Robust Protective Perturbation against DeepFake Face Swapping
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [210] arXiv:2512.07215 [pdf, ps, other]
-
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [211] arXiv:2512.07211 [pdf, ps, other]
-
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point Clouds
Comments: 8 pages, 8 figures, 5 tables, ICCR 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [212] arXiv:2512.07206 [pdf, ps, other]
-
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CT
Authors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie Gong
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [213] arXiv:2512.07203 [pdf, ps, other]
-
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [214] arXiv:2512.07201 [pdf, ps, other]
-
Title: Understanding Diffusion Models via Code Execution
Authors: Cheng Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [215] arXiv:2512.07198 [pdf, ps, other]
-
Title: Generating Storytelling Images with Rich Chains-of-Reasoning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [216] arXiv:2512.07197 [pdf, ps, other]
-
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian Splatting
Comments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [217] arXiv:2512.07192 [pdf, ps, other]
-
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image Compression
Comments: 12 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [218] arXiv:2512.07191 [pdf, ps, other]
-
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field Correction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [219] arXiv:2512.07190 [pdf, ps, other]
-
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Authors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [220] arXiv:2512.07186 [pdf, ps, other]
-
Title: START: Spatial and Textual Learning for Chart Understanding
Comments: WACV2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [221] arXiv:2512.07171 [pdf, ps, other]
-
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image Restoration
Comments: 21 pages, 11 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [222] arXiv:2512.07170 [pdf, ps, other]
-
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [223] arXiv:2512.07166 [pdf, ps, other]
-
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [224] arXiv:2512.07165 [pdf, ps, other]
-
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [225] arXiv:2512.07155 [pdf, ps, other]
-
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented Metrics
Comments: Please visit our project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [226] arXiv:2512.07141 [pdf, ps, other]
-
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [227] arXiv:2512.07136 [pdf, ps, other]
-
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
Authors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang Xing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [228] arXiv:2512.07135 [pdf, ps, other]
-
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning
Authors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin Zhao
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [229] arXiv:2512.07128 [pdf, ps, other]
-
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [230] arXiv:2512.07126 [pdf, ps, other]
-
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [231] arXiv:2512.07110 [pdf, ps, other]
-
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [232] arXiv:2512.07107 [pdf, ps, other]
-
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [233] arXiv:2512.07078 [pdf, ps, other]
-
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [234] arXiv:2512.07076 [pdf, ps, other]
-
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [235] arXiv:2512.07065 [pdf, ps, other]
-
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [236] arXiv:2512.07062 [pdf, ps, other]
-
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [237] arXiv:2512.07052 [pdf, ps, other]
-
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Authors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo Tartaglione
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [238] arXiv:2512.07051 [pdf, ps, other]
-
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [239] arXiv:2512.07037 [pdf, ps, other]
-
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [240] arXiv:2512.07034 [pdf, ps, other]
-
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Authors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit Yeung
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [241] arXiv:2512.06981 [pdf, ps, other]
-
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [242] arXiv:2512.06949 [pdf, ps, other]
-
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [243] arXiv:2512.06921 [pdf, ps, other]
-
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Authors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen Lei
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [244] arXiv:2512.06905 [pdf, ps, other]
-
Title: Scaling Zero-Shot Reference-to-Video Generation
Authors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen He
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [245] arXiv:2512.06888 [pdf, ps, other]
-
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Authors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [246] arXiv:2512.06886 [pdf, ps, other]
-
Title: Balanced Learning for Domain Adaptive Semantic SegmentationComments: Accepted by International Conference on Machine Learning (ICML 2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [247] arXiv:2512.06885 [pdf, ps, other]
-
Title: JoPano: Unified Panorama Generation via Joint ModelingComments: Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [248] arXiv:2512.06882 [pdf, ps, other]
-
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian FusionComments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon requestSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [249] arXiv:2512.06877 [pdf, ps, other]
-
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene ClassificationComments: Accepted and presented in ICSPISSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [250] arXiv:2512.06870 [pdf, ps, other]
-
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding PerspectiveComments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [251] arXiv:2512.06866 [pdf, ps, other]
-
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe PriorComments: Accepted by NeurIPS 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [252] arXiv:2512.06865 [pdf, ps, other]
-
Title: Spatial Retrieval Augmented Autonomous DrivingAuthors: Xiaosong Jia, Chenhe Zhang, Yule Jiang, Songbur Wong, Zhiyuan Zhang, Chen Chen, Shaofeng Zhang, Xuanhe Zhou, Xue Yang, Junchi Yan, Yu-Gang JiangComments: Demo Page: this https URL with open sourced code, dataset, and checkpointsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [253] arXiv:2512.06864 [pdf, ps, other]
-
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-TrainingComments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [254] arXiv:2512.06862 [pdf, ps, other]
-
Title: Omni-Referring Image SegmentationAuthors: Qiancheng Zheng, Yunhang Shen, Gen Luo, Baiyang Song, Xing Sun, Xiaoshuai Sun, Yiyi Zhou, Rongrong JiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [255] arXiv:2512.06849 [pdf, ps, other]
-
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CTAuthors: Matan Atad, Alexander W. Marka, Lisa Steinhelfer, Anna Curto-Vilalta, Yannik Leonhardt, Sarah C. Foreman, Anna-Sophia Walburga Dietrich, Robert Graf, Alexandra S. Gersing, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke, Hendrik MöllerComments: In submissionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [256] arXiv:2512.06845 [pdf, ps, other]
-
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [257] arXiv:2512.06840 [pdf, ps, other]
-
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with EnsemblesComments: Accepted to WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [258] arXiv:2512.06838 [pdf, ps, other]
-
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded QueriesAuthors: Jiahao Wang, Zhongwei Jiang, Wenchao Sun, Jiaru Zhong, Haibao Yu, Yuner Zhang, Chenyang Lu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang WangComments: Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [259] arXiv:2512.06818 [pdf, ps, other]
-
Title: MeshSplatting: Differentiable Rendering with Opaque MeshesAuthors: Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. Lin, Marc Van Droogenbroeck, Andrea TagliasacchiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [260] arXiv:2512.06811 [pdf, ps, other]
-
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language ModelsComments: Accepted by AAAI 2026(Oral)Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
- [261] arXiv:2512.06810 [pdf, ps, other]
-
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement LearningAuthors: Yueqian Wang, Songxiang Liu, Disong Wang, Nuo Xu, Guanglu Wan, Huishuai Zhang, Dongyan ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [262] arXiv:2512.06802 [pdf, ps, other]
-
Title: VDOT: Efficient Unified Video Creation via Optimal Transport DistillationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [263] arXiv:2512.06793 [pdf, ps, other]
-
Title: Generalized Geometry Encoding Volume for Real-time Stereo MatchingComments: Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [264] arXiv:2512.06783 [pdf, ps, other]
-
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-VideosComments: 16 pages, 5 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [265] arXiv:2512.06774 [pdf, ps, other]
-
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian SplattingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [266] arXiv:2512.06769 [pdf, ps, other]
-
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial UnderstandingAuthors: Hang Yin, Xiaomin He, PeiWen Yuan, Yiwei Li, Jiayi Shi, Wenxiao Fan, Shaoxiong Feng, Kan LiSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [267] arXiv:2512.06763 [pdf, ps, other]
-
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control AlgorithmsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [268] arXiv:2512.06759 [pdf, ps, other]
-
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language PriorsComments: 12 pages, 13 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [269] arXiv:2512.06750 [pdf, ps, other]
-
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and EnhancementAuthors: Weiqi Li, Xuanyu Zhang, Bin Chen, Jingfen Xie, Yan Wang, Kexin Zhang, Junlin Li, Li Zhang, Jian Zhang, Shijie ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [270] arXiv:2512.06746 [pdf, ps, other]
-
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image DetectionAuthors: Ruoxin Chen, Jiahui Gao, Kaiqing Lin, Keyue Zhang, Yandan Zhao, Isabel Guan, Taiping Yao, Shouhong DingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [271] arXiv:2512.06738 [pdf, ps, other]
-
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain AdaptationComments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [272] arXiv:2512.06736 [pdf, ps, other]
-
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton DataSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [273] arXiv:2512.06726 [pdf, ps, other]
-
Title: The Role of Entropy in Visual Grounding: Analysis and OptimizationAuthors: Shuo Li, Jiajun Sun, Zhihao Zhang, Xiaoran Fan, Senjie Jin, Hui Li, Yuming Yang, Junjie Ye, Lixing Shen, Tao Ji, Tao Gui, Qi Zhang, Xuanjing HuangSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [274] arXiv:2512.06689 [pdf, ps, other]
-
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and SeparationComments: Accepted to ASRU 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
- [275] arXiv:2512.06684 [pdf, ps, other]
-
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron MicroscopySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [276] arXiv:2512.06674 [pdf, ps, other]
-
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative ModelsAuthors: Songping Wang, Rufan Qian, Yueming Lyu, Qinglong Liu, Linzhuang Zou, Jie Qin, Songhua Liu, Caifeng ShanSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [277] arXiv:2512.06673 [pdf, ps, other]
-
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and ReasoningAuthors: Shida Gao, Feng Xue, Xiangfeng Wang, Anlong Ming, Teng Long, Yihua Shao, Haozhe Wang, Zhaowen Lin, Wei Wang, Nicu SebeSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [278] arXiv:2512.06663 [pdf, ps, other]
-
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language TasksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [279] arXiv:2512.06662 [pdf, ps, other]
-
Title: Personalized Image Descriptions from Attention SequencesAuthors: Ruoyu Xue, Hieu Le, Jingyi Xu, Sounak Mondal, Abe Leite, Gregory Zelinsky, Minh Hoai, Dimitris SamarasComments: 10 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [280] arXiv:2512.06657 [pdf, ps, other]
-
Title: TextMamba: Scene Text Detector with MambaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [281] arXiv:2512.06642 [pdf, ps, other]
-
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-ResolutionAuthors: Achmad Ardani Prasha, Clavino Ourizqi Rachmadi, Muhamad Fauzan Ibnu Syahlan, Naufal Rahfi Anugerah, Nanda Garin Raditya, Putri Amelia, Sabrina Laila Mutiara, Hilman Syachr RamadhanComments: 21 pages, 7 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [282] arXiv:2512.06613 [pdf, ps, other]
-
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic ApproachAuthors: Yueying KeComments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course projectSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [283] arXiv:2512.06612 [pdf, ps, other]
-
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial TranscriptomicsComments: NeurIPS 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [284] arXiv:2512.06598 [pdf, ps, other]
-
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake ChamplainAuthors: Muhammad Adil, Patrick J. Clemins, Andrew W. Schroth, Panagiotis D. Oikonomou, Donna M. Rizzo, Peter D. F. Isles, Xiaohan Zhang, Kareem I. Hannoun, Scott Turnbull, Noah B. Beckage, Asim Zia, Safwan WshahComments: 23 pages, 15 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [285] arXiv:2512.06581 [pdf, ps, other]
-
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video UnderstandingAuthors: Yuhao Su, Anwesa Choudhuri, Zhongpai Gao, Benjamin Planche, Van Nguyen Nguyen, Meng Zheng, Yuhan Shen, Arun Innanje, Terrence Chen, Ehsan Elhamifar, Ziyan WuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [286] arXiv:2512.06575 [pdf, ps, other]
-
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability ModulesAuthors: Fariza DahesComments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LGSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [287] arXiv:2512.06565 [pdf, ps, other]
-
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose EstimationAuthors: Xiujin LiuComments: 1 figure, 2 tables, 14 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [288] arXiv:2512.06562 [pdf, ps, other]
-
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many IdentitiesAuthors: Dung Thuy Nguyen, Quang Nguyen, Preston K. Robinette, Eli Jiang, Taylor T. Johnson, Kevin LeachSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [289] arXiv:2512.06560 [pdf, ps, other]
-
Title: Bridging spatial awareness and global context in medical image segmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [290] arXiv:2512.06531 [pdf, ps, other]
-
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [291] arXiv:2512.06530 [pdf, ps, other]
-
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-GeneralizationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [292] arXiv:2512.06521 [pdf, ps, other]
-
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife ImagesAuthors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)Comments: 31 pages + appendixSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [293] arXiv:2512.06504 [pdf, ps, other]
-
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data FusionAuthors: Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana ZahorodniaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [294] arXiv:2512.06485 [pdf, ps, other]
-
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based InteractionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [295] arXiv:2512.06447 [pdf, ps, other]
-
Title: Towards Stable Cross-Domain Depression Recognition under Missing ModalitiesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [296] arXiv:2512.06438 [pdf, ps, other]
-
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head AvatarsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [297] arXiv:2512.06434 [pdf, ps, other]
-
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular ScreeningComments: 8 pages, 2 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [298] arXiv:2512.06426 [pdf, ps, other]
-
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range RecognitionComments: 12 pages, 9 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [299] arXiv:2512.06424 [pdf, ps, other]
-
Title: DragMesh: Interactive 3D Generation Made EasySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [300] arXiv:2512.06422 [pdf, ps, other]
-
Title: A Perception CNN for Facial Expression RecognitionComments: in IEEE Transactions on Image Processing (2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [301] arXiv:2512.06421 [pdf, ps, other]
-
Title: Rethinking Training Dynamics in Scale-wise Autoregressive GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [302] arXiv:2512.06400 [pdf, ps, other]
-
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene EnhancementComments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENTSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [303] arXiv:2512.06379 [pdf, ps, other]
- [304] arXiv:2512.06377 [pdf, ps, other]
- [305] arXiv:2512.06376 [pdf, ps, other]
-
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation FrameworkSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [306] arXiv:2512.06373 [pdf, ps, other]
-
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement LearningComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [307] arXiv:2512.06368 [pdf, ps, other]
-
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular VideosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [308] arXiv:2512.06363 [pdf, ps, other]
-
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack DetectionAuthors: Jiabao Guo, Yadian Wang, Hui Ma, Yuhao Fu, Ju Jia, Hui Liu, Shengeng Tang, Lechao Cheng, Yunfeng Diao, Ajian LiuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [309] arXiv:2512.06358 [pdf, ps, other]
-
Title: Rectifying Latent Space for Generative Single-Image Reflection RemovalSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [310] arXiv:2512.06353 [pdf, ps, other]
-
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision SearchAuthors: Kaicheng Yang, Kaisen Yang, Baiting Wu, Xun Zhang, Qianrui Yang, Haotong Qin, He Zhang, Yulun ZhangComments: Code and Supplementary Material could be found at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [311] arXiv:2512.06345 [pdf, ps, other]
-
Title: CLUENet: Cluster Attention Makes Neural Networks Have EyesComments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial IntelligenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [312] arXiv:2512.06344 [pdf, ps, other]
-
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low BitrateSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [313] arXiv:2512.06332 [pdf, ps, other]
-
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [314] arXiv:2512.06330 [pdf, ps, other]
-
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for PansharpeningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [315] arXiv:2512.06328 [pdf, ps, other]
-
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language ModelsComments: Accepted as an Oral presentation at AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [316] arXiv:2512.06306 [pdf, ps, other]
-
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose EstimationAuthors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xaoming Chen, Weidong CaiSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [317] arXiv:2512.06290 [pdf, ps, other]
-
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke ClassificationComments: 17 pages, 5 figuresJournal-ref: ICDAR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [318] arXiv:2512.06282 [pdf, ps, other]
-
Title: A Sleep Monitoring System Based on Audio, Video and Depth InformationComments: Accepted in the Computer Vision, Graphics and Image Processing (CVGIP 2013)Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [319] arXiv:2512.06281 [pdf, ps, other]
-
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language ModelsAuthors: Hengzhuang Li, Xinsong Zhang, Qiming Peng, Bin Luo, Han Hu, Dengyang Jiang, Han-Jia Ye, Teng Zhang, Hai JinSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [320] arXiv:2512.06276 [pdf, ps, other]
-
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression ComprehensionAuthors: Tianyi Gao, Hao Li, Han Fang, Xin Wei, Xiaodong Dong, Hongbo Sun, Ye Yuan, Zhongjiang He, Jinglin Xu, Jingmin Xin, Hao SunSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [321] arXiv:2512.06275 [pdf, ps, other]
-
Title: FacePhys: State of the Heart LearningAuthors: Kegang Wang, Jiankai Tang, Yuntao Wang, Xin Liu, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Daniel McDuffSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [322] arXiv:2512.06269 [pdf, ps, other]
- [323] arXiv:2512.06258 [pdf, ps, other]
-
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMsAuthors: Chaoyang Wang, Yangfan He, Yiyang Zhou, Yixuan Wang, Jiaqi Liu, Peng Xia, Zhengzhong Tu, Mohit Bansal, Huaxiu YaoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [324] arXiv:2512.06255 [pdf, ps, other]
-
Title: Language-driven Fine-grained RetrievalSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [325] arXiv:2512.06251 [pdf, ps, other]
-
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow NetworksAuthors: Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming ZhangComments: 12 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [326] arXiv:2512.06232 [pdf, ps, other]
-
Title: Opinion: Learning Intuitive Physics May Require More than Visual DataSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [327] arXiv:2512.06230 [pdf, ps, other]
-
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis TrackingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [328] arXiv:2512.06221 [pdf, ps, other]
-
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility StudyAuthors: Alena MakarovaComments: 15 pages, 13 figures. Reproducibility studySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [329] arXiv:2512.06206 [pdf, ps, other]
-
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated LearningAuthors: Akis Linardos, Sarthak Pati, Ujjwal Baid, Brandon Edwards, Patrick Foley, Kevin Ta, Verena Chung, Micah Sheller, Muhammad Irfan Khan, Mojtaba Jafaritadi, Elina Kontio, Suleiman Khan, Leon Mächler, Ivan Ezhov, Suprosanna Shit, Johannes C. Paetzold, Gustav Grimberg, Manuel A. Nickel, David Naccache, Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni, Daewoon Kim, Leonard L. Klausmann, Prashant Shah, Bjoern Menze, Dimitrios Makris, Spyridon BakasComments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URLJournal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [330] arXiv:2512.06190 [pdf, ps, other]
-
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food DryingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [331] arXiv:2512.06185 [pdf, ps, other]
-
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution FoolingComments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary materialSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [332] arXiv:2512.06179 [pdf, ps, other]
-
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light DirectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [333] arXiv:2512.06174 [pdf, ps, other]
-
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light DirectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [334] arXiv:2512.06171 [pdf, ps, other]
-
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect DetectionComments: 11 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [335] arXiv:2512.06158 [pdf, ps, other]
-
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model AnimationAuthors: Su Sun, Cheng Zhao, Himangi Mittal, Gaurav Mittal, Rohith Kukkala, Yingjie Victor Chen, Mei ChenComments: 15 pages, 11 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [336] arXiv:2512.06105 [pdf, ps, other]
-
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report GenerationAuthors: Junwen Zheng, Xinran Xu, Li Rong Wang, Chang Cai, Lucinda Siyun Tan, Dingyuan Wang, Hong Liang Tey, Xiuyi FanComments: AAAI-26-AIASubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [337] arXiv:2512.06103 [pdf, ps, other]
-
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack DetectionComments: Accepted in IEEE T-BIOMSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [338] arXiv:2512.06096 [pdf, ps, other]
-
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous DrivingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [339] arXiv:2512.06080 [pdf, ps, other]
-
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce LightAuthors: Tzofi Klinghoffer, Siddharth Somasundaram, Xiaoyu Xiang, Yuchen Fan, Christian Richardt, Akshat Dave, Ramesh Raskar, Rakesh RanjanComments: SIGGRAPH Asia 2025. Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [340] arXiv:2512.06065 [pdf, ps, other]
-
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video EditingAuthors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi MenapaceComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [341] arXiv:2512.06058 [pdf, ps, other]
-
Title: Representation Learning for Point Cloud UnderstandingAuthors: Siming YanComments: 181 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [342] arXiv:2512.06032 [pdf, ps, other]
-
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [343] arXiv:2512.06024 [pdf, ps, other]
-
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensingSubjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
- [344] arXiv:2512.06020 [pdf, ps, other]
-
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image GenerationComments: Project Page: https://prefgen.github.io/Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [345] arXiv:2512.06014 [pdf, ps, other]
-
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 DatasetsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [346] arXiv:2512.06013 [pdf, ps, other]
-
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViTSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [347] arXiv:2512.06012 [pdf, ps, other]
-
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive ManufacturingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [348] arXiv:2512.06010 [pdf, other]
-
Title: Fast and Flexible Robustness Certificates for Semantic SegmentationAuthors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [349] arXiv:2512.06006 [pdf, ps, other]
-
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow OptimizationAuthors: Xuefei (Julie) Wang, Kai A. Horstmann, Ethan Lin, Jonathan Chen, Alexander R. Farhang, Sophia Stiles, Atharva Sehgal, Jonathan Light, David Van Valen, Yisong Yue, Jennifer J. SunSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [350] arXiv:2512.06003 [pdf, ps, other]
-
Title: PrunedCaps: A Case For Primary Capsules DiscriminationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [351] arXiv:2512.05996 [pdf, ps, other]
-
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and CountingComments: 18 pages, under reviewSubjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
- [352] arXiv:2512.05993 [pdf, ps, other]
-
Title: Domain-Specific Foundation Model Improves AI-Based Analysis of NeuropathologyAuthors: Ruchika Verma, Shrishtee Kandoi, Robina Afzal, Shengjia Chen, Jannes Jegminat, Michael W. Karlovich, Melissa Umphlett, Timothy E. Richardson, Kevin Clare, Quazi Hossain, Jorge Samanamud, Phyllis L. Faust, Elan D. Louis, Ann C. McKee, Thor D. Stein, Jonathan D. Cherry, Jesse Mez, Anya C. McGoldrick, Dalilah D. Quintana Mora, Melissa J. Nirenberg, Ruth H. Walker, Yolfrankcis Mendez, Susan Morgello, Dennis W. Dickson, Melissa E. Murray, Carlos Cordon-Cardo, Nadejda M. Tsankova, Jamie M. Walker, Diana K. Dangoor, Stephanie McQuillan, Emma L. Thorn, Claudia De Sanctis, Shuying Li, Thomas J. Fuchs, Kurt Farrell, John F. Crary, Gabriele CampanellaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [353] arXiv:2512.05991 [pdf, ps, other]
-
Title: EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking HeadAuthors: Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng HuangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [354] arXiv:2512.05988 [pdf, ps, other]
-
Title: VG3T: Visual Geometry Grounded Gaussian TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [355] arXiv:2512.05987 [pdf, ps, other]
-
Title: Adaptive Dataset Quantization: A New Direction for Dataset PruningComments: Accepted by ICCPR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [356] arXiv:2512.05969 [pdf, ps, other]
-
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven's MatricesAuthors: Hokin DengSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [357] arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
-
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMsSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [358] arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
-
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray SegmentationAuthors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Sharipov Hotam Beknazarovich, Farzona S. Ataeva, Qurbonaliev Alisher, Yuanjie ZhengSubjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [359] arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics FrameworkAuthors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie ZhengSubjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [360] arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
-
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning ModelsSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [361] arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spacesAuthors: Nikita GabdullinComments: 9 pages, 5 figures, 1 table, 4 equationsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [362] arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
-
Title: Human Geometry Distribution for 3D Animation GenerationSubjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [363] arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
-
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World ModelsComments: 23 pages, 8 figures, 3 tablesSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
- [364] arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language ModelsSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [365] arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness LikelihoodComments: Accepted to WACV 2026Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [366] arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
-
Title: A Geometric Unification of Concept Learning with Concept ConesComments: 22 pagesSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [367] arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Affine Subspace Models and Clustering for Patch-Based Image DenoisingComments: Asilomar Conference on Signals, Systems, and Computers 2025Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [368] arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty MetricsAuthors: Tianyi Ren, Daniel Low, Pittra Jaengprajak, Juampablo Heras Rivera, Jacob Ruzevick, Mehmet KurtSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [369] arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
-
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem SolversSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [370] arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket SearchComments: This work plans to be submitted to the IEEE for possible publicationSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [371] arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
-
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal ReasoningAuthors: Nithin Sivakumaran, Justin Chih-Yao Chen, David Wan, Yue Zhang, Jaehong Yoon, Elias Stengel-Eskin, Mohit BansalComments: Code: this https URLSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [372] arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous DrivingAuthors: Zebin Xing, Yupeng Zheng, Qichao Zhang, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin ZhaoSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [373] arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep AnalysisSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [374] arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
-
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme PatientsSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [375] arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
-
Title: VideoVLA: Video Generators Can Be Generalizable Robot ManipulatorsAuthors: Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining GuoComments: Project page: this https URLJournal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems(NeurIPS2025)Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [376] arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR ChallengeComments: 2025 NeurIPS Behavior Challenge 1st place solutionSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [377] arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Dynamic Visual SLAM using a General 3D PriorComments: 8 pagesSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [378] arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
-
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge DevicesComments: 9Pages, 3 figure, Politeknik Negeri BanyuwangiSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [379] arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
-
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice AssociationComments: FAME 2026 Technical ReportSubjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
- [380] arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step DynamicsComments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-conceptSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [381] arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG DataSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)