Computer Vision and Pattern Recognition
Authors and titles for recent submissions
Total of 749 entries
Wed, 10 Dec 2025
- [1] arXiv:2512.08931
  Title: Astra: General Interactive World Model with Autoregressive Denoising
  Comments: Code is available at: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [2] arXiv:2512.08930
  Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment
  Authors: Youming Deng, Songyou Peng, Junyi Zhang, Kathryn Heal, Tiancheng Sun, John Flynn, Steve Marschner, Lucy Chai
  Comments: Project Page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [3] arXiv:2512.08924
  Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
  Authors: Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, Ignacio Rocco, Liliane Momeni, Junyu Xie, Shuyang Sun, Rahul Sukthankar, Joëlle K Barral, Raia Hadsell, Zoubin Ghahramani, Andrew Zisserman, Junlin Zhang, Mehdi SM Sajjadi
  Comments: Project Page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [4] arXiv:2512.08922
  Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration
  Authors: Jin Hyeon Kim, Paul Hyunbin Cho, Claire Kim, Jaewon Min, Jaeeun Lee, Jihye Park, Yeji Choi, Seungryong Kim
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [5] arXiv:2512.08912
  Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception
  Comments: Preprint. 12 pages, 9 figures. Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [6] arXiv:2512.08905
  Title: Self-Evolving 3D Scene Generation from a Single Image
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [7] arXiv:2512.08897
  Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [8] arXiv:2512.08889
  Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers
  Comments: Project webpage: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [9] arXiv:2512.08888
  Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [10] arXiv:2512.08881
  Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [11] arXiv:2512.08873
  Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning
  Comments: 6 pages
  Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
- [12] arXiv:2512.08860
  Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference
  Authors: Amit Bendkhale
  Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [13] arXiv:2512.08854
  Title: Generation is Required for Data-Efficient Perception
  Comments: Preprint
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [14] arXiv:2512.08829
  Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
  Comments: 16 pages, 8 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [15] arXiv:2512.08820
  Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
  Authors: Yi Zhang, Chun-Wun Cheng, Junyi He, Ke Yu, Yushun Tang, Carola-Bibiane Schönlieb, Zhihai He, Angelica I. Aviles-Rivero
  Comments: Accepted in IEEE Transactions on Multimedia (TMM)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [16] arXiv:2512.08789
  Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
  Comments: 10 pages, 7 figures, 5 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [17] arXiv:2512.08785
  Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [18] arXiv:2512.08774
  Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps
  Comments: 10 pages, 9 figures, 7 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [19] arXiv:2512.08765
  Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
  Authors: Ruihang Chu, Yefei He, Zhekai Chen, Shiwei Zhang, Xiaogang Xu, Bin Xia, Dingdong Wang, Hongwei Yi, Xihui Liu, Hengshuang Zhao, Yu Liu, Yingya Zhang, Yujiu Yang
  Comments: NeurIPS 2025. Code and data available at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [20] arXiv:2512.08751
  Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
- [21] arXiv:2512.08747
  Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation
  Comments: 20 pages, 8 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [22] arXiv:2512.08738
  Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture
  Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [23] arXiv:2512.08733
  Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
  Authors: Kuniko Paxton, Zeinab Dehghani, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [24] arXiv:2512.08730
  Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [25] arXiv:2512.08700
  Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth
  Authors: Kyumin Hwang, Wonhyeok Choi, Kiljoon Han, Wonjoon Choi, Minwoo Choi, Yongcheon Na, Minwoo Park, Sunghoon Im
  Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [26] arXiv:2512.08697
  Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [27] arXiv:2512.08673
  Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
  Comments: 16 pages, 6 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [28] arXiv:2512.08648
  Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
  Authors: Shaofeng Zhang, Xuanqi Chen, Ning Liao, Haoxiang Zhao, Xiaoxing Wang, Haoru Tan, Sitong Wu, Xiaosong Jia, Qi Fan, Junchi Yan
  Comments: 19 pages, 19 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [29] arXiv:2512.08647
  Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition
  Authors: Keito Inoshita
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [30] arXiv:2512.08645
  Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation
  Comments: 19 pages, 13 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [31] arXiv:2512.08639
  Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
  Comments: Under Review, 12 pages, 9 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [32] arXiv:2512.08627
  Title: Trajectory Densification and Depth from Perspective-based Blur
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [33] arXiv:2512.08625
  Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics
  Comments: 8 pages, 4 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [34] arXiv:2512.08606
  Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
  Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI2026, poster)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [35] arXiv:2512.08589
  Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images
  Authors: Swarn Singh Warshaneyan, Maksims Ivanovs, Blaž Cugmas, Inese Bērziņa, Laura Goldberga, Mindaugas Tamosiunas, Roberts Kadiķis
  Comments: 08 pages, 10 figures, 04 tables, 20 references. Date of Conference: 13-14 June 2025; Date Added to IEEE Xplore: 10 July 2025; Electronic ISBN: 979-8-3315-0969-9; Print on Demand (PoD) ISBN: 979-8-3315-0970-5; DOI: 10.1109/AICCONF64766.2025.11064260; Conference Location: Prague, Czech Republic; Online Access: this https URL
  Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [36] arXiv:2512.08577
  Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
- [37] arXiv:2512.08572
  Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis
  Authors: Olle Edgren Schüllerqvist, Jens Baumann, Joakim Lindblad, Love Nordling, Artur Mezheyeuski, Patrick Micke, Nataša Sladoje
  Comments: 5 pages, 3 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [38] arXiv:2512.08569
  Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [39] arXiv:2512.08564
  Title: Modular Neural Image Signal Processing
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [40] arXiv:2512.08560
  Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
  Authors: Navve Wasserman, Matias Cosarinsky, Yuval Golbari, Aude Oliva, Antonio Torralba, Tamar Rott Shaham, Michal Irani
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [41] arXiv:2512.08557
  Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds
  Comments: 22 pages, 26 figures. This work has been submitted to the IEEE Sensors Journal for possible publication
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [42] arXiv:2512.08547
  Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [43] arXiv:2512.08542
  Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
- [44] arXiv:2512.08537
  Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [45] arXiv:2512.08535
  Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [46] arXiv:2512.08534
  Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation
  Comments: 14 pages
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [47] arXiv:2512.08529
  Title: MVP: Multiple View Prediction Improves GUI Grounding
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [48] arXiv:2512.08524
  Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization
  Authors: Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas, Muhammad Rafsan Kabir, Robin Krambroeckers, Sifat Momen, Nabeel Mohammed, Shafin Rahman
  Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [49] arXiv:2512.08511
  Title: Thinking with Images via Self-Calling Agent
  Comments: Code will be released soon at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [50] arXiv:2512.08506
  Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [51] arXiv:2512.08505
  Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [52] arXiv:2512.08503
  Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [53] arXiv:2512.08498
  Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs
  Authors: Yijia Guo, Tong Hu, Zhiwei Li, Liwen Hu, Keming Qian, Xitong Lin, Shengbo Chen, Tiejun Huang, Lei Ma
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [54] arXiv:2512.08486
  Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions
  Comments: Code is available at: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [55] arXiv:2512.08478
  Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
  Authors: Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng, Xiao Sun, Zhihang Zhong
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
- [56] arXiv:2512.08477
  Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention
  Authors: Huiguo He, Pengyu Yan, Ziqi Yi, Weizhi Zhong, Zheng Liu, Yejun Tang, Huan Yang, Kun Gai, Guanbin Li, Lianwen Jin
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [57] arXiv:2512.08467
  Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
  Comments: 8 pages, 5 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [58] arXiv:2512.08445
  Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [59] arXiv:2512.08441
  Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [60] arXiv:2512.08439
  Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [61] arXiv:2512.08430
  Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking
  Comments: Accepted to WACV 2026. Preprint version
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [62] arXiv:2512.08410
  Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval
  Authors: Tao Chen, Shaobo Ju, Qiong Wu, Chenxin Fang, Kun Zhang, Jun Peng, Hui Li, Yiyi Zhou, Rongrong Ji
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [63] arXiv:2512.08406
  Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [64] arXiv:2512.08400
  Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
  Comments: Accepted for publication at the Northern Lights Deep Learning (NLDL) Conference 2025
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [65] arXiv:2512.08397
  Title: Detection of Digital Facial Retouching utilizing Face Beauty Information
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [66] arXiv:2512.08378
  Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions
  Comments: Accepted and published in IEEE Transactions on Instrumentation and Measurement
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [67] arXiv:2512.08374
  Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
  Authors: Bozhou Li, Xinda Xue, Sihan Yang, Yang Shi, Xinlong Chen, Yushuo Guan, Yuanxing Zhang, Wentao Zhang
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [68] arXiv:2512.08362
  Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation
  Comments: Accepted for main track at MobieSec 2024 (not published in the proceedings)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [69] arXiv:2512.08358
  Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
  Authors: Jiahao Lu, Weitao Xiong, Jiacheng Deng, Peng Li, Tianyu Huang, Zhiyang Dou, Cheng Lin, Sai-Kit Yeung, Yuan Liu
  Comments: Accepted by NeurIPS 2025. Project Page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [70] arXiv:2512.08337
  Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [71] arXiv:2512.08334
  Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [72] arXiv:2512.08331
  Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [73] arXiv:2512.08330
  Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models
  Comments: Accepted by IJCNN 2025
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [74] arXiv:2512.08329
  Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
  Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [75] arXiv:2512.08327
  Title: Low Rank Support Quaternion Matrix Machine
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
- [76] arXiv:2512.08325
  Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [77] arXiv:2512.08323
  Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
  Authors: Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik, Oussama Smaoui, Firas Bouzguenda, Sergi Pujades, Niels van Nistelrooij, Shankeeth Vinayahalingam, Kaibo Shi, Hairong Jin, Youyi Zheng, Tibor Kubík, Oldřich Kodym, Petr Šilling, Kateřina Trávníčková, Tomáš Mojžiš, Jan Matula, Jeffry Hartanto, Xiaoying Zhu, Kim-Ngan Nguyen, Tudor Dascalu, Huikai Wu, Weijie Liu, Shaojie Zhuang, Guangshun Wei, Yuanfeng Zhou
  Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [78] arXiv:2512.08317
  Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [79] arXiv:2512.08309
  Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation
  Authors: Alexander Goslin
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
- [80] arXiv:2512.08294
  Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
  Authors: Yexin Liu, Manyuan Zhang, Yueze Wang, Hongyu Li, Dian Zheng, Weiming Zhang, Changsheng Lu, Xunliang Cai, Yan Feng, Peng Pei, Harry Yang
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [81] arXiv:2512.08282
  Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
- [82] arXiv:2512.08269
  Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
  Comments: 21 pages, project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [83] arXiv:2512.08262
  Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [84] arXiv:2512.08254
  Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors
  Comments: 10 pages, 13 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [85] arXiv:2512.08253
  Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [86] arXiv:2512.08247
  Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
  Comments: AAAI-26
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [87] arXiv:2512.08243
  Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
  Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
  Comments: 26 pages, 10 figures, 4 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [88] arXiv:2512.08240
  Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
  Authors: Jusheng Zhang, Xiaoyang Guo, Kaitong Cai, Qinhan Lv, Yijia Fan, Wenhao Chai, Jian Wang, Keze Wang
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [89] arXiv:2512.08237
  Title: FastBEV++: Fast by Algorithm, Deployable by Design
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [90] arXiv:2512.08229
  Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [91] arXiv:2512.08228
  Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
  Authors: Jusheng Zhang, Kaitong Cai, Xiaoyang Guo, Sidi Liu, Qinhan Lv, Ruiqi Chen, Jing Yang, Yijia Fan, Xiaofei Sun, Jian Wang, Ziliang Chen, Liang Lin, Keze Wang
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [92] arXiv:2512.08227
  Title: New VVC profiles targeting Feature Coding for Machines
  Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [93] arXiv:2512.08223
  Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [94] arXiv:2512.08221
  Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
  Comments: 16 pages, 12 figures, 7 tables. Under review
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [95] arXiv:2512.08215
  Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [96] arXiv:2512.08198
  Title: Animal Re-Identification on Microcontrollers
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [97] arXiv:2512.08180
  Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [98] arXiv:2512.08163
  Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
  Comments: 22 pages, 12 figures, 1 table
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [99] arXiv:2512.08161
  Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [100] arXiv:2512.08135
  Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
  Comments: Accepted to WACV 2026
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [101] arXiv:2512.08075
  Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [102] arXiv:2512.08048
  Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
  Authors: Chandler Timm C. Doloriel
  Comments: Ongoing work
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [103] arXiv:2512.08042
  Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
  Authors: Chandler Timm C. Doloriel, Habib Ullah, Kristian Hovde Liland, Fadi Al Machot, Ngai-Man Cheung
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [104] arXiv:2512.08040
  Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [105] arXiv:2512.08038
  Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
  Authors: Elifnur Sunger, Tales Imbiriba, Peter Campbell, Deniz Erdogmus, Stratis Ioannidis, Jennifer Dy
  Comments: 20 pages, 16 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [106] arXiv:2512.08016
  Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
  Authors: Jiyoon Pyo, Yuankun Jiao, Dongwon Jung, Zekun Li, Leeje Jang, Sofia Kirsanova, Jina Kim, Yijun Lin, Qin Liu, Junyi Xie, Hadi Askari, Nan Xu, Muhao Chen, Yao-Yi Chiang
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [107] arXiv:2512.07984
  Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
  Comments: 13 pages, 7 figures, 3 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [108] arXiv:2512.07951
  Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
  Authors: Zekai Luo, Zongze Du, Zhouhang Zhu, Hao Zhong, Muzhi Zhu, Wen Wang, Yuling Xi, Chenchen Jing, Hao Chen, Chunhua Shen
  Comments: Project webpage: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [109] arXiv:2512.07925
  Title: Near-real time fires detection using satellite imagery in Sudan conflict
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [110] arXiv:2512.07838
  Title: Detection of Cyberbullying in GIF using AI
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
- [111] arXiv:2512.08715 (cross-list from cs.PF)
  Title: Multi-domain performance analysis with scores tailored to user preferences
  Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [112] arXiv:2512.08629 (cross-list from cs.AI)
  Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
  Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
- [113] arXiv:2512.08545 (cross-list from cs.CL)
  Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
  Comments: 22 pages, 2 tables, 9 figures
  Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
- [114] arXiv:2512.08500 (cross-list from cs.GR)
  Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
  Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [115] arXiv:2512.08360 (cross-list from cs.NE)
  Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
  Authors: Ali Sakour
  Comments: 13 pages, 5 figures. Code available at: this https URL
  Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [116] arXiv:2512.08284 (cross-list from physics.geo-ph)
  Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion
  Comments: Submitted to GEOPHYSICS
  Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
- [117] arXiv:2512.08271 (cross-list from cs.RO)
  Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic Teleoperation
  Comments: Published and presented at the 3rd Workshop on Human-Centric Multilateral Teleoperation at ICRA 2025
  Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [118] arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [119] arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World ModelAuthors: Wenjiang Xu, Cindy Wang, Rui Fang, Mingkang Zhang, Lusong Li, Jing Xu, Jiayuan Gu, Zecui Zeng, Rui ChenComments: Website at this https URLSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [120] arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]
-
Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric FeaturesSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [121] arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]
-
Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion ModelsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [122] arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]
-
Title: FlowSteer: Conditioning Flow Field for Consistent Image RestorationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [123] arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]
-
Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data RecognitionSubjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
- [124] arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]
-
Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent SpaceSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [125] arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]
-
Title: DIJIT: A Robotic Head for an Active ObserverAuthors: Mostafa Kamali Tabrizi, Mingshi Chi, Bir Bikram Dey, Yu Qing Yuan, Markus D. Solbach, Yiqian Liu, Michael Jenkin, John K. TsotsosSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [126] arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]
-
Title: CIP-Net: Continual Interpretable Prototype-based NetworkSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [127] arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]
-
Title: VLD: Visual Language Goal Distance for Reinforcement Learning NavigationSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [128] arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear OptimizationComments: 8 pages, submitted for reviewSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [129] arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]
-
Title: GSPN-2: Efficient Parallel Sequence ModelingAuthors: Hongjun Wang, Yitong Jiang, Collin McCarthy, David Wehr, Hanrong Ye, Xinhao Li, Ka Chun Cheung, Wonmin Byeon, Jinwei Gu, Ke Chen, Kai Han, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Sifei LiuComments: NeurIPS 2025Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [130] arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]
-
Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer ModelSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [131] arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]
-
Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin AlgorithmComments: Submitted to Magnetic Resonance in MedicineSubjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)
Tue, 9 Dec 2025
- [132] arXiv:2512.07834 [pdf, ps, other]
-
Title: Voxify3D: Pixel Art Meets Volumetric RenderingComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [133] arXiv:2512.07833 [pdf, ps, other]
-
Title: Relational Visual SimilarityAuthors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng LiComments: Project page, data, and code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [134] arXiv:2512.07831 [pdf, ps, other]
-
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video GenerationAuthors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya JiaComments: Project Website this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [135] arXiv:2512.07829 [pdf, ps, other]
-
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [136] arXiv:2512.07826 [pdf, ps, other]
-
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video EditingAuthors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei XieComments: 38 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [137] arXiv:2512.07821 [pdf, ps, other]
-
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion ModelingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [138] arXiv:2512.07807 [pdf, ps, other]
-
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale ScenesComments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [139] arXiv:2512.07806 [pdf, ps, other]
-
Title: Multi-view Pyramid Transformer: Look Coarser to See BroaderComments: Project page: see this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [140] arXiv:2512.07802 [pdf, ps, other]
-
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive MemoryAuthors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian XieComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [141] arXiv:2512.07778 [pdf, ps, other]
-
Title: Distribution Matching Variational AutoEncoderSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [142] arXiv:2512.07776 [pdf, ps, other]
-
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population MonitoringAuthors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de MeloComments: Accepted at WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [143] arXiv:2512.07760 [pdf, ps, other]
-
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-IdentificationComments: Accepted to AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [144] arXiv:2512.07756 [pdf, ps, other]
-
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound ReconstructionSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [145] arXiv:2512.07747 [pdf, ps, other]
-
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and GenerationAuthors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. WongSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [146] arXiv:2512.07745 [pdf, ps, other]
-
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous DrivingAuthors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [147] arXiv:2512.07738 [pdf, ps, other]
-
Title: HLTCOE Evaluation Team at TREC 2025: VQA TrackAuthors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van DurmeComments: 7 pages, 1 figureSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [148] arXiv:2512.07733 [pdf, ps, other]
-
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental ImagerySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [149] arXiv:2512.07730 [pdf, ps, other]
-
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object HallucinationComments: WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [150] arXiv:2512.07729 [pdf, ps, other]
-
Title: Improving action classification with brain-inspired deep networksSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [151] arXiv:2512.07720 [pdf, ps, other]
-
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar CreationAuthors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng LinComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [152] arXiv:2512.07712 [pdf, ps, other]
-
Title: UnCageNet: Tracking and Pose Estimation of Caged AnimalComments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, IndiaSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [153] arXiv:2512.07703 [pdf, ps, other]
-
Title: PVeRA: Probabilistic Vector-Based Random Matrix AdaptationAuthors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios ChristodoulidisSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [154] arXiv:2512.07702 [pdf, ps, other]
-
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image AlignmentComments: WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [155] arXiv:2512.07698 [pdf, ps, other]
-
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data OnlySubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [156] arXiv:2512.07674 [pdf, ps, other]
-
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast RepresentationsAuthors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge CardosoSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [157] arXiv:2512.07668 [pdf, ps, other]
-
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and DatasetSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [158] arXiv:2512.07661 [pdf, ps, other]
-
Title: Optimization-Guided Diffusion for Interactive Scene GenerationAuthors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang LiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [159] arXiv:2512.07652 [pdf, ps, other]
-
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific ResearchSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [160] arXiv:2512.07651 [pdf, ps, other]
-
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline MethodSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [161] arXiv:2512.07628 [pdf, ps, other]
-
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D GenerationAuthors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao GuoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [162] arXiv:2512.07606 [pdf, ps, other]
-
Title: Decomposition Sampling for Efficient Region Annotations in Active LearningAuthors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina BreiningerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [163] arXiv:2512.07599 [pdf, ps, other]
-
Title: Online Segment Any 3D Thing as Instance TrackingComments: NeurIPS 2025, Code is at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [164] arXiv:2512.07596 [pdf, ps, other]
-
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic SurgeryAuthors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long BaiComments: Technical ReportSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [165] arXiv:2512.07590 [pdf, ps, other]
-
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [166] arXiv:2512.07584 [pdf, ps, other]
-
Title: LongCat-Image Technical ReportAuthors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie HuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [167] arXiv:2512.07580 [pdf, ps, other]
-
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMsAuthors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua HeSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [168] arXiv:2512.07568 [pdf, ps, other]
-
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic DecorrelationAuthors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie ZhengSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [169] arXiv:2512.07564 [pdf, ps, other]
-
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language ModelsComments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [170] arXiv:2512.07527 [pdf, ps, other]
-
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite ImagesAuthors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan ChenSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [171] arXiv:2512.07514 [pdf, ps, other]
-
Title: MeshRipple: Structured Autoregressive Generation of Artist-MeshesAuthors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei YangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [172] arXiv:2512.07504 [pdf, ps, other]
-
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing PointsComments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [173] arXiv:2512.07503 [pdf, ps, other]
-
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image GenerationAuthors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui LiuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [174] arXiv:2512.07500 [pdf, ps, other]
-
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [175] arXiv:2512.07498 [pdf, ps, other]
-
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral PriorComments: 16 pages (including appendix)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [176] arXiv:2512.07480 [pdf, ps, other]
-
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal GuidanceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [177] arXiv:2512.07469 [pdf, ps, other]
-
Title: Unified Video Editing with Temporal ReasonerComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [178] arXiv:2512.07426 [pdf, ps, other]
-
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processingAuthors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa YousifComments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [179] arXiv:2512.07415 [pdf, ps, other]
-
Title: Data-driven Exploration of Mobility Interaction PatternsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [180] arXiv:2512.07410 [pdf, ps, other]
-
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction GraphsAuthors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya WangComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [181] arXiv:2512.07394 [pdf, ps, other]
-
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric VideoComments: webpage: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [182] arXiv:2512.07391 [pdf, ps, other]
-
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency MonitoringAuthors: Đorđe NedeljkovićSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [183] arXiv:2512.07385 [pdf, ps, other]
-
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New BaselineAuthors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng WangComments: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [184] arXiv:2512.07383 [pdf, ps, other]
-
Title: LogicCBMs: Logic-Enhanced Concept-Based LearningComments: 18 pages, 19 figures, WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [185] arXiv:2512.07381 [pdf, ps, other]
-
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic ObjectsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [186] arXiv:2512.07379 [pdf, ps, other]
-
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and EfficiencyComments: 22 pages, 16 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [187] arXiv:2512.07360 [pdf, ps, other]
-
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic SegmentationComments: Accepted to WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [188] arXiv:2512.07351 [pdf, ps, other]
-
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake DetectionAuthors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami AzamSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
- [189] arXiv:2512.07348 [pdf, ps, other]
-
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image CompositionAuthors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei ZhangComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [190] arXiv:2512.07345 [pdf, ps, other]
-
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian SplattingComments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [191] arXiv:2512.07338 [pdf, ps, other]
-
Title: Generalized Referring Expression Segmentation on Aerial PhotosComments: Submitted to IEEE J-STARSSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [192] arXiv:2512.07331 [pdf, ps, other]
-
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision TransformersAuthors: Kanishk AwadhiyaSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [193] arXiv:2512.07328 [pdf, ps, other]
-
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [194] arXiv:2512.07305 [pdf, ps, other]
-
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image DatasetAuthors: Tobias Abraham HaiderSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [195] arXiv:2512.07302 [pdf, ps, other]
-
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task PromptsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [196] arXiv:2512.07276 [pdf, ps, other]
-
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial ImageryComments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [197] arXiv:2512.07275 [pdf, ps, other]
- [198] arXiv:2512.07273 [pdf, ps, other]
-
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language TranslationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [199] arXiv:2512.07269 [pdf, ps, other]
-
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth dataSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [200] arXiv:2512.07253 [pdf, ps, other]
-
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video EnhancementComments: 18 pages, 8 figures, and 7 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [201] arXiv:2512.07251 [pdf, ps, other]
-
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast EnhancementAuthors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei ZhouSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [202] arXiv:2512.07247 [pdf, ps, other]
-
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven EditingComments: 40 pages, 34 figures, 18 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [203] arXiv:2512.07245 [pdf, ps, other]
-
Title: Zero-Shot Textual Explanations via Translating Decision-Critical FeaturesComments: 11+6 pages, 8 figures, 4 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [204] arXiv:2512.07241 [pdf, ps, other]
-
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network ArchitectureAuthors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul IslamSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [205] arXiv:2512.07237 [pdf, ps, other]
-
Title: Unified Camera Positional Encoding for Controlled Video GenerationAuthors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei CaiComments: Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [206] arXiv:2512.07234 [pdf, ps, other]
-
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [207] arXiv:2512.07230 [pdf, ps, other]
-
Title: STRinGS: Selective Text Refinement in Gaussian SplattingAuthors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand TapaswiComments: Accepted to WACV 2026. Project Page, see this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [208] arXiv:2512.07229 [pdf, ps, other]
-
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category DiscoveryComments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [209] arXiv:2512.07228 [pdf, ps, other]
-
Title: Towards Robust Protective Perturbation against DeepFake Face SwappingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [210] arXiv:2512.07215 [pdf, ps, other]
-
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose EstimationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [211] arXiv:2512.07211 [pdf, ps, other]
-
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point CloudsComments: 8 pages, 8 figures, 5 tables, ICCR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [212] arXiv:2512.07206 [pdf, ps, other]
-
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CTAuthors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie GongSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [213] arXiv:2512.07203 [pdf, ps, other]
-
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent ReasoningComments: 7 pages, 1 figureSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [214] arXiv:2512.07201 [pdf, ps, other]
-
Title: Understanding Diffusion Models via Code ExecutionAuthors: Cheng YuSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [215] arXiv:2512.07198 [pdf, ps, other]
-
Title: Generating Storytelling Images with Rich Chains-of-ReasoningSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [216] arXiv:2512.07197 [pdf, ps, other]
-
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian SplattingComments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [217] arXiv:2512.07192 [pdf, ps, other]
-
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image CompressionComments: 12 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [218] arXiv:2512.07191 [pdf, ps, other]
-
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field CorrectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [219] arXiv:2512.07190 [pdf, ps, other]
-
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image ClassificationAuthors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. ChenSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [220] arXiv:2512.07186 [pdf, ps, other]
-
Title: START: Spatial and Textual Learning for Chart UnderstandingComments: WACV 2026 Camera ReadySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [221] arXiv:2512.07171 [pdf, ps, other]
-
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image RestorationComments: 21 pages, 11 figures, 5 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [222] arXiv:2512.07170 [pdf, ps, other]
-
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [223] arXiv:2512.07166 [pdf, ps, other]
-
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM EditingComments: 9 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [224] arXiv:2512.07165 [pdf, ps, other]
-
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale AdaptationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [225] arXiv:2512.07155 [pdf, ps, other]
-
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented MetricsComments: Please visit our project page at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [226] arXiv:2512.07141 [pdf, ps, other]
-
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [227] arXiv:2512.07136 [pdf, ps, other]
-
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and ReasoningAuthors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang XingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [228] arXiv:2512.07135 [pdf, ps, other]
-
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement LearningAuthors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [229] arXiv:2512.07128 [pdf, ps, other]
-
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIPSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [230] arXiv:2512.07126 [pdf, ps, other]
-
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-OnComments: 16 pages, 8 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [231] arXiv:2512.07110 [pdf, ps, other]
-
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [232] arXiv:2512.07107 [pdf, ps, other]
-
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D SupervisionComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [233] arXiv:2512.07078 [pdf, ps, other]
-
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object DetectionComments: 16 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [234] arXiv:2512.07076 [pdf, ps, other]
-
Title: Context-measure: Contextualizing Metric for CamouflageComments: Technical ReportSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [235] arXiv:2512.07065 [pdf, ps, other]
-
Title: Persistent Homology-Guided Frequency Filtering for Image CompressionComments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compressionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [236] arXiv:2512.07062 [pdf, ps, other]
-
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense PredictionSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [237] arXiv:2512.07052 [pdf, ps, other]
-
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian SplattingAuthors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo TartaglioneSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [238] arXiv:2512.07051 [pdf, ps, other]
-
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image SegmentationComments: 11 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [239] arXiv:2512.07037 [pdf, ps, other]
-
Title: Evaluating and Preserving High-level Fidelity in Super-ResolutionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [240] arXiv:2512.07034 [pdf, ps, other]
-
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent CuesAuthors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit YeungComments: Accepted to WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [241] arXiv:2512.06981 [pdf, ps, other]
-
Title: Selective Masking based Self-Supervised Learning for Image Semantic SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [242] arXiv:2512.06949 [pdf, ps, other]
-
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin HistologyComments: 19 pages, 5 figures, 2 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [243] arXiv:2512.06921 [pdf, ps, other]
-
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy IdentificationAuthors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen LeiComments: Accepted by IEEE ICIA 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [244] arXiv:2512.06905 [pdf, ps, other]
-
Title: Scaling Zero-Shot Reference-to-Video GenerationAuthors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen HeComments: Website: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [245] arXiv:2512.06888 [pdf, ps, other]
-
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration EstimationAuthors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael WanSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [246] arXiv:2512.06886 [pdf, ps, other]
-
Title: Balanced Learning for Domain Adaptive Semantic SegmentationComments: Accepted by International Conference on Machine Learning (ICML 2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [247] arXiv:2512.06885 [pdf, ps, other]
-
Title: JoPano: Unified Panorama Generation via Joint ModelingComments: Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [248] arXiv:2512.06882 [pdf, ps, other]
-
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian FusionComments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon requestSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [249] arXiv:2512.06877 [pdf, ps, other]
-
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene ClassificationComments: Accepted and presented in ICSPISSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [250] arXiv:2512.06870 [pdf, ps, other]
-
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding PerspectiveComments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [251] arXiv:2512.06866 [pdf, ps, other]
-
Title: Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe PriorComments: Accepted by NeurIPS 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [252] arXiv:2512.06865 [pdf, ps, other]
-
Title: Spatial Retrieval Augmented Autonomous DrivingAuthors: Xiaosong Jia, Chenhe Zhang, Yule Jiang, Songbur Wong, Zhiyuan Zhang, Chen Chen, Shaofeng Zhang, Xuanhe Zhou, Xue Yang, Junchi Yan, Yu-Gang JiangComments: Demo Page: this https URL with open sourced code, dataset, and checkpointsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [253] arXiv:2512.06864 [pdf, ps, other]
-
Title: Boosting Unsupervised Video Instance Segmentation with Automatic Quality-Guided Self-TrainingComments: Accepted to WACV 2026. arXiv admin note: substantial text overlap with arXiv:2508.19808Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [254] arXiv:2512.06862 [pdf, ps, other]
-
Title: Omni-Referring Image SegmentationAuthors: Qiancheng Zheng, Yunhang Shen, Gen Luo, Baiyang Song, Xing Sun, Xiaoshuai Sun, Yiyi Zhou, Rongrong JiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [255] arXiv:2512.06849 [pdf, ps, other]
-
Title: Hide-and-Seek Attribution: Weakly Supervised Segmentation of Vertebral Metastases in CTAuthors: Matan Atad, Alexander W. Marka, Lisa Steinhelfer, Anna Curto-Vilalta, Yannik Leonhardt, Sarah C. Foreman, Anna-Sophia Walburga Dietrich, Robert Graf, Alexandra S. Gersing, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke, Hendrik MöllerComments: In submissionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [256] arXiv:2512.06845 [pdf, ps, other]
-
Title: Pseudo Anomalies Are All You Need: Diffusion-Based Generation for Weakly-Supervised Video Anomaly DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [257] arXiv:2512.06840 [pdf, ps, other]
-
Title: CADE: Continual Weakly-supervised Video Anomaly Detection with EnsemblesComments: Accepted to WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [258] arXiv:2512.06838 [pdf, ps, other]
-
Title: SparseCoop: Cooperative Perception with Kinematic-Grounded QueriesAuthors: Jiahao Wang, Zhongwei Jiang, Wenchao Sun, Jiaru Zhong, Haibao Yu, Yuner Zhang, Chenyang Lu, Chuang Zhang, Lei He, Shaobing Xu, Jianqiang WangComments: Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [259] arXiv:2512.06818 [pdf, ps, other]
-
Title: MeshSplatting: Differentiable Rendering with Opaque MeshesAuthors: Jan Held, Sanghyun Son, Renaud Vandeghen, Daniel Rebain, Matheus Gadelha, Yi Zhou, Anthony Cioppa, Ming C. Lin, Marc Van Droogenbroeck, Andrea TagliasacchiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [260] arXiv:2512.06811 [pdf, ps, other]
-
Title: RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language ModelsComments: Accepted by AAAI 2026 (Oral)Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
- [261] arXiv:2512.06810 [pdf, ps, other]
-
Title: MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement LearningAuthors: Yueqian Wang, Songxiang Liu, Disong Wang, Nuo Xu, Guanglu Wan, Huishuai Zhang, Dongyan ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [262] arXiv:2512.06802 [pdf, ps, other]
-
Title: VDOT: Efficient Unified Video Creation via Optimal Transport DistillationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [263] arXiv:2512.06793 [pdf, ps, other]
-
Title: Generalized Geometry Encoding Volume for Real-time Stereo MatchingComments: Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [264] arXiv:2512.06783 [pdf, ps, other]
-
Title: Physics Informed Human Posture Estimation Based on 3D Landmarks from Monocular RGB-VideosComments: 16 pages, 5 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [265] arXiv:2512.06774 [pdf, ps, other]
-
Title: RDSplat: Robust Watermarking Against Diffusion Editing for 3D Gaussian SplattingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [266] arXiv:2512.06769 [pdf, ps, other]
-
Title: Stitch and Tell: A Structured Multimodal Data Augmentation Method for Spatial UnderstandingAuthors: Hang Yin, Xiaomin He, PeiWen Yuan, Yiwei Li, Jiayi Shi, Wenxiao Fan, Shaoxiong Feng, Kan LiSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [267] arXiv:2512.06763 [pdf, ps, other]
-
Title: JOCA: Task-Driven Joint Optimisation of Camera Hardware and Adaptive Camera Control AlgorithmsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [268] arXiv:2512.06759 [pdf, ps, other]
-
Title: VisChainBench: A Benchmark for Multi-Turn, Multi-Image Visual Reasoning Beyond Language PriorsComments: 12 pages, 13 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [269] arXiv:2512.06750 [pdf, ps, other]
-
Title: UARE: A Unified Vision-Language Model for Image Quality Assessment, Restoration, and EnhancementAuthors: Weiqi Li, Xuanyu Zhang, Bin Chen, Jingfen Xie, Yan Wang, Kexin Zhang, Junlin Li, Li Zhang, Jian Zhang, Shijie ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [270] arXiv:2512.06746 [pdf, ps, other]
-
Title: Task-Model Alignment: A Simple Path to Generalizable AI-Generated Image DetectionAuthors: Ruoxin Chen, Jiahui Gao, Kaiqing Lin, Keyue Zhang, Yandan Zhao, Isabel Guan, Taiping Yao, Shouhong DingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [271] arXiv:2512.06738 [pdf, ps, other]
-
Title: FedSCAl: Leveraging Server and Client Alignment for Unsupervised Federated Source-Free Domain AdaptationComments: Accepted to Winter Conference on Applications of Computer Vision (WACV) 2026, Round 1Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [272] arXiv:2512.06736 [pdf, ps, other]
-
Title: Graph Convolutional Long Short-Term Memory Attention Network for Post-Stroke Compensatory Movement Detection Based on Skeleton DataSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [273] arXiv:2512.06726 [pdf, ps, other]
-
Title: The Role of Entropy in Visual Grounding: Analysis and OptimizationAuthors: Shuo Li, Jiajun Sun, Zhihao Zhang, Xiaoran Fan, Senjie Jin, Hui Li, Yuming Yang, Junjie Ye, Lixing Shen, Tao Ji, Tao Gui, Qi Zhang, Xuanjing HuangSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [274] arXiv:2512.06689 [pdf, ps, other]
-
Title: Lightweight Wasserstein Audio-Visual Model for Unified Speech Enhancement and SeparationComments: Accepted to ASRU 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
- [275] arXiv:2512.06684 [pdf, ps, other]
-
Title: EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron MicroscopySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [276] arXiv:2512.06674 [pdf, ps, other]
-
Title: RunawayEvil: Jailbreaking the Image-to-Video Generative ModelsAuthors: Songping Wang, Rufan Qian, Yueming Lyu, Qinglong Liu, Linzhuang Zou, Jie Qin, Songhua Liu, Caifeng ShanSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [277] arXiv:2512.06673 [pdf, ps, other]
-
Title: 1 + 1 > 2: Detector-Empowered Video Large Language Model for Spatio-Temporal Grounding and ReasoningAuthors: Shida Gao, Feng Xue, Xiangfeng Wang, Anlong Ming, Teng Long, Yihua Shao, Haozhe Wang, Zhaowen Lin, Wei Wang, Nicu SebeSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [278] arXiv:2512.06663 [pdf, ps, other]
-
Title: CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language TasksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [279] arXiv:2512.06662 [pdf, ps, other]
-
Title: Personalized Image Descriptions from Attention SequencesAuthors: Ruoyu Xue, Hieu Le, Jingyi Xu, Sounak Mondal, Abe Leite, Gregory Zelinsky, Minh Hoai, Dimitris SamarasComments: 10 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [280] arXiv:2512.06657 [pdf, ps, other]
-
Title: TextMamba: Scene Text Detector with MambaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [281] arXiv:2512.06642 [pdf, ps, other]
-
Title: Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-ResolutionAuthors: Achmad Ardani Prasha, Clavino Ourizqi Rachmadi, Muhamad Fauzan Ibnu Syahlan, Naufal Rahfi Anugerah, Nanda Garin Raditya, Putri Amelia, Sabrina Laila Mutiara, Hilman Syachr RamadhanComments: 21 pages, 7 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Cosmology and Nongalactic Astrophysics (astro-ph.CO); Instrumentation and Methods for Astrophysics (astro-ph.IM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [282] arXiv:2512.06613 [pdf, ps, other]
-
Title: Hierarchical Deep Learning for Diatom Image Classification: A Multi-Level Taxonomic ApproachAuthors: Yueying KeComments: 10 pages, 6 figures, 2 tables, IEEE conference format. Submitted as course projectSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [283] arXiv:2512.06612 [pdf, ps, other]
-
Title: Learning Relative Gene Expression Trends from Pathology Images in Spatial TranscriptomicsComments: NeurIPS 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [284] arXiv:2512.06598 [pdf, ps, other]
-
Title: From Remote Sensing to Multiple Time Horizons Forecasts: Transformers Model for CyanoHAB Intensity in Lake ChamplainAuthors: Muhammad Adil, Patrick J. Clemins, Andrew W. Schroth, Panagiotis D. Oikonomou, Donna M. Rizzo, Peter D. F. Isles, Xiaohan Zhang, Kareem I. Hannoun, Scott Turnbull, Noah B. Beckage, Asim Zia, Safwan WshahComments: 23 pages, 15 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [285] arXiv:2512.06581 [pdf, ps, other]
-
Title: MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video UnderstandingAuthors: Yuhao Su, Anwesa Choudhuri, Zhongpai Gao, Benjamin Planche, Van Nguyen Nguyen, Meng Zheng, Yuhan Shen, Arun Innanje, Terrence Chen, Ehsan Elhamifar, Ziyan WuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [286] arXiv:2512.06575 [pdf, ps, other]
-
Title: Proof of Concept for Mammography Classification with Enhanced Compactness and Separability ModulesAuthors: Fariza DahesComments: 26 pages, 16 figures, 2 tables; proof of concept on mammography classification with compactness/separability modules and interactive dashboard; preprint submitted to arXiv cs.LGSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [287] arXiv:2512.06565 [pdf, ps, other]
-
Title: GNC-Pose: Geometry-Aware GNC-PnP for Accurate 6D Pose EstimationAuthors: Xiujin LiuComments: 1 figure, 2 tables, 14 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [288] arXiv:2512.06562 [pdf, ps, other]
-
Title: SUGAR: A Sweeter Spot for Generative Unlearning of Many IdentitiesAuthors: Dung Thuy Nguyen, Quang Nguyen, Preston K. Robinette, Eli Jiang, Taylor T. Johnson, Kevin LeachSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [289] arXiv:2512.06560 [pdf, ps, other]
-
Title: Bridging spatial awareness and global context in medical image segmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [290] arXiv:2512.06531 [pdf, ps, other]
-
Title: Novel Deep Learning Architectures for Classification and Segmentation of Brain Tumors from MRI ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [291] arXiv:2512.06530 [pdf, ps, other]
-
Title: On The Role of K-Space Acquisition in MRI Reconstruction Domain-GeneralizationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [292] arXiv:2512.06521 [pdf, ps, other]
-
Title: ShadowWolf -- Automatic Labelling, Evaluation and Model Training Optimised for Camera Trap Wildlife ImagesAuthors: Jens Dede (1), Anna Förster (1) ((1) Department of Sustainable Communication Networks, University of Bremen, Bibliothekstr. 1, 28359, Bremen, Bremen, Germany)Comments: 31 pages + appendixSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [293] arXiv:2512.06504 [pdf, ps, other]
-
Title: Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data FusionAuthors: Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana ZahorodniaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [294] arXiv:2512.06485 [pdf, ps, other]
-
Title: Sanvaad: A Multimodal Accessibility Framework for ISL Recognition and Voice-Based InteractionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [295] arXiv:2512.06447 [pdf, ps, other]
-
Title: Towards Stable Cross-Domain Depression Recognition under Missing ModalitiesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [296] arXiv:2512.06438 [pdf, ps, other]
-
Title: AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head AvatarsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [297] arXiv:2512.06434 [pdf, ps, other]
-
Title: Automated Deep Learning Estimation of Anthropometric Measurements for Preparticipation Cardiovascular ScreeningComments: 8 pages, 2 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [298] arXiv:2512.06426 [pdf, ps, other]
-
Title: When Gender is Hard to See: Multi-Attribute Support for Long-Range RecognitionComments: 12 pages, 9 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [299] arXiv:2512.06424 [pdf, ps, other]
-
Title: DragMesh: Interactive 3D Generation Made EasySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [300] arXiv:2512.06422 [pdf, ps, other]
-
Title: A Perception CNN for Facial Expression RecognitionComments: in IEEE Transactions on Image Processing (2025)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [301] arXiv:2512.06421 [pdf, ps, other]
-
Title: Rethinking Training Dynamics in Scale-wise Autoregressive GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [302] arXiv:2512.06400 [pdf, ps, other]
-
Title: Perceptual Region-Driven Infrared-Visible Co-Fusion for Extreme Scene EnhancementComments: The paper has been accepted and officially published by IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENTSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [303] arXiv:2512.06379 [pdf, ps, other]
- [304] arXiv:2512.06377 [pdf, ps, other]
- [305] arXiv:2512.06376 [pdf, ps, other]
-
Title: Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation FrameworkSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [306] arXiv:2512.06373 [pdf, ps, other]
-
Title: VG-Refiner: Towards Tool-Refined Referring Grounded Reasoning via Agentic Reinforcement LearningComments: The project page is this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [307] arXiv:2512.06368 [pdf, ps, other]
-
Title: HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular VideosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [308] arXiv:2512.06363 [pdf, ps, other]
-
Title: Spoofing-aware Prompt Learning for Unified Physical-Digital Facial Attack DetectionAuthors: Jiabao Guo, Yadian Wang, Hui Ma, Yuhao Fu, Ju Jia, Hui Liu, Shengeng Tang, Lechao Cheng, Yunfeng Diao, Ajian LiuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [309] arXiv:2512.06358 [pdf, ps, other]
-
Title: Rectifying Latent Space for Generative Single-Image Reflection RemovalSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [310] arXiv:2512.06353 [pdf, ps, other]
-
Title: TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision SearchAuthors: Kaicheng Yang, Kaisen Yang, Baiting Wu, Xun Zhang, Qianrui Yang, Haotong Qin, He Zhang, Yulun ZhangComments: Code and Supplementary Material could be found at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [311] arXiv:2512.06345 [pdf, ps, other]
-
Title: CLUENet: Cluster Attention Makes Neural Networks Have EyesComments: 10 pages, 6 figures, 2026 Association for the Advancement of Artificial IntelligenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [312] arXiv:2512.06344 [pdf, ps, other]
-
Title: Beyond Hallucinations: A Multimodal-Guided Task-Aware Generative Image Compression for Ultra-Low BitrateSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [313] arXiv:2512.06332 [pdf, ps, other]
-
Title: CryoHype: Reconstructing a thousand cryo-EM structures with transformer-based hypernetworksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [314] arXiv:2512.06330 [pdf, ps, other]
-
Title: S2WMamba: A Spectral-Spatial Wavelet Mamba for PansharpeningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [315] arXiv:2512.06328 [pdf, ps, other]
-
Title: ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language ModelsComments: Accepted as an Oral presentation at AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [316] arXiv:2512.06306 [pdf, ps, other]
-
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose EstimationAuthors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xaoming Chen, Weidong CaiSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [317] arXiv:2512.06290 [pdf, ps, other]
-
Title: StrokeNet: Unveiling How to Learn Fine-Grained Interactions in Online Handwritten Stroke ClassificationComments: 17 pages, 5 figuresJournal-ref: ICDAR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [318] arXiv:2512.06282 [pdf, ps, other]
-
Title: A Sleep Monitoring System Based on Audio, Video and Depth InformationComments: Accepted in the Computer Vision, Graphics and Image Processing (CVGIP 2013)Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [319] arXiv:2512.06281 [pdf, ps, other]
-
Title: Unleashing the Intrinsic Visual Representation Capability of Multimodal Large Language ModelsAuthors: Hengzhuang Li, Xinsong Zhang, Qiming Peng, Bin Luo, Han Hu, Dengyang Jiang, Han-Jia Ye, Teng Zhang, Hai JinSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [320] arXiv:2512.06276 [pdf, ps, other]
-
Title: RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression ComprehensionAuthors: Tianyi Gao, Hao Li, Han Fang, Xin Wei, Xiaodong Dong, Hongbo Sun, Ye Yuan, Zhongjiang He, Jinglin Xu, Jingmin Xin, Hao SunSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [321] arXiv:2512.06275 [pdf, ps, other]
-
Title: FacePhys: State of the Heart LearningAuthors: Kegang Wang, Jiankai Tang, Yuntao Wang, Xin Liu, Yuxuan Fan, Jiatong Ji, Yuanchun Shi, Daniel McDuffSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [322] arXiv:2512.06269 [pdf, ps, other]
- [323] arXiv:2512.06258 [pdf, ps, other]
-
Title: Knowing the Answer Isn't Enough: Fixing Reasoning Path Failures in LVLMsAuthors: Chaoyang Wang, Yangfan He, Yiyang Zhou, Yixuan Wang, Jiaqi Liu, Peng Xia, Zhengzhong Tu, Mohit Bansal, Huaxiu YaoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [324] arXiv:2512.06255 [pdf, ps, other]
-
Title: Language-driven Fine-grained RetrievalSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [325] arXiv:2512.06251 [pdf, ps, other]
-
Title: NexusFlow: Unifying Disparate Tasks under Partial Supervision via Invertible Flow NetworksAuthors: Fangzhou Lin, Yuping Wang, Yuliang Guo, Zixun Huang, Xinyu Huang, Haichong Zhang, Kazunori Yamada, Zhengzhong Tu, Liu Ren, Ziming ZhangComments: 12 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [326] arXiv:2512.06232 [pdf, ps, other]
-
Title: Opinion: Learning Intuitive Physics May Require More than Visual DataSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [327] arXiv:2512.06230 [pdf, ps, other]
-
Title: GPU-GLMB: Assessing the Scalability of GPU-Accelerated Multi-Hypothesis TrackingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [328] arXiv:2512.06221 [pdf, ps, other]
-
Title: Revisiting SVD and Wavelet Difference Reduction for Lossy Image Compression: A Reproducibility StudyAuthors: Alena MakarovaComments: 15 pages, 13 figures. Reproducibility studySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [329] arXiv:2512.06206 [pdf, ps, other]
-
Title: The MICCAI Federated Tumor Segmentation (FeTS) Challenge 2024: Efficient and Robust Aggregation Methods for Federated LearningAuthors: Akis Linardos, Sarthak Pati, Ujjwal Baid, Brandon Edwards, Patrick Foley, Kevin Ta, Verena Chung, Micah Sheller, Muhammad Irfan Khan, Mojtaba Jafaritadi, Elina Kontio, Suleiman Khan, Leon Mächler, Ivan Ezhov, Suprosanna Shit, Johannes C. Paetzold, Gustav Grimberg, Manuel A. Nickel, David Naccache, Vasilis Siomos, Jonathan Passerat-Palmbach, Giacomo Tarroni, Daewoon Kim, Leonard L. Klausmann, Prashant Shah, Bjoern Menze, Dimitrios Makris, Spyridon BakasComments: Published at the Journal of Machine Learning for Biomedical Imaging (MELBA) this https URLJournal-ref: Machine.Learning.for.Biomedical.Imaging. 3 (2025)Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [330] arXiv:2512.06190 [pdf, ps, other]
-
Title: Multi-Modal Zero-Shot Prediction of Color Trajectories in Food DryingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [331] arXiv:2512.06185 [pdf, ps, other]
-
Title: SPOOF: Simple Pixel Operations for Out-of-Distribution FoolingComments: 10 pages with 8 figures, plus 13 pages and 16 figures of supplementary materialSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [332] arXiv:2512.06179 [pdf, ps, other]
-
Title: Physics-Grounded Attached Shadow Detection Using Approximate 3D Geometry and Light DirectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [333] arXiv:2512.06174 [pdf, ps, other]
-
Title: Physics-Grounded Shadow Generation from Monocular 3D Geometry Priors and Approximate Light DirectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [334] arXiv:2512.06171 [pdf, ps, other]
-
Title: Automated Annotation of Shearographic Measurements Enabling Weakly Supervised Defect DetectionComments: 11 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [335] arXiv:2512.06158 [pdf, ps, other]
-
Title: Tracking-Guided 4D Generation: Foundation-Tracker Motion Priors for 3D Model AnimationAuthors: Su Sun, Cheng Zhao, Himangi Mittal, Gaurav Mittal, Rohith Kukkala, Yingjie Victor Chen, Mei ChenComments: 15 pages, 11 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [336] arXiv:2512.06105 [pdf, ps, other]
-
Title: Explainable Melanoma Diagnosis with Contrastive Learning and LLM-based Report GenerationAuthors: Junwen Zheng, Xinran Xu, Li Rong Wang, Chang Cai, Lucinda Siyun Tan, Dingyuan Wang, Hong Liang Tey, Xiuyi FanComments: AAAI-26-AIASubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [337] arXiv:2512.06103 [pdf, ps, other]
-
Title: SpectraIrisPAD: Leveraging Vision Foundation Models for Spectrally Conditioned Multispectral Iris Presentation Attack DetectionComments: Accepted in IEEE T-BIOMSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [338] arXiv:2512.06096 [pdf, ps, other]
-
Title: BeLLA: End-to-End Birds Eye View Large Language Assistant for Autonomous DrivingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [339] arXiv:2512.06080 [pdf, ps, other]
-
Title: Shoot-Bounce-3D: Single-Shot Occlusion-Aware 3D from Lidar by Decomposing Two-Bounce LightAuthors: Tzofi Klinghoffer, Siddharth Somasundaram, Xiaoyu Xiang, Yuchen Fan, Christian Richardt, Akshat Dave, Ramesh Raskar, Rakesh RanjanComments: SIGGRAPH Asia 2025. Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [340] arXiv:2512.06065 [pdf, ps, other]
-
Title: EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video EditingAuthors: Runjia Li, Moayed Haji-Ali, Ashkan Mirzaei, Chaoyang Wang, Arpit Sahni, Ivan Skorokhodov, Aliaksandr Siarohin, Tomas Jakab, Junlin Han, Sergey Tulyakov, Philip Torr, Willi MenapaceComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [341] arXiv:2512.06058 [pdf, ps, other]
-
Title: Representation Learning for Point Cloud UnderstandingAuthors: Siming YanComments: 181 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [342] arXiv:2512.06032 [pdf, ps, other]
-
Title: The SAM2-to-SAM3 Gap in the Segment Anything Model Family: Why Prompt-Based Expertise Fails in Concept-Driven Image SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [343] arXiv:2512.06024 [pdf, ps, other]
-
Title: Neural reconstruction of 3D ocean wave hydrodynamics from camera sensingSubjects: Computer Vision and Pattern Recognition (cs.CV); Fluid Dynamics (physics.flu-dyn)
- [344] arXiv:2512.06020 [pdf, ps, other]
-
Title: PrefGen: Multimodal Preference Learning for Preference-Conditioned Image GenerationComments: Project Page: https://prefgen.github.io/Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [345] arXiv:2512.06014 [pdf, ps, other]
-
Title: Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 DatasetsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [346] arXiv:2512.06013 [pdf, ps, other]
-
Title: VAT: Vision Action Transformer by Unlocking Full Representation of ViTSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [347] arXiv:2512.06012 [pdf, ps, other]
-
Title: High-Throughput Unsupervised Profiling of the Morphology of 316L Powder Particles for Use in Additive ManufacturingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [348] arXiv:2512.06010 [pdf, other]
-
Title: Fast and Flexible Robustness Certificates for Semantic SegmentationAuthors: Thomas Massena (IRIT-MISFIT, DTIPG - SNCF, UT3), Corentin Friedrich, Franck Mamalet, Mathieu Serrurier (IRIT-MISFIT)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [349] arXiv:2512.06006 [pdf, ps, other]
-
Title: Simple Agents Outperform Experts in Biomedical Imaging Workflow OptimizationAuthors: Xuefei (Julie) Wang, Kai A. Horstmann, Ethan Lin, Jonathan Chen, Alexander R. Farhang, Sophia Stiles, Atharva Sehgal, Jonathan Light, David Van Valen, Yisong Yue, Jennifer J. SunSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [350] arXiv:2512.06003 [pdf, ps, other]
-
Title: PrunedCaps: A Case For Primary Capsules DiscriminationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [351] arXiv:2512.05996 [pdf, ps, other]
-
Title: FishDetector-R1: Unified MLLM-Based Framework with Reinforcement Fine-Tuning for Weakly Supervised Fish Detection, Segmentation, and CountingComments: 18 pages, under reviewSubjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Robotics (cs.RO); Image and Video Processing (eess.IV)
- [352] arXiv:2512.05993 [pdf, ps, other]
-
Title: Domain-Specific Foundation Model Improves AI-Based Analysis of NeuropathologyAuthors: Ruchika Verma, Shrishtee Kandoi, Robina Afzal, Shengjia Chen, Jannes Jegminat, Michael W. Karlovich, Melissa Umphlett, Timothy E. Richardson, Kevin Clare, Quazi Hossain, Jorge Samanamud, Phyllis L. Faust, Elan D. Louis, Ann C. McKee, Thor D. Stein, Jonathan D. Cherry, Jesse Mez, Anya C. McGoldrick, Dalilah D. Quintana Mora, Melissa J. Nirenberg, Ruth H. Walker, Yolfrankcis Mendez, Susan Morgello, Dennis W. Dickson, Melissa E. Murray, Carlos Cordon-Cardo, Nadejda M. Tsankova, Jamie M. Walker, Diana K. Dangoor, Stephanie McQuillan, Emma L. Thorn, Claudia De Sanctis, Shuying Li, Thomas J. Fuchs, Kurt Farrell, John F. Crary, Gabriele CampanellaSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [353] arXiv:2512.05991 [pdf, ps, other]
-
Title: EmoDiffTalk:Emotion-aware Diffusion for Editable 3D Gaussian Talking HeadAuthors: Chang Liu, Tianjiao Jing, Chengcheng Ma, Xuanqi Zhou, Zhengxuan Lian, Qin Jin, Hongliang Yuan, Shi-Sheng HuangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [354] arXiv:2512.05988 [pdf, ps, other]
-
Title: VG3T: Visual Geometry Grounded Gaussian TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [355] arXiv:2512.05987 [pdf, ps, other]
-
Title: Adaptive Dataset Quantization: A New Direction for Dataset PruningComments: Accepted by ICCPR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [356] arXiv:2512.05969 [pdf, ps, other]
-
Title: Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' MatricesAuthors: Hokin DengSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [357] arXiv:2512.07687 (cross-list from cs.CL) [pdf, ps, other]
-
Title: HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMsSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [358] arXiv:2512.07576 (cross-list from eess.IV) [pdf, ps, other]
-
Title: R2MF-Net: A Recurrent Residual Multi-Path Fusion Network for Robust Multi-directional Spine X-ray SegmentationAuthors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Sharipov Hotam Beknazarovich, Farzona S. Ataeva, Qurbonaliev Alisher, Yuanjie ZhengSubjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [359] arXiv:2512.07574 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics FrameworkAuthors: Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie ZhengSubjects: Image and Video Processing (eess.IV); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [360] arXiv:2512.07558 (cross-list from cs.LG) [pdf, ps, other]
-
Title: ReLaX: Reasoning with Latent Exploration for Large Reasoning ModelsSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [361] arXiv:2512.07509 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Exploring possible vector systems for faster training of neural networks with preconfigured latent spacesAuthors: Nikita GabdullinComments: 9 pages, 5 figures, 1 table, 4 equationsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [362] arXiv:2512.07459 (cross-list from cs.GR) [pdf, ps, other]
-
Title: Human Geometry Distribution for 3D Animation GenerationSubjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [363] arXiv:2512.07437 (cross-list from cs.LG) [pdf, ps, other]
-
Title: KAN-Dreamer: Benchmarking Kolmogorov-Arnold Networks as Function Approximators in World ModelsComments: 23 pages, 8 figures, 3 tablesSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)
- [364] arXiv:2512.07419 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language ModelsSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [365] arXiv:2512.07390 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Towards Reliable Test-Time Adaptation: Style Invariance as a Correctness LikelihoodComments: Accepted to WACV 2026Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [366] arXiv:2512.07355 (cross-list from cs.AI) [pdf, ps, other]
-
Title: A Geometric Unification of Concept Learning with Concept ConesComments: 22 pagesSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [367] arXiv:2512.07259 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Affine Subspace Models and Clustering for Patch-Based Image DenoisingComments: Asilomar Conference on Signals, Systems, and Computers 2025Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [368] arXiv:2512.07224 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Clinical Interpretability of Deep Learning Segmentation Through Shapley-Derived Agreement and Uncertainty MetricsAuthors: Tianyi Ren, Daniel Low, Pittra Jaengprajak, Juampablo Heras Rivera, Jacob Ruzevick, Mehmet KurtSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [369] arXiv:2512.07150 (cross-list from cs.LG) [pdf, ps, other]
-
Title: FlowLPS: Langevin-Proximal Sampling for Flow-based Inverse Problem SolversSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [370] arXiv:2512.07142 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket SearchComments: This work plans to be submitted to the IEEE for possible publicationSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [371] arXiv:2512.07132 (cross-list from cs.CL) [pdf, ps, other]
-
Title: DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal ReasoningAuthors: Nithin Sivakumaran, Justin Chih-Yao Chen, David Wan, Yue Zhang, Jaehong Yoon, Elias Stengel-Eskin, Mohit BansalComments: Code: this https URLSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [372] arXiv:2512.07130 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Mimir: Hierarchical Goal-Driven Diffusion with Uncertainty Propagation for End-to-End Autonomous DrivingAuthors: Zebin Xing, Yupeng Zheng, Qichao Zhang, Zhixing Ding, Pengxuan Yang, Songen Gu, Zhongpu Xia, Dongbin ZhaoSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [373] arXiv:2512.07040 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Transformation of Biological Networks into Images via Semantic Cartography for Visual Interpretation and Scalable Deep AnalysisSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [374] arXiv:2512.06990 (cross-list from cs.AI) [pdf, ps, other]
-
Title: Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme PatientsSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [375] arXiv:2512.06963 (cross-list from cs.RO) [pdf, ps, other]
-
Title: VideoVLA: Video Generators Can Be Generalizable Robot ManipulatorsAuthors: Yichao Shen, Fangyun Wei, Zhiying Du, Yaobo Liang, Yan Lu, Jiaolong Yang, Nanning Zheng, Baining GuoComments: Project page: this https URLJournal-ref: The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [376] arXiv:2512.06951 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR ChallengeComments: 2025 NeurIPS Behavior Challenge 1st place solutionSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [377] arXiv:2512.06868 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Dynamic Visual SLAM using a General 3D PriorComments: 8 pagesSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [378] arXiv:2512.06848 (cross-list from cs.CL) [pdf, ps, other]
-
Title: AquaFusionNet: Lightweight VisionSensor Fusion Framework for Real-Time Pathogen Detection and Water Quality Anomaly Prediction on Edge DevicesComments: 9 pages, 3 figures, Politeknik Negeri BanyuwangiSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [379] arXiv:2512.06757 (cross-list from cs.SD) [pdf, ps, other]
-
Title: XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice AssociationComments: FAME 2026 Technical ReportSubjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV)
- [380] arXiv:2512.06737 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step DynamicsComments: 80 pages, 6 tables, 2 figures, 5 appendices, proof-of-conceptSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [381] arXiv:2512.06730 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Enhancing Interpretability of AR-SSVEP-Based Motor Intention Recognition via CNN-BiLSTM and SHAP Analysis on EEG DataSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [382] arXiv:2512.06665 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Rethinking Robustness: A New Approach to Evaluating Feature Attribution MethodsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [383] arXiv:2512.06649 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Estimating Black Carbon Concentration from Urban Traffic Using Vision-Based Machine LearningAuthors: Camellia Zakaria, Aryan Sadeghi, Weaam Jaafar, Junshi Xu, Alex Mariakakis, Marianne HatzopoulouComments: 12 pages, 16 figures, 4 tables, 4 pages Appendix, in submission and under review for ACM MobiSys 2026 as of December 6th, 2025Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Emerging Technologies (cs.ET)
- [384] arXiv:2512.06648 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural NetworkAuthors: Xiao LiComments: in Chinese languageSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [385] arXiv:2512.06628 (cross-list from cs.RO) [pdf, ps, other]
-
Title: MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical AlignmentAuthors: Ruicheng Zhang, Mingyang Zhang, Jun Zhou, Zhangrui Guo, Xiaofan Liu, Zunnan Xu, Zhizhou Zhong, Puxin Yan, Haocheng Luo, Xiu LiSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [386] arXiv:2512.06609 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Vector Quantization using Gaussian Variational AutoencoderAuthors: Tongda Xu, Wendi Zheng, Jiajun He, Jose Miguel Hernandez-Lobato, Yan Wang, Ya-Qin Zhang, Jie TangSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [387] arXiv:2512.06589 (cross-list from cs.CR) [pdf, ps, other]
-
Title: OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense EvaluationAuthors: Xiaojun Jia, Jie Liao, Qi Guo, Teng Ma, Simeng Qin, Ranjie Duan, Tianlin Li, Yihao Huang, Zhitao Zeng, Dongxian Wu, Yiming Li, Wenqi Ren, Xiaochun Cao, Yang LiuSubjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [388] arXiv:2512.06147 (cross-list from cs.RO) [pdf, ps, other]
-
Title: GuideNav: User-Informed Development of a Vision-Only Robotic Navigation Assistant For Blind TravelersAuthors: Hochul Hwang, Soowan Yang, Jahir Sadik Monon, Nicholas A Giudice, Sunghoon Ivan Lee, Joydeep Biswas, Donghyun KimSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
- [389] arXiv:2512.06008 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Semantic Temporal Single-photon LiDARAuthors: Fang Li, Tonglin Mu, Shuling Li, Junran Guo, Keyuan Li, Jianing Li, Ziyang Luo, Xiaodong Fan, Ye Chen, Yunfeng Liu, Hong Cai, Lip Ket Chin, Jinbei Zhang, Shihai SunComments: 14 pages, 5 figures. Comments are welcomeSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantum Physics (quant-ph)
- [390] arXiv:2512.05992 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image SegmentationComments: NeurIPS Black in AI workshop - 2022Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Mon, 8 Dec 2025
- [391] arXiv:2512.05965 [pdf, ps, other]
-
Title: EditThinker: Unlocking Iterative Reasoning for Any Image EditorAuthors: Hongyu Li, Manyuan Zhang, Dian Zheng, Ziyu Guo, Yimeng Jia, Kaituo Feng, Hao Yu, Yexin Liu, Yan Feng, Peng Pei, Xunliang Cai, Linjiang Huang, Hongsheng Li, Si LiuComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [392] arXiv:2512.05960 [pdf, ps, other]
-
Title: AQUA-Net: Adaptive Frequency Fusion and Illumination Aware Network for Underwater Image EnhancementSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [393] arXiv:2512.05941 [pdf, ps, other]
-
Title: Zoom in, Click out: Unlocking and Evaluating the Potential of Zooming for GUI GroundingAuthors: Zhiyuan Jiang, Shenghao Xie, Wenyi Li, Wenqiang Zu, Peihang Li, Jiahao Qiu, Siqi Pei, Lei Ma, Tiejun Huang, Mengdi Wang, Shilong LiuComments: Code is available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [394] arXiv:2512.05937 [pdf, ps, other]
-
Title: Measuring the Effect of Background on Classification and Feature Importance in Deep Learning for AV PerceptionComments: 8 pages, 2 figures, 7 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [395] arXiv:2512.05936 [pdf, ps, other]
-
Title: Synset Signset Germany: a Synthetic Dataset for German Traffic Sign RecognitionAuthors: Anne Sielemann, Lena Loercher, Max-Lion Schumacher, Stefan Wolf, Masoud Roschani, Jens ZiehnComments: 8 pages, 8 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [396] arXiv:2512.05928 [pdf, ps, other]
-
Title: A Comparative Study on Synthetic Facial Data Generation Techniques for Face RecognitionComments: 18 pages, 17 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [397] arXiv:2512.05927 [pdf, ps, other]
-
Title: World Models That Know When They Don't Know: Controllable Video Generation with Calibrated UncertaintySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [398] arXiv:2512.05922 [pdf, ps, other]
-
Title: LPD: Learnable Prototypes with Diversity Regularization for Weakly Supervised Histopathology SegmentationAuthors: Khang Le, Anh Mai Vu, Thi Kim Trang Vo, Ha Thach, Ngoc Bui Lam Quang, Thanh-Huy Nguyen, Minh H. N. Le, Zhu Han, Chandra Mohan, Hien Van NguyenComments: Note: Khang Le and Anh Mai Vu contributed equallySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [399] arXiv:2512.05920 [pdf, ps, other]
-
Title: NICE: Neural Implicit Craniofacial Model for Orthognathic Surgery PredictionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [400] arXiv:2512.05905 [pdf, ps, other]
-
Title: SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose RepresentationsAuthors: Wenhao Yan, Sheng Ye, Zhuoyi Yang, Jiayan Teng, ZhenHui Dong, Kairui Wen, Xiaotao Gu, Yong-Jin Liu, Jie TangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [401] arXiv:2512.05866 [pdf, ps, other]
-
Title: Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN DiscriminatorComments: This paper has been accepted for presentation at the IEEE 28th International Conference on Computer and Information Technology (ICCIT), December 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [402] arXiv:2512.05859 [pdf, ps, other]
-
Title: Edit-aware RAW ReconstructionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [403] arXiv:2512.05853 [pdf, ps, other]
-
Title: VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential AttackAuthors: Shiji Zhao, Shukun Xiong, Yao Huang, Yan Jin, Zhenyu Wu, Jiyang Guan, Ranjie Duan, Jialing Tao, Hui Xue, Xingxing WeiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [404] arXiv:2512.05830 [pdf, ps, other]
-
Title: Phase-OTDR Event Detection Using Image-Based Data Transformation and Deep LearningComments: 22 pages, 11 figures, 5 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [405] arXiv:2512.05814 [pdf, ps, other]
-
Title: UG-FedDA: Uncertainty-Guided Federated Domain Adaptation for Multi-Center Alzheimer's Disease DetectionAuthors: Fubao Zhu, Zhanyuan Jia, Zhiguo Wang, Huan Huang, Danyang Sun, Chuang Han, Yanting Li, Jiaofen Nan, Chen Zhao, Weihua ZhouComments: The code is already available on GitHub: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [406] arXiv:2512.05809 [pdf, ps, other]
-
Title: Probing the effectiveness of World Models for Spatial Reasoning through Test-time ScalingComments: Extended abstract at World Modeling Workshop 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [407] arXiv:2512.05802 [pdf, ps, other]
-
Title: Bring Your Dreams to Life: Continual Text-to-Video CustomizationAuthors: Jiahua Dong, Xudong Wang, Wenqi Liang, Zongyan Han, Meng Cao, Duzhen Zhang, Hanbin Zhao, Zhi Han, Salman Khan, Fahad Shahbaz KhanComments: Accepted to AAAI2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [408] arXiv:2512.05783 [pdf, ps, other]
-
Title: Curvature-Regularized Variational Autoencoder for 3D Scene Reconstruction from Sparse DepthSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [409] arXiv:2512.05774 [pdf, ps, other]
-
Title: Active Video Perception: Iterative Evidence Seeking for Agentic Long Video UnderstandingAuthors: Ziyang Wang, Honglu Zhou, Shijie Wang, Junnan Li, Caiming Xiong, Silvio Savarese, Mohit Bansal, Michael S. Ryoo, Juan Carlos NieblesComments: Website: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [410] arXiv:2512.05762 [pdf, ps, other]
-
Title: FNOPT: Resolution-Agnostic, Self-Supervised Cloth Simulation using Meta-Optimization with Fourier Neural OperatorsComments: Accepted for WACVSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [411] arXiv:2512.05759 [pdf, ps, other]
-
Title: Label-Efficient Point Cloud Segmentation with Active LearningAuthors: Johannes Meyer, Jasper Hoffmann, Felix Schulz, Dominik Merkle, Daniel Buescher, Alexander Reiterer, Joschka Boedecker, Wolfram BurgardSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [412] arXiv:2512.05754 [pdf, ps, other]
-
Title: USV: Unified Sparsification for Accelerating Video Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [413] arXiv:2512.05746 [pdf, ps, other]
-
Title: HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [414] arXiv:2512.05740 [pdf, ps, other]
-
Title: Distilling Expert Surgical Knowledge: How to train local surgical VLMs for anatomy explanation in Complete Mesocolic ExcisionAuthors: Lennart Maack, Julia-Kristin Graß, Lisa-Marie Toscha, Nathaniel Melling, Alexander SchlaeferSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [415] arXiv:2512.05710 [pdf, ps, other]
-
Title: Manifold-Aware Point Cloud Completion via Geodesic-Attentive Hierarchical Feature LearningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [416] arXiv:2512.05698 [pdf, ps, other]
-
Title: OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors ReasoningAuthors: Xusheng Guo, Wanfa Zhang, Shijia Zhao, Qiming Xia, Xiaolong Xie, Mingming Wang, Hai Wu, Chenglu WenComments: The 40th Annual AAAI Conference on Artificial IntelligenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [417] arXiv:2512.05683 [pdf, ps, other]
-
Title: Physics-Informed Graph Neural Network with Frequency-Aware Learning for Optical Aberration CorrectionAuthors: Yong En Kok, Bowen Deng, Alexander Bentley, Andrew J. Parkes, Michael G. Somekh, Amanda J. Wright, Michael P. PoundSubjects: Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
- [418] arXiv:2512.05674 [pdf, ps, other]
-
Title: Hyperspectral Unmixing with 3D Convolutional Sparse Coding and Projected Simplex Volume MaximizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [419] arXiv:2512.05672 [pdf, ps, other]
-
Title: InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse ProblemSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [420] arXiv:2512.05669 [pdf, ps, other]
-
Title: Deep Learning-Based Real-Time Sequential Facial Expression Analysis Using Geometric FeaturesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [421] arXiv:2512.05663 [pdf, ps, other]
-
Title: LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D DetectionAuthors: Johannes Meier, Jonathan Michel, Oussema Dhaouadi, Yung-Hsu Yang, Christoph Reich, Zuria Bauer, Stefan Roth, Marc Pollefeys, Jacques Kaiser, Daniel CremersSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [422] arXiv:2512.05651 [pdf, ps, other]
-
Title: Self-Supervised AI-Generated Image Detection: A Camera Metadata PerspectiveSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [423] arXiv:2512.05635 [pdf, ps, other]
-
Title: Experts-Guided Unbalanced Optimal Transport for ISP Learning from Unpaired and/or Paired DataSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [424] arXiv:2512.05613 [pdf, ps, other]
-
Title: DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation ModelAuthors: Pasquale De Marinis, Pieter M. Blok, Uzay Kaymak, Rogier Brussee, Gennaro Vessio, Giovanna CastellanoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [425] arXiv:2512.05610 [pdf, ps, other]
-
Title: NormalView: sensor-agnostic tree species classification from backpack and aerial lidar data using geometric projectionsAuthors: Juho Korkeala, Jesse Muhojoki, Josef Taher, Klaara Salolahti, Matti Hyyppä, Antero Kukko, Juha HyyppäComments: 19 pages, 8 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [426] arXiv:2512.05597 [pdf, ps, other]
-
Title: Fast SceneScript: Accurate and Efficient Structured Language Model via Multi-Token PredictionComments: 10 pages, 8 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [427] arXiv:2512.05593 [pdf, ps, other]
-
Title: Learning High-Fidelity Cloth Animation via Skinning-Free Image TransferComments: Accepted to 3DV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [428] arXiv:2512.05571 [pdf, ps, other]
-
Title: MedDIFT: Multi-Scale Diffusion-Based Correspondence in 3D Medical ImagingAuthors: Xingyu Zhang, Anna Reithmeir, Fryderyk Kögl, Rickmer Braren, Julia A. Schnabel, Daniel M. LangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [429] arXiv:2512.05564 [pdf, ps, other]
-
Title: ProPhy: Progressive Physical Alignment for Dynamic World SimulationAuthors: Zijun Wang, Panwen Hu, Jing Wang, Terry Jingchen Zhang, Yuhao Cheng, Long Chen, Yiqiang Yan, Zutao Jiang, Hanhui Li, Xiaodan LiangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [430] arXiv:2512.05557 [pdf, ps, other]
-
Title: 2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence ConsistencySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [431] arXiv:2512.05546 [pdf, ps, other]
-
Title: Conscious Gaze: Adaptive Attention Mechanisms for Hallucination Mitigation in Vision-Language ModelsComments: 6 pages, 6 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [432] arXiv:2512.05539 [pdf, ps, other]
-
Title: Ideal Observer for Segmentation of Dead Leaves ImagesComments: 41 pages, 16 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Statistics Theory (math.ST); Methodology (stat.ME)
- [433] arXiv:2512.05529 [pdf, ps, other]
-
Title: See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth PriorsComments: The first two authors contributed equallySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [434] arXiv:2512.05524 [pdf, ps, other]
-
Title: VOST-SGG: VLM-Aided One-Stage Spatio-Temporal Scene Graph GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [435] arXiv:2512.05515 [pdf, ps, other]
-
Title: DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment AnalysisComments: Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2025Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [436] arXiv:2512.05513 [pdf, ps, other]
-
Title: Know-Show: Benchmarking Video-Language Models on Spatio-Temporal Grounded ReasoningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [437] arXiv:2512.05511 [pdf, ps, other]
-
Title: Rethinking Infrared Small Target Detection: A Foundation-Driven Efficient ParadigmAuthors: Chuang Yu, Jinmiao Zhao, Yunpeng Liu, Yaokun Li, Xiujun Shu, Yuanhao Feng, Bo Wang, Yimian Dai, Xiangyu YueSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [438] arXiv:2512.05494 [pdf, ps, other]
- [439] arXiv:2512.05492 [pdf, ps, other]
-
Title: WaterWave: Bridging Underwater Image Enhancement into Video Streams via Wavelet-based Temporal Consistency FieldSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [440] arXiv:2512.05482 [pdf, ps, other]
-
Title: Concept-based Explainable Data Mining with VLM for 3D DetectionAuthors: Mai TsujimotoComments: 28 pages including appendix. Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [441] arXiv:2512.05481 [pdf, ps, other]
-
Title: UniFS: Unified Multi-Contrast MRI Reconstruction via Frequency-Spatial FusionSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [442] arXiv:2512.05478 [pdf, ps, other]
-
Title: EmoStyle: Emotion-Driven Image StylizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [443] arXiv:2512.05468 [pdf, ps, other]
-
Title: University Building Recognition Dataset in Thailand for the mission-oriented IoT sensor systemAuthors: Takara Taniguchi, Yudai Ueda, Atsuya Muramatsu, Kohki Hashimoto, Ryo Yagi, Hideya Ochiai, Chaodit AswakulSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [444] arXiv:2512.05446 [pdf, ps, other]
-
Title: TED-4DGS: Temporally Activated and Embedding-based Deformation for 4DGS CompressionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [445] arXiv:2512.05422 [pdf, ps, other]
-
Title: ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information InteractionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [446] arXiv:2512.05418 [pdf, ps, other]
-
Title: Performance Evaluation of Deep Learning for Tree Branch Segmentation in Autonomous Forestry SystemsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [447] arXiv:2512.05415 [pdf, ps, other]
-
Title: Moving object detection from multi-depth images with an attention-enhanced CNNAuthors: Masato Shibukawa, Fumi Yoshida, Toshifumi Yanagisawa, Takashi Ito, Hirohisa Kurosaki, Makoto Yoshikawa, Kohki Kamiya, Ji-an Jiang, Wesley Fraser, JJ Kavelaars, Susan Benecchi, Anne Verbiscer, Akira Hatakeyama, Hosei O, Naoya OzakiComments: 14 pages, 22 figures, submitted to PASJSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [448] arXiv:2512.05412 [pdf, ps, other]
-
Title: YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning ApplicationsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [449] arXiv:2512.05410 [pdf, ps, other]
-
Title: Genetic Algorithms For Parameter Optimization for Disparity Map Generation of Radiata Pine Branch ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [450] arXiv:2512.05398 [pdf, ps, other]
-
Title: The Dynamic Prior: Understanding 3D Structures for Casual Dynamic VideosComments: Code is available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [451] arXiv:2512.05394 [pdf, ps, other]
-
Title: Delving into Latent Spectral Biasing of Video VAEs for Superior DiffusabilitySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [452] arXiv:2512.05391 [pdf, ps, other]
-
Title: LoC-Path: Learning to Compress for Pathology Multimodal Large Language ModelsAuthors: Qingqiao Hu, Weimin Lyu, Meilong Xu, Kehan Qi, Xiaoling Hu, Saumya Gupta, Jiawei Zhou, Chao ChenComments: 20 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [453] arXiv:2512.05385 [pdf, ps, other]
-
Title: ShaRP: SHAllow-LayeR Pruning for Video Large Language Models AccelerationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [454] arXiv:2512.05362 [pdf, ps, other]
-
Title: PoolNet: Deep Learning for 2D to 3D Video Process ValidationComments: All code related to this paper can be found at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [455] arXiv:2512.05359 [pdf, ps, other]
-
Title: Group Orthogonal Low-Rank Adaptation for RGB-T TrackingComments: 13 pages, 8 figures. Accepted by AAAI 2026. Extended versionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [456] arXiv:2512.05354 [pdf, ps, other]
-
Title: SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time TrainingComments: project page this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [457] arXiv:2512.05343 [pdf, ps, other]
-
Title: SpaceControl: Introducing Test-Time Spatial Control to 3D Generative ModelingAuthors: Elisabetta Fedele, Francis Engelmann, Ian Huang, Or Litany, Marc Pollefeys, Leonidas GuibasComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [458] arXiv:2512.05277 [pdf, ps, other]
-
Title: From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language ModelAuthors: Kevin Cannons, Saeed Ranjbar Alvar, Mohammad Asiful Hossain, Ahmad Rezaei, Mohsen Gholami, Alireza Heidarikhazaei, Zhou Weimin, Yong Zhang, Mohammad AkbariSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [459] arXiv:2512.05272 [pdf, ps, other]
-
Title: Inferring Compositional 4D Scenes without Ever Seeing OneComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [460] arXiv:2512.05268 [pdf, ps, other]
-
Title: CARD: Correlation Aware Restoration with DiffusionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [461] arXiv:2512.05259 [pdf, ps, other]
-
Title: Age-Inclusive 3D Human Mesh Recovery for Action-Preserving Data AnonymizationAuthors: Georgios Chatzichristodoulou, Niki Efthymiou, Panagiotis Filntisis, Georgios Pavlakos, Petros MaragosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [462] arXiv:2512.05240 [pdf, ps, other]
-
Title: IE2Video: Adapting Pretrained Diffusion Models for Event-Based Video ReconstructionAuthors: Dmitrii Torbunov, Onur Okuducu, Yi Huang, Odera Dim, Rebecca Coles, Yonggang Cui, Yihui RenSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [463] arXiv:2512.05209 [pdf, ps, other]
-
Title: DEAR: Dataset for Evaluating the Aesthetics of RenderingAuthors: Vsevolod Plohotnuk, Artyom Panshin, Nikola Banić, Simone Bianco, Michael Freeman, Egor ErshovSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [464] arXiv:2512.05198 [pdf, ps, other]
-
Title: Your Latent Mask is Wrong: Pixel-Equivalent Latent Compositing for Diffusion ModelsComments: 16 pages, 10 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
- [465] arXiv:2512.05172 [pdf, ps, other]
-
Title: Semore: VLM-guided Enhanced Semantic Motion Representations for Visual Reinforcement LearningSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [466] arXiv:2512.05152 [pdf, ps, other]
-
Title: EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer ModelsComments: 6 pages, 5 figures, published at the 2025 IEEE International Conference on Multimedia and Expo (ICME), Nantes, France, 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [467] arXiv:2512.05150 [pdf, ps, other]
-
Title: TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial FlowsComments: arxiv v0Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [468] arXiv:2512.05145 [pdf, ps, other]
-
Title: Self-Improving VLM Judges Without Human AnnotationsAuthors: Inna Wanyin Lin, Yushi Hu, Shuyue Stella Li, Scott Geng, Pang Wei Koh, Luke Zettlemoyer, Tim Althoff, Marjan GhazvininejadSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [469] arXiv:2512.05140 [pdf, other]
-
Title: FlowEO: Generative Unsupervised Domain Adaptation for Earth ObservationAuthors: Georges Le Bellier (CEDRIC - VERTIGO, Cnam), Nicolas Audebert (LaSTIG, IGN, CEDRIC - VERTIGO)Comments: 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Mar 2026, Tucson (AZ), United StatesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [470] arXiv:2512.05139 [pdf, ps, other]
-
Title: Spatiotemporal Satellite Image Downscaling with Transfer Encoders and Autoregressive Generative ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
- [471] arXiv:2512.05137 [pdf, ps, other]
-
Title: ChromouVQA: Benchmarking Vision-Language Models under Chromatic Camouflaged ImagesAuthors: Yunfei Zhang, Yizhuo He, Yuanxun Shao, Zhengtao Yao, Haoyan Xu, Junhao Dong, Zhen Yao, Zhikang DongSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [472] arXiv:2512.05136 [pdf, ps, other]
-
Title: Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography OutcomesAuthors: Yujie Xiao, Gongzhen Tang, Deyun Zhang, Jun Li, Guangkun Nie, Haoyu Wang, Shun Huang, Tong Liu, Qinghao Zhao, Kangyin Chen, Shenda HongSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [473] arXiv:2512.05134 [pdf, ps, other]
-
Title: InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion ModelsAuthors: Zihao WuComments: 8 pages main, 8 pages appendix, 16 figures, 5 tables. Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
- [474] arXiv:2512.05132 [pdf, ps, other]
-
Title: Breaking Scale Anchoring: Frequency Representation Learning for Accurate High-Resolution Inference from Low-Resolution TrainingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [475] arXiv:2512.05131 [pdf, ps, other]
-
Title: AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language GuidanceComments: Under reviewSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [476] arXiv:2512.05959 (cross-list from cs.CL) [pdf, ps, other]
-
Title: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAGAuthors: David Anugraha, Patrick Amadeus Irawan, Anshul Singh, En-Shiun Annie Lee, Genta Indra WinataComments: PreprintSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [477] arXiv:2512.05955 (cross-list from cs.RO) [pdf, ps, other]
-
Title: SIMPACT: Simulation-Enabled Action Planning using Vision-Language ModelsSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [478] arXiv:2512.05932 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Physically-Based Simulation of Automotive LiDARSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [479] arXiv:2512.05824 (cross-list from cs.AI) [pdf, ps, other]
-
Title: Multimodal Oncology Agent for IDH1 Mutation Prediction in Low-Grade GliomaAuthors: Hafsa Akebli (1), Adam Shephard (2), Vincenzo Della Mea (1), Nasir Rajpoot (2 and 3) ((1) University of Udine, Udine, Italy, (2) University of Warwick, Coventry, UK, (3) Histofy Ltd, Coventry, UK)Comments: 4 pages, 2 figuresSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [480] arXiv:2512.05812 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Toward Efficient and Robust Behavior Models for Multi-Agent Driving SimulationComments: This work has been submitted to the IEEE for possible publicationSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [481] arXiv:2512.05665 (cross-list from cs.CL) [pdf, ps, other]
-
Title: Interleaved Latent Visual Reasoning with Selective Perceptual ModelingComments: 11 pages, 6 figures. Code available at this https URLSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [482] arXiv:2512.05438 (cross-list from cs.HC) [pdf, ps, other]
-
Title: EXR: An Interactive Immersive EHR Visualization in Extended RealityAuthors: Benoit Marteau, Shaun Q. Y. Tan, Jieru Li, Andrew Hornback, Yishan Zhong, Shaunna Wang, Christian Lowson, Jason Woloff, Joshua M. Pahys, Steven W. Hwang, Coleman Hilton, May D. WangComments: 11 pages, 6 figures. Preprint version. This paper has been accepted to IEEE ICIR 2025. This is the author-prepared version and not the final published version. The final version will appear in IEEE XploSubjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
- [483] arXiv:2512.05299 (cross-list from eess.SY) [pdf, ps, other]
-
Title: ARCAS: An Augmented Reality Collision Avoidance System with SLAM-Based Tracking for Enhancing VRU SafetyAuthors: Ahmad Yehia, Jiseop Byeon, Tianyi Wang, Huihai Wang, Yiming Xu, Junfeng Jiao, Christian ClaudelComments: 8 pages, 3 figures, 1 tableSubjects: Systems and Control (eess.SY); Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Robotics (cs.RO); Image and Video Processing (eess.IV)
- [484] arXiv:2512.05126 (cross-list from eess.AS) [pdf, ps, other]
-
Title: SyncVoice: Towards Video Dubbing with Vision-Augmented Pretrained TTS ModelAuthors: Kaidi Wang, Yi He, Wenhao Guan, Weijie Wu, Hongwu Ding, Xiong Zhang, Di Wu, Meng Meng, Jian Luan, Lin Li, Qingyang HongSubjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
Fri, 5 Dec 2025 (showing first 16 of 135 entries)
- [485] arXiv:2512.05115 [pdf, ps, other]
-
Title: Light-X: Generative 4D Video Rendering with Camera and Illumination ControlAuthors: Tianqi Liu, Zhaoxi Chen, Zihao Huang, Shaocong Xu, Saining Zhang, Chongjie Ye, Bohan Li, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei LiuComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [486] arXiv:2512.05113 [pdf, ps, other]
-
Title: Splannequin: Freezing Monocular Mannequin-Challenge Footage with Dual-Detection SplattingComments: WACV 2026. Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [487] arXiv:2512.05112 [pdf, ps, other]
-
Title: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept GenerationAuthors: Dongzhi Jiang, Renrui Zhang, Haodong Li, Zhuofan Zong, Ziyu Guo, Jun He, Claire Guo, Junyan Ye, Rongyao Fang, Weijia Li, Rui Liu, Hongsheng LiComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [488] arXiv:2512.05111 [pdf, ps, other]
-
Title: ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual ReasoningAuthors: Shengyuan Ding, Xinyu Fang, Ziyu Liu, Yuhang Zang, Yuhang Cao, Xiangyu Zhao, Haodong Duan, Xiaoyi Dong, Jianze Liang, Bin Wang, Conghui He, Dahua Lin, Jiaqi WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [489] arXiv:2512.05110 [pdf, ps, other]
-
Title: ShadowDraw: From Any Object to Shadow-Drawing Compositional ArtComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
- [490] arXiv:2512.05106 [pdf, ps, other]
-
Title: NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Robotics (cs.RO)
- [491] arXiv:2512.05104 [pdf, ps, other]
-
Title: EvoIR: Towards All-in-One Image Restoration via Evolutionary Frequency ModulationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [492] arXiv:2512.05098 [pdf, ps, other]
- [493] arXiv:2512.05091 [pdf, ps, other]
-
Title: Visual Reasoning Tracer: Object-Level Grounded Reasoning BenchmarkAuthors: Haobo Yuan, Yueyi Sun, Yanwei Li, Tao Zhang, Xueqing Deng, Henghui Ding, Lu Qi, Anran Wang, Xiangtai Li, Ming-Hsuan YangComments: Technical Report; Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [494] arXiv:2512.05081 [pdf, ps, other]
-
Title: Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative CompressionComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [495] arXiv:2512.05079 [pdf, ps, other]
-
Title: Object Reconstruction under Occlusion with Generative Priors and Contact-induced ConstraintsComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [496] arXiv:2512.05076 [pdf, ps, other]
-
Title: BulletTime: Decoupled Control of Time and Camera Pose for Video GenerationAuthors: Yiming Wang, Qihang Zhang, Shengqu Cai, Tong Wu, Jan Ackermann, Zhengfei Kuang, Yang Zheng, Frano Rajič, Siyu Tang, Gordon WetzsteinComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [497] arXiv:2512.05060 [pdf, ps, other]
-
Title: 4DLangVGGT: 4D Language-Visual Geometry Grounded TransformerAuthors: Xianfeng Wu, Yajing Bai, Minghan Li, Xianzu Wu, Xueqi Zhao, Zhongyuan Lai, Wenyu Liu, Xinggang WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [498] arXiv:2512.05044 [pdf, ps, other]
-
Title: Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single ImageComments: 18 PagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [499] arXiv:2512.05039 [pdf, ps, other]
-
Title: Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual EncodingComments: Submitted for review CVPR-2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [500] arXiv:2512.05025 [pdf, ps, other]
-
Title: RAMEN: Resolution-Adjustable Multimodal Encoder for Earth ObservationAuthors: Nicolas Houdré, Diego Marcos, Hugo Riffaud de Turckheim, Dino Ienco, Laurent Wendling, Camille Kurtz, Sylvain LobrySubjects: Computer Vision and Pattern Recognition (cs.CV)