Computer Vision and Pattern Recognition
Authors and titles for recent submissions
Wed, 10 Dec 2025
- [1] arXiv:2512.08931 [pdf, ps, other]
Title: Astra: General Interactive World Model with Autoregressive Denoising
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [2] arXiv:2512.08930 [pdf, ps, other]
Title: Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment
Authors: Youming Deng, Songyou Peng, Junyi Zhang, Kathryn Heal, Tiancheng Sun, John Flynn, Steve Marschner, Lucy Chai
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [3] arXiv:2512.08924 [pdf, ps, other]
Title: Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
Authors: Chuhan Zhang, Guillaume Le Moing, Skanda Koppula, Ignacio Rocco, Liliane Momeni, Junyu Xie, Shuyang Sun, Rahul Sukthankar, Joëlle K Barral, Raia Hadsell, Zoubin Ghahramani, Andrew Zisserman, Junlin Zhang, Mehdi SM Sajjadi
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [4] arXiv:2512.08922 [pdf, ps, other]
Title: Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration
Authors: Jin Hyeon Kim, Paul Hyunbin Cho, Claire Kim, Jaewon Min, Jaeeun Lee, Jihye Park, Yeji Choi, Seungryong Kim
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [5] arXiv:2512.08912 [pdf, ps, other]
Title: LiDAS: Lighting-driven Dynamic Active Sensing for Nighttime Perception
Comments: Preprint. 12 pages, 9 figures. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [6] arXiv:2512.08905 [pdf, ps, other]
Title: Self-Evolving 3D Scene Generation from a Single Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [7] arXiv:2512.08897 [pdf, ps, other]
Title: UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [8] arXiv:2512.08889 [pdf, ps, other]
Title: No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [9] arXiv:2512.08888 [pdf, ps, other]
Title: Accelerated Rotation-Invariant Convolution for UAV Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [10] arXiv:2512.08881 [pdf, ps, other]
Title: SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [11] arXiv:2512.08873 [pdf, ps, other]
Title: Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning
Comments: 6 pages
Journal-ref: 2024 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
- [12] arXiv:2512.08860 [pdf, ps, other]
Title: Tri-Bench: Stress-Testing VLM Reliability on Spatial Reasoning under Camera Tilt and Object Interference
Authors: Amit Bendkhale
Comments: 6 pages, 3 figures. Code and data: this https URL. Accepted to the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [13] arXiv:2512.08854 [pdf, ps, other]
Title: Generation is Required for Data-Efficient Perception
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [14] arXiv:2512.08829 [pdf, ps, other]
Title: InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [15] arXiv:2512.08820 [pdf, ps, other]
Title: Training-Free Dual Hyperbolic Adapters for Better Cross-Modal Reasoning
Authors: Yi Zhang, Chun-Wun Cheng, Junyi He, Ke Yu, Yushun Tang, Carola-Bibiane Schönlieb, Zhihai He, Angelica I. Aviles-Rivero
Comments: Accepted in IEEE Transactions on Multimedia (TMM)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [16] arXiv:2512.08789 [pdf, ps, other]
Title: MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance
Comments: 10 pages, 7 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [17] arXiv:2512.08785 [pdf, ps, other]
Title: LoFA: Learning to Predict Personalized Priors for Fast Adaptation of Visual Generative Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [18] arXiv:2512.08774 [pdf, ps, other]
Title: Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps
Comments: 10 pages, 9 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [19] arXiv:2512.08765 [pdf, ps, other]
Title: Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Authors: Ruihang Chu, Yefei He, Zhekai Chen, Shiwei Zhang, Xiaogang Xu, Bin Xia, Dingdong Wang, Hongwei Yi, Xihui Liu, Hengshuang Zhao, Yu Liu, Yingya Zhang, Yujiu Yang
Comments: NeurIPS 2025. Code and data available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [20] arXiv:2512.08751 [pdf, ps, other]
Title: Skewness-Guided Pruning of Multimodal Swin Transformers for Federated Skin Lesion Classification on Edge Devices
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
- [21] arXiv:2512.08747 [pdf, ps, other]
Title: A Scalable Pipeline Combining Procedural 3D Graphics and Guided Diffusion for Photorealistic Synthetic Training Data Generation in White Button Mushroom Segmentation
Comments: 20 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [22] arXiv:2512.08738 [pdf, ps, other]
Title: Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture
Comments: To appear at AACL-IJCNLP 2025 Workshop WSLP
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [23] arXiv:2512.08733 [pdf, ps, other]
Title: Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
Authors: Kuniko Paxton, Zeinab Dehghani, Koorosh Aslansefat, Dhavalkumar Thakker, Yiannis Papadopoulos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [24] arXiv:2512.08730 [pdf, ps, other]
Title: SegEarth-OV3: Exploring SAM 3 for Open-Vocabulary Semantic Segmentation in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [25] arXiv:2512.08700 [pdf, ps, other]
Title: Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth
Authors: Kyumin Hwang, Wonhyeok Choi, Kiljoon Han, Wonjoon Choi, Minwoo Choi, Yongcheon Na, Minwoo Park, Sunghoon Im
Comments: Accepted at IEEE Robotics and Automation Letters (RA-L) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [26] arXiv:2512.08697 [pdf, ps, other]
Title: What really matters for person re-identification? A Mixture-of-Experts Framework for Semantic Attribute Importance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [27] arXiv:2512.08673 [pdf, ps, other]
Title: Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds
Comments: 16 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [28] arXiv:2512.08648 [pdf, ps, other]
Title: Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
Authors: Shaofeng Zhang, Xuanqi Chen, Ning Liao, Haoxiang Zhao, Xiaoxing Wang, Haoru Tan, Sitong Wu, Xiaosong Jia, Qi Fan, Junchi Yan
Comments: 19 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [29] arXiv:2512.08647 [pdf, ps, other]
Title: C-DIRA: Computationally Efficient Dynamic ROI Routing and Domain-Invariant Adversarial Learning for Lightweight Driver Behavior Recognition
Authors: Keito Inoshita
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [30] arXiv:2512.08645 [pdf, ps, other]
Title: Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation
Comments: 19 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [31] arXiv:2512.08639 [pdf, ps, other]
Title: Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
Comments: Under Review, 12 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [32] arXiv:2512.08627 [pdf, ps, other]
Title: Trajectory Densification and Depth from Perspective-based Blur
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [33] arXiv:2512.08625 [pdf, ps, other]
Title: OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics
Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [34] arXiv:2512.08606 [pdf, ps, other]
Title: Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
Comments: 14 pages, 8 figures, Association for the Advancement of Artificial Intelligence (AAAI 2026, poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [35] arXiv:2512.08589 [pdf, ps, other]
Title: Automated Pollen Recognition in Optical and Holographic Microscopy Images
Authors: Swarn Singh Warshaneyan, Maksims Ivanovs, Blaž Cugmas, Inese Bērziņa, Laura Goldberga, Mindaugas Tamosiunas, Roberts Kadiķis
Comments: 8 pages, 10 figures, 4 tables, 20 references. Date of Conference: 13-14 June 2025; Date Added to IEEE Xplore: 10 July 2025; Electronic ISBN: 979-8-3315-0969-9; Print on Demand (PoD) ISBN: 979-8-3315-0970-5; DOI: 10.1109/AICCONF64766.2025.11064260; Conference Location: Prague, Czech Republic; Online Access: this https URL
Journal-ref: 2025 3rd Cognitive Models and Artificial Intelligence Conference (AICCONF), vol. 1, no. 1, pp. 1-8, Prague, Czech Republic, IEEE, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [36] arXiv:2512.08577 [pdf, ps, other]
Title: Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
- [37] arXiv:2512.08572 [pdf, ps, other]
Title: From Cells to Survival: Hierarchical Analysis of Cell Inter-Relations in Multiplex Microscopy for Lung Cancer Prognosis
Authors: Olle Edgren Schüllerqvist, Jens Baumann, Joakim Lindblad, Love Nordling, Artur Mezheyeuski, Patrick Micke, Nataša Sladoje
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [38] arXiv:2512.08569 [pdf, ps, other]
Title: Instance-Aware Test-Time Segmentation for Continual Domain Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [39] arXiv:2512.08564 [pdf, ps, other]
Title: Modular Neural Image Signal Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [40] arXiv:2512.08560 [pdf, ps, other]
Title: BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
Authors: Navve Wasserman, Matias Cosarinsky, Yuval Golbari, Aude Oliva, Antonio Torralba, Tamar Rott Shaham, Michal Irani
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [41] arXiv:2512.08557 [pdf, ps, other]
Title: SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds
Comments: 22 pages, 26 figures. This work has been submitted to the IEEE Sensors Journal for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [42] arXiv:2512.08547 [pdf, ps, other]
Title: An Iteration-Free Fixed-Point Estimator for Diffusion Inversion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [43] arXiv:2512.08542 [pdf, ps, other]
Title: A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
- [44] arXiv:2512.08537 [pdf, ps, other]
Title: Fast-ARDiff: An Entropy-informed Acceleration Framework for Continuous Space Autoregressive Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [45] arXiv:2512.08535 [pdf, ps, other]
Title: Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [46] arXiv:2512.08534 [pdf, ps, other]
Title: PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [47] arXiv:2512.08529 [pdf, ps, other]
Title: MVP: Multiple View Prediction Improves GUI Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [48] arXiv:2512.08524 [pdf, ps, other]
Title: Beyond Real Weights: Hypercomplex Representations for Stable Quantization
Authors: Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas, Muhammad Rafsan Kabir, Robin Krambroeckers, Sifat Momen, Nabeel Mohammed, Shafin Rahman
Comments: Accepted in Winter Conference on Applications of Computer Vision (WACV) 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [49] arXiv:2512.08511 [pdf, ps, other]
Title: Thinking with Images via Self-Calling Agent
Comments: Code would be released at this https URL soon
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [50] arXiv:2512.08506 [pdf, ps, other]
Title: OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [51] arXiv:2512.08505 [pdf, ps, other]
Title: Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [52] arXiv:2512.08503 [pdf, ps, other]
Title: Disrupting Hierarchical Reasoning: Adversarial Protection for Geographic Privacy in Multimodal Reasoning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [53] arXiv:2512.08498 [pdf, ps, other]
Title: On-the-fly Large-scale 3D Reconstruction from Multi-Camera Rigs
Authors: Yijia Guo, Tong Hu, Zhiwei Li, Liwen Hu, Keming Qian, Xitong Lin, Shengbo Chen, Tiejun Huang, Lei Ma
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [54] arXiv:2512.08486 [pdf, ps, other]
Title: Temporal Concept Dynamics in Diffusion Models via Prompt-Conditioned Interventions
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [55] arXiv:2512.08478 [pdf, ps, other]
Title: Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Authors: Yuning Gong, Yifei Liu, Yifan Zhan, Muyao Niu, Xueying Li, Yuanjun Liao, Jiaming Chen, Yuanyuan Gao, Jiaqi Chen, Minming Chen, Li Zhou, Yuning Zhang, Wei Wang, Xiaoqing Hou, Huaxi Huang, Shixiang Tang, Le Ma, Dingwen Zhang, Xue Yang, Junchi Yan, Yanchi Zhang, Yinqiang Zheng, Xiao Sun, Zhihang Zhong
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
- [56] arXiv:2512.08477 [pdf, ps, other]
Title: ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Consistent Attention
Authors: Huiguo He, Pengyu Yan, Ziqi Yi, Weizhi Zhong, Zheng Liu, Yejun Tang, Huan Yang, Kun Gai, Guanbin Li, Lianwen Jin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [57] arXiv:2512.08467 [pdf, ps, other]
Title: Team-Aware Football Player Tracking with SAM: An Appearance-Based Approach to Occlusion Recovery
Comments: 8 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [58] arXiv:2512.08445 [pdf, ps, other]
Title: Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [59] arXiv:2512.08441 [pdf, ps, other]
Title: Leveraging Multispectral Sensors for Color Correction in Mobile Cameras
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [60] arXiv:2512.08439 [pdf, ps, other]
Title: LapFM: A Laparoscopic Segmentation Foundation Model via Hierarchical Concept Evolving Pre-training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [61] arXiv:2512.08430 [pdf, ps, other]
Title: SDT-6D: Fully Sparse Depth-Transformer for Staged End-to-End 6D Pose Estimation in Industrial Multi-View Bin Picking
Comments: Accepted to WACV 2026. Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [62] arXiv:2512.08410 [pdf, ps, other]
Title: Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval
Authors: Tao Chen, Shaobo Ju, Qiong Wu, Chenxin Fang, Kun Zhang, Jun Peng, Hui Li, Yiyi Zhou, Rongrong Ji
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [63] arXiv:2512.08406 [pdf, ps, other]
Title: SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [64] arXiv:2512.08400 [pdf, ps, other]
Title: Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries
Comments: The paper has been accepted for publication at Northern Lights Deep Learning (NLDL) Conference 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [65] arXiv:2512.08397 [pdf, ps, other]
Title: Detection of Digital Facial Retouching utilizing Face Beauty Information
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [66] arXiv:2512.08378 [pdf, ps, other]
Title: Simultaneous Enhancement and Noise Suppression under Complex Illumination Conditions
Comments: The paper has been accepted and officially published by IEEE Transactions on Instrumentation and Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [67] arXiv:2512.08374 [pdf, ps, other]
Title: The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss
Authors: Bozhou Li, Xinda Xue, Sihan Yang, Yang Shi, Xinlong Chen, Yushuo Guan, Yuanxing Zhang, Wentao Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [68] arXiv:2512.08362 [pdf, ps, other]
Title: SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation
Comments: Accepted for main track at MobiSec 2024 (not published in the proceedings)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [69] arXiv:2512.08358 [pdf, ps, other]
Title: TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
Authors: Jiahao Lu, Weitao Xiong, Jiacheng Deng, Peng Li, Tianyu Huang, Zhiyang Dou, Cheng Lin, Sai-Kit Yeung, Yuan Liu
Comments: Accepted by NeurIPS 2025. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [70] arXiv:2512.08337 [pdf, ps, other]
Title: DINO-BOLDNet: A DINOv3-Guided Multi-Slice Attention Network for T1-to-BOLD Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [71] arXiv:2512.08334 [pdf, ps, other]
Title: HybridSplat: Fast Reflection-baked Gaussian Tracing using Hybrid Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [72] arXiv:2512.08331 [pdf, ps, other]
Title: Bi^2MAC: Bimodal Bi-Adaptive Mask-Aware Convolution for Remote Sensing Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [73] arXiv:2512.08330 [pdf, ps, other]
Title: PointDico: Contrastive 3D Representation Learning Guided by Diffusion Models
Comments: Accepted by IJCNN 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [74] arXiv:2512.08329 [pdf, ps, other]
Title: Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models
Comments: 32 pages, 17 figures, 1 table, 5 algorithms, preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [75] arXiv:2512.08327 [pdf, ps, other]
Title: Low Rank Support Quaternion Matrix Machine
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
- [76] arXiv:2512.08325 [pdf, ps, other]
Title: GeoDiffMM: Geometry-Guided Conditional Diffusion for Motion Magnification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [77] arXiv:2512.08323 [pdf, ps, other]
Title: Detecting Dental Landmarks from Intraoral 3D Scans: the 3DTeethLand challenge
Authors: Achraf Ben-Hamadou, Nour Neifar, Ahmed Rekik, Oussama Smaoui, Firas Bouzguenda, Sergi Pujades, Niels van Nistelrooij, Shankeeth Vinayahalingam, Kaibo Shi, Hairong Jin, Youyi Zheng, Tibor Kubík, Oldřich Kodym, Petr Šilling, Kateřina Trávníčková, Tomáš Mojžiš, Jan Matula, Jeffry Hartanto, Xiaoying Zhu, Kim-Ngan Nguyen, Tudor Dascalu, Huikai Wu, Weijie Liu, Shaojie Zhuang, Guangshun Wei, Yuanfeng Zhou
Comments: MICCAI 2024, 3DTeethLand, Challenge report, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [78] arXiv:2512.08317 [pdf, ps, other]
Title: GeoDM: Geometry-aware Distribution Matching for Dataset Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [79] arXiv:2512.08309 [pdf, ps, other]
Title: Terrain Diffusion: A Diffusion-Based Successor to Perlin Noise in Infinite, Real-Time Terrain Generation
Authors: Alexander Goslin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
- [80] arXiv:2512.08294 [pdf, ps, other]
Title: OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
Authors: Yexin Liu, Manyuan Zhang, Yueze Wang, Hongyu Li, Dian Zheng, Weiming Zhang, Changsheng Lu, Xunliang Cai, Yan Feng, Peng Pei, Harry Yang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [81] arXiv:2512.08282 [pdf, ps, other]
Title: PAVAS: Physics-Aware Video-to-Audio Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD)
- [82] arXiv:2512.08269 [pdf, ps, other]
Title: EgoX: Egocentric Video Generation from a Single Exocentric Video
Comments: 21 pages, project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [83] arXiv:2512.08262 [pdf, ps, other]
Title: RLCNet: An end-to-end deep learning framework for simultaneous online calibration of LiDAR, RADAR, and Camera
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [84] arXiv:2512.08254 [pdf, ps, other]
Title: SFP: Real-World Scene Recovery Using Spatial and Frequency Priors
Comments: 10 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [85] arXiv:2512.08253 [pdf, ps, other]
Title: Query-aware Hub Prototype Learning for Few-Shot 3D Point Cloud Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [86] arXiv:2512.08247 [pdf, ps, other]
Title: Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection
Comments: AAAI-26
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [87] arXiv:2512.08243 [pdf, ps, other]
Title: Residual-SwinCA-Net: A Channel-Aware Integrated Residual CNN-Swin Transformer for Malignant Lesion Segmentation in BUSI
Authors: Saeeda Naz, Saddam Hussain Khan (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences (UEAS), Swat, Pakistan)
Comments: 26 Pages, 10 Figures, 4 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [88] arXiv:2512.08240 [pdf, ps, other]
Title: HybridToken-VLM: Hybrid Token Compression for Vision-Language Models
Authors: Jusheng Zhang, Xiaoyang Guo, Kaitong Cai, Qinhan Lv, Yijia Fan, Wenhao Chai, Jian Wang, Keze Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [89] arXiv:2512.08237 [pdf, ps, other]
Title: FastBEV++: Fast by Algorithm, Deployable by Design
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [90] arXiv:2512.08229 [pdf, ps, other]
Title: Geometry-Aware Sparse Depth Sampling for High-Fidelity RGB-D Depth Completion in Robotic Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [91] arXiv:2512.08228 [pdf, ps, other]
Title: MM-CoT: A Benchmark for Probing Visual Chain-of-Thought Reasoning in Multimodal Models
Authors: Jusheng Zhang, Kaitong Cai, Xiaoyang Guo, Sidi Liu, Qinhan Lv, Ruiqi Chen, Jing Yang, Yijia Fan, Xiaofei Sun, Jian Wang, Ziliang Chen, Liang Lin, Keze Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [92] arXiv:2512.08227 [pdf, ps, other]
Title: New VVC profiles targeting Feature Coding for Machines
Comments: Accepted for presentation at ICIP 2025 workshop on Coding for Machines
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [93] arXiv:2512.08223 [pdf, ps, other]
Title: SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [94] arXiv:2512.08221 [pdf, ps, other]
Title: VisKnow: Constructing Visual Knowledge Base for Object Understanding
Comments: 16 pages, 12 figures, 7 tables. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [95] arXiv:2512.08215 [pdf, ps, other]
Title: Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [96] arXiv:2512.08198 [pdf, ps, other]
Title: Animal Re-Identification on Microcontrollers
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [97] arXiv:2512.08180 [pdf, ps, other]
Title: GeoLoom: High-quality Geometric Diagram Generation from Textual Input
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [98] arXiv:2512.08163 [pdf, ps, other]
Title: Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators
Comments: 22 pages, 12 figures, 1 table
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [99] arXiv:2512.08161 [pdf, ps, other]
Title: Fourier-RWKV: A Multi-State Perception Network for Efficient Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [100] arXiv:2512.08135 [pdf, ps, other]
Title: CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [101] arXiv:2512.08075 [pdf, ps, other]
Title: Identification of Deforestation Areas in the Amazon Rainforest Using Change Detection Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [102] arXiv:2512.08048 [pdf, ps, other]
Title: Mask to Adapt: Simple Random Masking Enables Robust Continual Test-Time Learning
Authors: Chandler Timm C. Doloriel
Comments: ongoing work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [103] arXiv:2512.08042 [pdf, ps, other]
Title: Towards Sustainable Universal Deepfake Detection with Frequency-Domain Masking
Authors: Chandler Timm C. Doloriel, Habib Ullah, Kristian Hovde Liland, Fadi Al Machot, Ngai-Man Cheung
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [104] arXiv:2512.08040 [pdf, ps, other]
Title: Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [105] arXiv:2512.08038 [pdf, ps, other]
Title: SSplain: Sparse and Smooth Explainer for Retinopathy of Prematurity Classification
Authors: Elifnur Sunger, Tales Imbiriba, Peter Campbell, Deniz Erdogmus, Stratis Ioannidis, Jennifer Dy
Comments: 20 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [106] arXiv:2512.08016 [pdf, ps, other]
Title: FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models
Authors: Jiyoon Pyo, Yuankun Jiao, Dongwon Jung, Zekun Li, Leeje Jang, Sofia Kirsanova, Jina Kim, Yijun Lin, Qin Liu, Junyi Xie, Hadi Askari, Nan Xu, Muhao Chen, Yao-Yi Chiang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [107] arXiv:2512.07984 [pdf, ps, other]
Title: Restrictive Hierarchical Semantic Segmentation for Stratified Tooth Layer Detection
Comments: 13 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [108] arXiv:2512.07951 [pdf, ps, other]
Title: Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Authors: Zekai Luo, Zongze Du, Zhouhang Zhu, Hao Zhong, Muzhi Zhu, Wen Wang, Yuling Xi, Chenchen Jing, Hao Chen, Chunhua Shen
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [109] arXiv:2512.07925 [pdf, ps, other]
Title: Near-real time fires detection using satellite imagery in Sudan conflict
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [110] arXiv:2512.07838 [pdf, ps, other]
Title: Detection of Cyberbullying in GIF using AI
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
- [111] arXiv:2512.08715 (cross-list from cs.PF) [pdf, ps, other]
Title: Multi-domain performance analysis with scores tailored to user preferences
Subjects: Performance (cs.PF); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [112] arXiv:2512.08629 (cross-list from cs.AI) [pdf, ps, other]
Title: See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
- [113] arXiv:2512.08545 (cross-list from cs.CL) [pdf, ps, other]
Title: Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
Comments: 22 pages, 2 tables, 9 figures
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
- [114] arXiv:2512.08500 (cross-list from cs.GR) [pdf, ps, other]
Title: Learning to Control Physically-simulated 3D Characters via Generating and Mimicking 2D Motions
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [115] arXiv:2512.08360 (cross-list from cs.NE) [pdf, ps, other]
Title: Conditional Morphogenesis: Emergent Generation of Structural Digits via Neural Cellular Automata
Authors: Ali Sakour
Comments: 13 pages, 5 figures. Code available at: this https URL
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [116] arXiv:2512.08284 (cross-list from physics.geo-ph) [pdf, ps, other]
Title: Self-Reinforced Deep Priors for Reparameterized Full Waveform Inversion
Comments: Submitted to GEOPHYSICS
Subjects: Geophysics (physics.geo-ph); Computer Vision and Pattern Recognition (cs.CV)
- [117] arXiv:2512.08271 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Zero-Splat TeleAssist: A Zero-Shot Pose Estimation Framework for Semantic TeleoperationComments: Published and Presented at 3rd Workshop on Human-Centric Multilateral Teleoperation in ICRA 2025Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [118] arXiv:2512.08216 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Tumor-anchored deep feature random forests for out-of-distribution detection in lung cancer segmentationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [119] arXiv:2512.08188 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World ModelAuthors: Wenjiang Xu, Cindy Wang, Rui Fang, Mingkang Zhang, Lusong Li, Jing Xu, Jiayuan Gu, Zecui Zeng, Rui ChenComments: Website at this https URLSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [120] arXiv:2512.08170 (cross-list from cs.RO) [pdf, ps, other]
-
Title: RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric FeaturesSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [121] arXiv:2512.08153 (cross-list from cs.LG) [pdf, ps, other]
-
Title: TreeGRPO: Tree-Advantage GRPO for Online RL Post-Training of Diffusion ModelsSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [122] arXiv:2512.08125 (cross-list from eess.IV) [pdf, ps, other]
-
Title: FlowSteer: Conditioning Flow Field for Consistent Image RestorationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [123] arXiv:2512.08099 (cross-list from math.NA) [pdf, ps, other]
-
Title: Generalizations of the Normalized Radon Cumulative Distribution Transform for Limited Data RecognitionSubjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
- [124] arXiv:2512.08029 (cross-list from cs.LG) [pdf, ps, other]
-
Title: CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent SpaceSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [125] arXiv:2512.07998 (cross-list from cs.RO) [pdf, ps, other]
-
Title: DIJIT: A Robotic Head for an Active ObserverAuthors: Mostafa Kamali Tabrizi, Mingshi Chi, Bir Bikram Dey, Yu Qing Yuan, Markus D. Solbach, Yiqian Liu, Michael Jenkin, John K. TsotsosSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [126] arXiv:2512.07981 (cross-list from cs.LG) [pdf, ps, other]
-
Title: CIP-Net: Continual Interpretable Prototype-based NetworkSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [127] arXiv:2512.07976 (cross-list from cs.RO) [pdf, ps, other]
-
Title: VLD: Visual Language Goal Distance for Reinforcement Learning NavigationSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [128] arXiv:2512.07969 (cross-list from cs.RO) [pdf, ps, other]
-
Title: Sparse Variable Projection in Robotic Perception: Exploiting Separable Structure for Efficient Nonlinear OptimizationComments: 8 pages, submitted for reviewSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [129] arXiv:2512.07884 (cross-list from cs.LG) [pdf, ps, other]
-
Title: GSPN-2: Efficient Parallel Sequence ModelingAuthors: Hongjun Wang, Yitong Jiang, Collin McCarthy, David Wehr, Hanrong Ye, Xinhao Li, Ka Chun Cheung, Wonmin Byeon, Jinwei Gu, Ke Chen, Kai Han, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Sifei LiuComments: NeurIPS 2025Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [130] arXiv:2512.07855 (cross-list from cs.LG) [pdf, ps, other]
-
Title: LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer ModelSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [131] arXiv:2512.05791 (cross-list from physics.med-ph) [pdf, ps, other]
-
Title: Fast and Robust Diffusion Posterior Sampling for MR Image Reconstruction Using the Preconditioned Unadjusted Langevin AlgorithmComments: Submitted to Magnetic Resonance in MedicineSubjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Probability (math.PR)
Tue, 9 Dec 2025 (showing first 119 of 259 entries)
- [132] arXiv:2512.07834 [pdf, ps, other]
-
Title: Voxify3D: Pixel Art Meets Volumetric RenderingComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [133] arXiv:2512.07833 [pdf, ps, other]
-
Title: Relational Visual SimilarityAuthors: Thao Nguyen, Sicheng Mo, Krishna Kumar Singh, Yilin Wang, Jing Shi, Nicholas Kolkin, Eli Shechtman, Yong Jae Lee, Yuheng LiComments: Project page, data, and code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [134] arXiv:2512.07831 [pdf, ps, other]
-
Title: UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video GenerationAuthors: Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya JiaComments: Project Website this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [135] arXiv:2512.07829 [pdf, ps, other]
-
Title: One Layer Is Enough: Adapting Pretrained Visual Encoders for Image GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [136] arXiv:2512.07826 [pdf, ps, other]
-
Title: OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video EditingAuthors: Haoyang He, Jie Wang, Jiangning Zhang, Zhucun Xue, Xingyuan Bu, Qiangpeng Yang, Shilei Wen, Lei XieComments: 38 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [137] arXiv:2512.07821 [pdf, ps, other]
-
Title: WorldReel: 4D Video Generation with Consistent Geometry and Motion ModelingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [138] arXiv:2512.07807 [pdf, ps, other]
-
Title: Lang3D-XL: Language Embedded 3D Gaussians for Large-scale ScenesComments: Accepted to SIGGRAPH Asia 2025. Project webpage: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [139] arXiv:2512.07806 [pdf, ps, other]
-
Title: Multi-view Pyramid Transformer: Look Coarser to See BroaderComments: Project page: see this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [140] arXiv:2512.07802 [pdf, ps, other]
-
Title: OneStory: Coherent Multi-Shot Video Generation with Adaptive MemoryAuthors: Zhaochong An, Menglin Jia, Haonan Qiu, Zijian Zhou, Xiaoke Huang, Zhiheng Liu, Weiming Ren, Kumara Kahatapitiya, Ding Liu, Sen He, Chenyang Zhang, Tao Xiang, Fanny Yang, Serge Belongie, Tian XieComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [141] arXiv:2512.07778 [pdf, ps, other]
-
Title: Distribution Matching Variational AutoEncoderSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [142] arXiv:2512.07776 [pdf, ps, other]
-
Title: GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population MonitoringAuthors: Maximilian Schall, Felix Leonard Knöfel, Noah Elias König, Jan Jonas Kubeler, Maximilian von Klinski, Joan Wilhelm Linnemann, Xiaoshi Liu, Iven Jelle Schlegelmilch, Ole Woyciniuk, Alexandra Schild, Dante Wasmuht, Magdalena Bermejo Espinet, German Illera Basas, Gerard de MeloComments: Accepted at WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [143] arXiv:2512.07760 [pdf, ps, other]
-
Title: Modality-Aware Bias Mitigation and Invariance Learning for Unsupervised Visible-Infrared Person Re-IdentificationComments: Accepted to AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [144] arXiv:2512.07756 [pdf, ps, other]
-
Title: UltrasODM: A Dual Stream Optical Flow Mamba Network for 3D Freehand Ultrasound ReconstructionSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [145] arXiv:2512.07747 [pdf, ps, other]
-
Title: Unison: A Fully Automatic, Task-Universal, and Low-Cost Framework for Unified Understanding and GenerationAuthors: Shihao Zhao, Yitong Chen, Zeyinzi Jiang, Bojia Zi, Shaozhe Hao, Yu Liu, Chaojie Mao, Kwan-Yee K. WongSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [146] arXiv:2512.07745 [pdf, ps, other]
-
Title: DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous DrivingAuthors: Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [147] arXiv:2512.07738 [pdf, ps, other]
-
Title: HLTCOE Evaluation Team at TREC 2025: VQA TrackAuthors: Dengjia Zhang, Charles Weng, Katherine Guerrerio, Yi Lu, Kenton Murray, Alexander Martin, Reno Kriz, Benjamin Van DurmeComments: 7 pages, 1 figureSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [148] arXiv:2512.07733 [pdf, ps, other]
-
Title: SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental ImagerySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [149] arXiv:2512.07730 [pdf, ps, other]
-
Title: SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object HallucinationComments: WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [150] arXiv:2512.07729 [pdf, ps, other]
-
Title: Improving action classification with brain-inspired deep networksSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [151] arXiv:2512.07720 [pdf, ps, other]
-
Title: ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar CreationAuthors: Fan Yang, Heyuan Li, Peihao Li, Weihao Yuan, Lingteng Qiu, Chaoyue Song, Cheng Chen, Yisheng He, Shifeng Zhang, Xiaoguang Han, Steven Hoi, Guosheng LinComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [152] arXiv:2512.07712 [pdf, ps, other]
-
Title: UnCageNet: Tracking and Pose Estimation of Caged AnimalComments: 9 pages, 2 figures, 2 tables. Accepted to the Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP 2025), Mandi, IndiaSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [153] arXiv:2512.07703 [pdf, ps, other]
-
Title: PVeRA: Probabilistic Vector-Based Random Matrix AdaptationAuthors: Leo Fillioux, Enzo Ferrante, Paul-Henry Cournède, Maria Vakalopoulou, Stergios ChristodoulidisSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [154] arXiv:2512.07702 [pdf, ps, other]
-
Title: Guiding What Not to Generate: Automated Negative Prompting for Text-Image AlignmentComments: WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [155] arXiv:2512.07698 [pdf, ps, other]
-
Title: sim2art: Accurate Articulated Object Modeling from a Single Video using Synthetic Training Data OnlySubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [156] arXiv:2512.07674 [pdf, ps, other]
-
Title: DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast RepresentationsAuthors: Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge CardosoSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [157] arXiv:2512.07668 [pdf, ps, other]
-
Title: EgoCampus: Egocentric Pedestrian Eye Gaze Model and DatasetSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [158] arXiv:2512.07661 [pdf, ps, other]
-
Title: Optimization-Guided Diffusion for Interactive Scene GenerationAuthors: Shiaho Li, Naisheng Ye, Tianyu Li, Kashyap Chitta, Tuo An, Peng Su, Boyang Wang, Haiou Liu, Chen Lv, Hongyang LiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [159] arXiv:2512.07652 [pdf, ps, other]
-
Title: An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific ResearchSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [160] arXiv:2512.07651 [pdf, ps, other]
-
Title: Liver Fibrosis Quantification and Analysis: The LiQA Dataset and Baseline MethodSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [161] arXiv:2512.07628 [pdf, ps, other]
-
Title: MoCA: Mixture-of-Components Attention for Scalable Compositional 3D GenerationAuthors: Zhiqi Li, Wenhuan Li, Tengfei Wang, Zhenwei Wang, Junta Wu, Haoyuan Wang, Yunhan Yang, Zehuan Huang, Yang Li, Peidong Liu, Chunchao GuoSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [162] arXiv:2512.07606 [pdf, ps, other]
-
Title: Decomposition Sampling for Efficient Region Annotations in Active LearningAuthors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina BreiningerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [163] arXiv:2512.07599 [pdf, ps, other]
-
Title: Online Segment Any 3D Thing as Instance TrackingComments: NeurIPS 2025, Code is at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [164] arXiv:2512.07596 [pdf, ps, other]
-
Title: More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic SurgeryAuthors: Wenzhen Dong, Jieming Yu, Yiming Huang, Hongqiu Wang, Lei Zhu, Albert C. S. Chung, Hongliang Ren, Long BaiComments: Technical ReportSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [165] arXiv:2512.07590 [pdf, ps, other]
-
Title: Robust Variational Model Based Tailored UNet: Leveraging Edge Detector and Mean Curvature for Improved Image SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [166] arXiv:2512.07584 [pdf, ps, other]
-
Title: LongCat-Image Technical ReportAuthors: Meituan LongCat Team: Hanghang Ma, Haoxian Tan, Jiale Huang, Junqiang Wu, Jun-Yan He, Lishuai Gao, Songlin Xiao, Xiaoming Wei, Xiaoqi Ma, Xunliang Cai, Yayong Guan, Jie HuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [167] arXiv:2512.07580 [pdf, ps, other]
-
Title: All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMsAuthors: Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Xianfeng Tang, Hui Liu, Yuyin Zhou, Lianghua HeSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [168] arXiv:2512.07568 [pdf, ps, other]
-
Title: Dual-Stream Cross-Modal Representation Learning via Residual Semantic DecorrelationAuthors: Xuecheng Li, Weikuan Jia, Alisher Kurbonaliev, Qurbonaliev Alisher, Khudzhamkulov Rustam, Ismoilov Shuhratjon, Eshmatov Javhariddin, Yuanjie ZhengSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [169] arXiv:2512.07564 [pdf, ps, other]
-
Title: Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language ModelsComments: 24 pages, 3 figures, 2 tables. Training-free self-correction framework for vision-language models. Code and implementation details will be released at: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [170] arXiv:2512.07527 [pdf, ps, other]
-
Title: From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite ImagesAuthors: Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan ChenSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [171] arXiv:2512.07514 [pdf, ps, other]
-
Title: MeshRipple: Structured Autoregressive Generation of Artist-MeshesAuthors: Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, JiaYi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Matthias Nießner, Wei YangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [172] arXiv:2512.07504 [pdf, ps, other]
-
Title: ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing PointsComments: Accepted to WACV 2026, 8 pages, supplementary included. Dataset and code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [173] arXiv:2512.07503 [pdf, ps, other]
-
Title: SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image GenerationAuthors: Yao Teng, Zhihuan Jiang, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui LiuSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [174] arXiv:2512.07500 [pdf, ps, other]
-
Title: MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [175] arXiv:2512.07498 [pdf, ps, other]
-
Title: Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral PriorComments: 16 pages (including appendix)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [176] arXiv:2512.07480 [pdf, ps, other]
-
Title: Single-step Diffusion-based Video Coding with Semantic-Temporal GuidanceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [177] arXiv:2512.07469 [pdf, ps, other]
-
Title: Unified Video Editing with Temporal ReasonerComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [178] arXiv:2512.07426 [pdf, ps, other]
-
Title: When normalization hallucinates: unseen risks in AI-powered whole slide image processingAuthors: Karel Moens, Matthew B. Blaschko, Tinne Tuytelaars, Bart Diricx, Jonas De Vylder, Mustafa YousifComments: 4 pages, accepted for oral presentation at SPIE Medical Imaging, 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [179] arXiv:2512.07415 [pdf, ps, other]
-
Title: Data-driven Exploration of Mobility Interaction PatternsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [180] arXiv:2512.07410 [pdf, ps, other]
-
Title: InterAgent: Physics-based Multi-agent Command Execution via Diffusion on Interaction GraphsAuthors: Bin Li, Ruichi Zhang, Han Liang, Jingyan Zhang, Juze Zhang, Xin Chen, Lan Xu, Jingyi Yu, Jingya WangComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [181] arXiv:2512.07394 [pdf, ps, other]
-
Title: Reconstructing Objects along Hand Interaction Timelines in Egocentric VideoComments: webpage: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [182] arXiv:2512.07391 [pdf, ps, other]
-
Title: GlimmerNet: A Lightweight Grouped Dilated Depthwise Convolutions for UAV-Based Emergency MonitoringAuthors: Đorđe NedeljkovićSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [183] arXiv:2512.07385 [pdf, ps, other]
-
Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New BaselineAuthors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng WangComments: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [184] arXiv:2512.07383 [pdf, ps, other]
-
Title: LogicCBMs: Logic-Enhanced Concept-Based LearningComments: 18 pages, 19 figures, WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [185] arXiv:2512.07381 [pdf, ps, other]
-
Title: Tessellation GS: Neural Mesh Gaussians for Robust Monocular Reconstruction of Dynamic ObjectsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [186] arXiv:2512.07379 [pdf, ps, other]
-
Title: Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and EfficiencyComments: 22 pages, 16 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [187] arXiv:2512.07360 [pdf, ps, other]
-
Title: Structure-Aware Feature Rectification with Region Adjacency Graphs for Training-Free Open-Vocabulary Semantic SegmentationComments: Accepted to WACV 2026Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [188] arXiv:2512.07351 [pdf, ps, other]
-
Title: DeepAgent: A Dual Stream Multi Agent Fusion for Robust Multimodal Deepfake DetectionAuthors: Sayeem Been Zaman, Wasimul Karim, Arefin Ittesafun Abian, Reem E. Mohamed, Md Rafiqul Islam, Asif Karim, Sami AzamSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD)
- [189] arXiv:2512.07348 [pdf, ps, other]
-
Title: MICo-150K: A Comprehensive Dataset Advancing Multi-Image CompositionAuthors: Xinyu Wei, Kangrui Cen, Hongyang Wei, Zhen Guo, Bairui Li, Zeqing Wang, Jinrui Zhang, Lei ZhangComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [190] arXiv:2512.07345 [pdf, ps, other]
-
Title: Debiasing Diffusion Priors via 3D Attention for Consistent Gaussian SplattingComments: 15 pages, 8 figures, 5 tables, 2 algorithms, Accepted by AAAI 2026Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [191] arXiv:2512.07338 [pdf, ps, other]
-
Title: Generalized Referring Expression Segmentation on Aerial PhotosComments: Submitted to IEEE J-STARSSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [192] arXiv:2512.07331 [pdf, ps, other]
-
Title: The Inductive Bottleneck: Data-Driven Emergence of Representational Sparsity in Vision TransformersAuthors: Kanishk AwadhiyaSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [193] arXiv:2512.07328 [pdf, ps, other]
-
Title: ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [194] arXiv:2512.07305 [pdf, ps, other]
-
Title: Reevaluating Automated Wildlife Species Detection: A Reproducibility Study on a Custom Image DatasetAuthors: Tobias Abraham HaiderSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [195] arXiv:2512.07302 [pdf, ps, other]
-
Title: Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task PromptsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [196] arXiv:2512.07276 [pdf, ps, other]
-
Title: Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial ImageryComments: Accepted to WACV 2026. Camera-ready-based version with minor edits for readability (no change in the contents)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [197] arXiv:2512.07275 [pdf, ps, other]
- [198] arXiv:2512.07273 [pdf, ps, other]
-
Title: RVLF: A Reinforcing Vision-Language Framework for Gloss-Free Sign Language TranslationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [199] arXiv:2512.07269 [pdf, ps, other]
-
Title: A graph generation pipeline for critical infrastructures based on heuristics, images and depth dataSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [200] arXiv:2512.07253 [pdf, ps, other]
-
Title: DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video EnhancementComments: 18 pages, 8 figures, and 7 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [201] arXiv:2512.07251 [pdf, ps, other]
-
Title: See More, Change Less: Anatomy-Aware Diffusion for Contrast EnhancementAuthors: Junqi Liu, Zejun Wu, Pedro R. A. S. Bassi, Xinze Zhou, Wenxuan Li, Ibrahim E. Hamamci, Sezgin Er, Tianyu Lin, Yi Luo, Szymon Płotka, Bjoern Menze, Daguang Xu, Kai Ding, Kang Wang, Yang Yang, Yucheng Tang, Alan L. Yuille, Zongwei ZhouSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [202] arXiv:2512.07247 [pdf, ps, other]
-
Title: AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven EditingComments: 40 pages, 34 figures, 18 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [203] arXiv:2512.07245 [pdf, ps, other]
-
Title: Zero-Shot Textual Explanations via Translating Decision-Critical FeaturesComments: 11+6 pages, 8 figures, 4 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [204] arXiv:2512.07241 [pdf, ps, other]
-
Title: Squeezed-Eff-Net: Edge-Computed Boost of Tomography Based Brain Tumor Classification leveraging Hybrid Neural Network ArchitectureAuthors: Md. Srabon Chowdhury, Syeda Fahmida Tanzim, Sheekar Banerjee, Ishtiak Al Mamoon, AKM Muzahidul IslamSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [205] arXiv:2512.07237 [pdf, ps, other]
-
Title: Unified Camera Positional Encoding for Controlled Video GenerationAuthors: Cheng Zhang, Boying Li, Meng Wei, Yan-Pei Cao, Camilo Cruz Gambardella, Dinh Phung, Jianfei CaiComments: Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [206] arXiv:2512.07234 [pdf, ps, other]
-
Title: Dropout Prompt Learning: Towards Robust and Adaptive Vision-Language ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [207] arXiv:2512.07230 [pdf, ps, other]
-
Title: STRinGS: Selective Text Refinement in Gaussian SplattingAuthors: Abhinav Raundhal, Gaurav Behera, P J Narayanan, Ravi Kiran Sarvadevabhatla, Makarand TapaswiComments: Accepted to WACV 2026. Project Page, see this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [208] arXiv:2512.07229 [pdf, ps, other]
-
Title: ReLKD: Inter-Class Relation Learning with Knowledge Distillation for Generalized Category DiscoveryComments: Accepted to the Main Track of the 28th European Conference on Artificial Intelligence (ECAI 2025). To appear in the proceedings published by IOS Press (DOI: 10.3233/FAIA413)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [209] arXiv:2512.07228 [pdf, ps, other]
-
Title: Towards Robust Protective Perturbation against DeepFake Face SwappingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [210] arXiv:2512.07215 [pdf, ps, other]
-
Title: VFM-VLM: Vision Foundation Model and Vision Language Model based Visual Comparison for 3D Pose EstimationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [211] arXiv:2512.07211 [pdf, ps, other]
-
Title: Object Pose Distribution Estimation for Determining Revolution and Reflection Uncertainty in Point CloudsComments: 8 pages, 8 figures, 5 tables, ICCR 2025Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [212] arXiv:2512.07206 [pdf, ps, other]
-
Title: AutoLugano: A Deep Learning Framework for Fully Automated Lymphoma Segmentation and Lugano Staging on FDG-PET/CTAuthors: Boyang Pan, Zeyu Zhang, Hongyu Meng, Bin Cui, Yingying Zhang, Wenli Hou, Junhao Li, Langdi Zhong, Xiaoxiao Chen, Xiaoyu Xu, Changjin Zuo, Chao Cheng, Nan-Jie GongSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [213] arXiv:2512.07203 [pdf, ps, other]
-
Title: MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent ReasoningComments: 7 pages, 1 figureSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [214] arXiv:2512.07201 [pdf, ps, other]
-
Title: Understanding Diffusion Models via Code ExecutionAuthors: Cheng YuSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [215] arXiv:2512.07198 [pdf, ps, other]
-
Title: Generating Storytelling Images with Rich Chains-of-ReasoningSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [216] arXiv:2512.07197 [pdf, ps, other]
-
Title: SUCCESS-GS: Survey of Compactness and Compression for Efficient Static and Dynamic Gaussian SplattingComments: The first three authors contributed equally to this work. The last two authors are co-corresponding authors. Please visit our project page at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [217] arXiv:2512.07192 [pdf, ps, other]
-
Title: HVQ-CGIC: Enabling Hyperprior Entropy Modeling for VQ-Based Controllable Generative Image CompressionComments: 12 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [218] arXiv:2512.07191 [pdf, ps, other]
-
Title: RefLSM: Linearized Structural-Prior Reflectance Model for Medical Image Segmentation and Bias-Field CorrectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [219] arXiv:2512.07190 [pdf, ps, other]
-
Title: Integrating Multi-scale and Multi-filtration Topological Features for Medical Image ClassificationAuthors: Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan (DK) Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. ChenSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [220] arXiv:2512.07186 [pdf, ps, other]
-
Title: START: Spatial and Textual Learning for Chart UnderstandingComments: WACV 2026 Camera ReadySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [221] arXiv:2512.07171 [pdf, ps, other]
-
Title: TIDE: Two-Stage Inverse Degradation Estimation with Guided Prior Disentanglement for Underwater Image RestorationComments: 21 pages, 11 figures, 5 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [222] arXiv:2512.07170 [pdf, ps, other]
-
Title: Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [223] arXiv:2512.07166 [pdf, ps, other]
-
Title: When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM EditingComments: 9 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [224] arXiv:2512.07165 [pdf, ps, other]
-
Title: MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale AdaptationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [225] arXiv:2512.07155 [pdf, ps, other]
-
Title: CHIMERA: Adaptive Cache Injection and Semantic Anchor Prompting for Zero-shot Image Morphing with Morphing-oriented MetricsComments: Please visit our project page at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [226] arXiv:2512.07141 [pdf, ps, other]
-
Title: Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [227] arXiv:2512.07136 [pdf, ps, other]
-
Title: A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and ReasoningAuthors: Siyang Jiang, Mu Yuan, Xiang Ji, Bufang Yang, Zeyu Liu, Lilin Xu, Yang Li, Yuting He, Liran Dong, Wenrui Lu, Zhenyu Yan, Xiaofan Jiang, Wei Gao, Hongkai Chen, Guoliang XingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [228] arXiv:2512.07135 [pdf, ps, other]
-
Title: TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement LearningAuthors: Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin ZhaoSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [229] arXiv:2512.07128 [pdf, ps, other]
-
Title: MulCLIP: A Multi-level Alignment Framework for Enhancing Fine-grained Long-context CLIPSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [230] arXiv:2512.07126 [pdf, ps, other]
-
Title: Training-free Clothing Region of Interest Self-correction for Virtual Try-On
Comments: 16 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [231] arXiv:2512.07110 [pdf, ps, other]
-
Title: MSN: Multi-directional Similarity Network for Hand-crafted and Deep-synthesized Copy-Move Forgery Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [232] arXiv:2512.07107 [pdf, ps, other]
-
Title: COREA: Coarse-to-Fine 3D Representation Alignment Between Relightable 3D Gaussians and SDF via Bidirectional 3D-to-3D Supervision
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [233] arXiv:2512.07078 [pdf, ps, other]
-
Title: DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [234] arXiv:2512.07076 [pdf, ps, other]
-
Title: Context-measure: Contextualizing Metric for Camouflage
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [235] arXiv:2512.07065 [pdf, ps, other]
-
Title: Persistent Homology-Guided Frequency Filtering for Image Compression
Comments: 17 pages, 8 figures, code available at github.com/RMATH3/persistent-homology-compression
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [236] arXiv:2512.07062 [pdf, ps, other]
-
Title: $\mathrm{D}^{\mathrm{3}}$-Predictor: Noise-Free Deterministic Diffusion for Dense Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [237] arXiv:2512.07052 [pdf, ps, other]
-
Title: RAVE: Rate-Adaptive Visual Encoding for 3D Gaussian Splatting
Authors: Hoang-Nhat Tran, Francesco Di Sario, Gabriele Spadaro, Giuseppe Valenzise, Enzo Tartaglione
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [238] arXiv:2512.07051 [pdf, ps, other]
-
Title: DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [239] arXiv:2512.07037 [pdf, ps, other]
-
Title: Evaluating and Preserving High-level Fidelity in Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [240] arXiv:2512.07034 [pdf, ps, other]
-
Title: Power of Boundary and Reflection: Semantic Transparent Object Segmentation using Pyramid Vision Transformer with Transparent Cues
Authors: Tuan-Anh Vu, Hai Nguyen-Truong, Ziqiang Zheng, Binh-Son Hua, Qing Guo, Ivor Tsang, Sai-Kit Yeung
Comments: Accepted to WACV 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [241] arXiv:2512.06981 [pdf, ps, other]
-
Title: Selective Masking based Self-Supervised Learning for Image Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [242] arXiv:2512.06949 [pdf, ps, other]
-
Title: Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology
Comments: 19 pages, 5 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [243] arXiv:2512.06921 [pdf, ps, other]
-
Title: NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
Authors: Ziyang Song, Zelin Zang, Xiaofan Ye, Boqiang Xu, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu, Jiebo Luo, Zhen Lei
Comments: Accepted by IEEE ICIA 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [244] arXiv:2512.06905 [pdf, ps, other]
-
Title: Scaling Zero-Shot Reference-to-Video Generation
Authors: Zijian Zhou, Shikun Liu, Haozhe Liu, Haonan Qiu, Zhaochong An, Weiming Ren, Zhiheng Liu, Xiaoke Huang, Kam Woh Ng, Tian Xie, Xiao Han, Yuren Cong, Hang Li, Chuyan Zhu, Aditya Patel, Tao Xiang, Sen He
Comments: Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [245] arXiv:2512.06888 [pdf, ps, other]
-
Title: Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation
Authors: Liyang Song, Hardik Bishnoi, Sai Kumar Reddy Manne, Sarah Ostadabbas, Briana J. Taylor, Michael Wan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [246] arXiv:2512.06886 [pdf, ps, other]
-
Title: Balanced Learning for Domain Adaptive Semantic Segmentation
Comments: Accepted by International Conference on Machine Learning (ICML 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [247] arXiv:2512.06885 [pdf, ps, other]
-
Title: JoPano: Unified Panorama Generation via Joint Modeling
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [248] arXiv:2512.06882 [pdf, ps, other]
-
Title: Hierarchical Image-Guided 3D Point Cloud Segmentation in Industrial Scenes via Multi-View Bayesian Fusion
Comments: Accepted to BMVC 2025 (Sheffield, UK, Nov 24-27, 2025). Supplementary video and poster available upon request
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [249] arXiv:2512.06877 [pdf, ps, other]
-
Title: SceneMixer: Exploring Convolutional Mixing Networks for Remote Sensing Scene Classification
Comments: Accepted and presented in ICSPIS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [250] arXiv:2512.06870 [pdf, ps, other]
-
Title: Towards Robust Pseudo-Label Learning in Semantic Segmentation: An Encoding Perspective
Comments: Accepted by Conference on Neural Information Processing Systems (NeurIPS 2025)
Subjects: Computer Vision and Pattern Recognition (cs.CV)