
Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 4

[ total of 737 entries: 1-25 | 5-29 | 30-54 | 55-79 | 80-104 | ... | 730-737 ]

Fri, 12 Dec 2025 (continued, showing 25 of 118 entries)

[5]  arXiv:2512.10955 [pdf, ps, other]
Title: Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6]  arXiv:2512.10954 [pdf, ps, other]
Title: Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7]  arXiv:2512.10950 [pdf, ps, other]
Title: E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training
Comments: Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2512.10949 [pdf, ps, other]
Title: Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
Comments: Code is released at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[9]  arXiv:2512.10948 [pdf, ps, other]
Title: ClusIR: Towards Cluster-Guided All-in-One Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10]  arXiv:2512.10947 [pdf, ps, other]
Title: Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11]  arXiv:2512.10945 [pdf, ps, other]
Title: MeViS: A Multi-Modal Dataset for Referring Motion Expression Video Segmentation
Comments: IEEE TPAMI, Project Page: this https URL
Journal-ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 12, pp. 11400-11416, 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12]  arXiv:2512.10943 [pdf, ps, other]
Title: AlcheMinT: Fine-grained Temporal Control for Multi-Reference Consistent Video Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[13]  arXiv:2512.10942 [pdf, ps, other]
Title: VL-JEPA: Joint Embedding Predictive Architecture for Vision-language
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14]  arXiv:2512.10941 [pdf, ps, other]
Title: Mull-Tokens: Modality-Agnostic Latent Thinking
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15]  arXiv:2512.10940 [pdf, ps, other]
Title: OmniView: An All-Seeing Diffusion Model for 3D and 4D View Synthesis
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16]  arXiv:2512.10939 [pdf, ps, other]
Title: GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting
Comments: IEEE/CVF Winter Conference on Applications of Computer Vision 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17]  arXiv:2512.10935 [pdf, ps, other]
Title: Any4D: Unified Feed-Forward Metric 4D Reconstruction
Comments: Project Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[18]  arXiv:2512.10932 [pdf, ps, other]
Title: BabyVLM-V2: Toward Developmentally Grounded Pretraining and Benchmarking of Vision Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[19]  arXiv:2512.10927 [pdf, ps, other]
Title: FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20]  arXiv:2512.10894 [pdf, ps, other]
Title: DuetSVG: Unified Multimodal SVG Generation with Internal Visual Guidance
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21]  arXiv:2512.10888 [pdf, ps, other]
Title: PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction
Comments: 15 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22]  arXiv:2512.10881 [pdf, ps, other]
Title: MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23]  arXiv:2512.10867 [pdf, ps, other]
Title: From Macro to Micro: Benchmarking Microscopic Spatial Intelligence on Molecules via Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24]  arXiv:2512.10863 [pdf, ps, other]
Title: MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[25]  arXiv:2512.10860 [pdf, ps, other]
Title: SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26]  arXiv:2512.10840 [pdf, ps, other]
Title: PoseGAM: Robust Unseen Object Pose Estimation via Geometry-Aware Multi-View Reasoning
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27]  arXiv:2512.10818 [pdf, ps, other]
Title: Self-Ensemble Post Learning for Noisy Domain Generalization
Authors: Wang Lu, Jindong Wang
Comments: 18 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28]  arXiv:2512.10808 [pdf, ps, other]
Title: Graph Laplacian Transformer with Progressive Sampling for Prostate Cancer Grading
Comments: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2025
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29]  arXiv:2512.10794 [pdf, ps, other]
Title: What matters for Representation Alignment: Global Information or Spatial Structure?
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Machine Learning (stat.ML)