Vision Transformer

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-03-28 | Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Weiqi Li et al. | 2503.22679v1 | null |
| 2025-03-28 | DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness | Ruining Li et al. | 2503.22677v1 | null |
| 2025-03-28 | TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting | Boyang et al. | 2503.22676v1 | null |
| 2025-03-28 | Understanding Co-speech Gestures in-the-wild | Sindhu B Hegde et al. | 2503.22668v1 | null |
| 2025-03-28 | NetSSM: Multi-Flow and State-Aware Network Trace Generation using State-Space Models | Andrew Chu et al. | 2503.22663v1 | null |
| 2025-03-28 | Evaluation of Machine-generated Biomedical Images via A Tally-based Similarity Measure | Frank J. Brooks et al. | 2503.22658v1 | null |
| 2025-03-28 | Unicorn: Text-Only Data Synthesis for Vision Language Model Training | Xiaomin Yu et al. | 2503.22655v1 | null |
| 2025-03-28 | Physics-informed gauge theories | Friederike Ihssen et al. | 2503.22638v1 | null |
| 2025-03-28 | Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion Model | Jangho Park et al. | 2503.22622v1 | null |
| 2025-03-28 | Audio-Plane: Audio Factorization Plane Gaussian Splatting for Real-Time Talking Head Synthesis | Shuai Shen et al. | 2503.22605v1 | null |
| 2025-03-28 | KEVS: Enhancing Segmentation of Visceral Adipose Tissue in Pre-Cystectomy CT with Gaussian Kernel Density Estimation | Thomas Boucher et al. | 2503.22592v1 | null |
| 2025-03-28 | Transforming Siliconization into Slippery Liquid-like Coatings | Hernán Barrio-Zhang et al. | 2503.22591v1 | null |
| 2025-03-28 | Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012 | Adam Breuer et al. | 2503.22589v1 | null |
| 2025-03-28 | LLM-enabled Instance Model Generation | Fengjunjie Pan et al. | 2503.22587v1 | null |
| 2025-03-28 | Next-Best-Trajectory Planning of Robot Manipulators for Effective Observation and Exploration | Heiko Renz et al. | 2503.22588v1 | null |
| 2025-03-28 | Two-dimensional electronic spectroscopy in the condensed phase using equivariant transformer accelerated molecular dynamics simulations | Joseph Kelly et al. | 2503.22583v1 | null |
| 2025-03-28 | Breaking Language Barriers in Visual Language Models via Multilingual Textual Regularization | Iñigo Pikabea et al. | 2503.22577v1 | null |
| 2025-03-28 | RELD: Regularization by Latent Diffusion Models for Image Restoration | Pasquale Cascarano et al. | 2503.22563v1 | null |
| 2025-03-28 | Image Decomposition with G-norm Weighted by Total Symmetric Variation | Roy Y. He et al. | 2503.22560v1 | null |
| 2025-03-28 | MO-CTranS: A unified multi-organ segmentation model learning from multiple heterogeneously labelled datasets | Zhendi Gong et al. | 2503.22557v1 | null |
| 2025-03-28 | Bridging the Dimensional Chasm: Uncover Layer-wise Dimensional Reduction in Transformers through Token Correlation | Zhuo-Yang Song et al. | 2503.22547v1 | null |
| 2025-03-28 | LIM: Large Interpolator Model for Dynamic Reconstruction | Remy Sabathier et al. | 2503.22537v1 | null |
| 2025-03-28 | Deterministic Medical Image Translation via High-fidelity Brownian Bridges | Qisheng He et al. | 2503.22531v1 | null |
| 2025-03-28 | MixFunn: A Neural Network for Differential Equations with Improved Generalization and Interpretability | Tiago de Souza Farias et al. | 2503.22528v1 | null |
| 2025-03-28 | AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization | Martin Kišš et al. | 2503.22526v1 | null |
| 2025-03-28 | Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities | Raman Dutt et al. | 2503.22517v1 | null |
| 2025-03-28 | Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets | Martin Kišš et al. | 2503.22513v1 | null |
| 2025-03-28 | Cross-Technology Generalization in Synthesized Speech Detection: Evaluating AST Models with Modern Voice Generators | Andrew Ustinov et al. | 2503.22503v1 | null |
| 2025-03-28 | Learnable cut flow | Jing Li et al. | 2503.22498v1 | null |
| 2025-03-28 | Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments | Luke Rowe et al. | 2503.22496v1 | null |