Transformer

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-07-03 | Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching | Xin Zhou et al. | 2507.02860v1 | null |
| 2025-07-03 | Neutrino mixing parameters and masses from $Δ(96)\rtimes H_{CP}$ in the tri-direct CP approach | Li-Na Yan et al. | 2507.02840v1 | null |
| 2025-07-03 | DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift | Po-Heng Chou et al. | 2507.02824v1 | null |
| 2025-07-03 | AREE-Based Decoupled Design of Hybrid Beamformers in mmWave XL-MIMO Systems | Jiazhe Li et al. | 2507.02802v1 | null |
| 2025-07-03 | Time-Masked Transformers with Lightweight Test-Time Adaptation for Neural Speech Decoding | Ebrahim Feghhi et al. | 2507.02800v1 | null |
| 2025-07-03 | A Highly Carbon-Rich Dayside and Disequilibrium Chemistry in the Ultra-Hot Jupiter WASP-19b | Suman Saha et al. | 2507.02797v1 | null |
| 2025-07-03 | Ultrafast optical excitation of magnons in 2D antiferromagnets via spin torque exerted by photocurrent of excitons: Signatures in charge pumping and THz emission | Jalil Varela-Manjarres et al. | 2507.02793v1 | null |
| 2025-07-03 | Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs | Ken Tsui et al. | 2507.02778v1 | null |
| 2025-07-03 | Fast and Simplex: 2-Simplicial Attention in Triton | Aurko Roy et al. | 2507.02754v1 | null |
| 2025-07-03 | Linear Attention with Global Context: A Multipole Attention Mechanism for Vision and Physics | Alex Colagrande et al. | 2507.02748v1 | null |
| 2025-07-03 | Leveraging Transformer Models to Capture Multi-Scale Dynamics in Biomolecules by nano-GPT | Wenqi Zeng et al. | 2507.02734v1 | null |
| 2025-07-03 | Quantifying Classifier Utility under Local Differential Privacy | Ye Zheng et al. | 2507.02727v1 | null |
| 2025-07-03 | The Yukawa potential of a non-homogeneous sphere, with new limits on an ultralight boson | Pierre Fayet et al. | 2507.02723v1 | null |
| 2025-07-03 | UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation | Qin Guo et al. | 2507.02713v1 | null |
| 2025-07-03 | Faster Algorithm for Bounded Tree Edit Distance in the Low-Distance Regime | Tomasz Kociumaka et al. | 2507.02701v1 | null |
| 2025-07-03 | RLHGNN: Reinforcement Learning-driven Heterogeneous Graph Neural Network for Next Activity Prediction in Business Processes | Jiaxing Wang et al. | 2507.02690v1 | null |
| 2025-07-03 | Learning few-step posterior samplers by unfolding and distillation of diffusion models | Charlesquin Kemajou Mbakam et al. | 2507.02686v1 | null |
| 2025-07-03 | Moments, Time-Inversion and Source Identification for the Heat Equation | Kang Liu et al. | 2507.02677v1 | null |
| 2025-07-03 | MISCGrasp: Leveraging Multiple Integrated Scales and Contrastive Learning for Enhanced Volumetric Grasping | Qingyu Fan et al. | 2507.02672v1 | null |
| 2025-07-03 | ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning | Junyu Wang et al. | 2507.02666v1 | null |
| 2025-07-03 | Hey AI, Generate Me a Hardware Code! Agentic AI-based Hardware Design & Verification | Deepak Narayan Gadde et al. | 2507.02660v1 | null |
| 2025-07-03 | The geometric phase of rotations and 3D coordinate transformations | Luis Garza-Soto et al. | 2507.02647v1 | null |
| 2025-07-03 | Solving the Hubbard model with Neural Quantum States | Yuntian Gu et al. | 2507.02644v1 | null |
| 2025-07-03 | Classification of $f(R)$ Theories Of Inflation And The Uniqueness of Starobinsky Model | Marco Piva et al. | 2507.02637v1 | null |
| 2025-07-03 | High-Order Deep Meta-Learning with Category-Theoretic Interpretation | David H. Mguni et al. | 2507.02634v1 | null |
| 2025-07-03 | A Matrix Variational Auto-Encoder for Variant Effect Prediction in Pharmacogenes | Antoine Honoré et al. | 2507.02624v1 | null |
| 2025-07-03 | Relativistic Limits of Decoding: Critical Divergence of Kullback-Leibler Information and Free Energy | Tatsuaki Tsuruyama et al. | 2507.02596v1 | null |
| 2025-07-03 | AuroraLong: Bringing RNNs Back to Efficient Open-Ended Video Understanding | Weili Xu et al. | 2507.02591v1 | null |
| 2025-07-03 | Parametric shape models for vessels learned from segmentations via differentiable voxelization | Alina F. Dima et al. | 2507.02576v1 | null |
| 2025-07-03 | Transformers Don't Need LayerNorm at Inference Time: Scaling LayerNorm Removal to GPT-2 XL and the Implications for Mechanistic Interpretability | Luca Baroni et al. | 2507.02559v1 | null |