| 2025-06-20 | VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Zhangyang Qi et al. | 2506.17221v1 | null |
| 2025-06-20 | Emergent Temporal Correspondences from Video Diffusion Transformers | Jisu Nam et al. | 2506.17220v1 | link |
| 2025-06-20 | Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Zeyuan Yang et al. | 2506.17218v1 | null |
| 2025-06-20 | Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation | Xiuyu Yang et al. | 2506.17213v1 | null |
| 2025-06-20 | Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting | Tianjiao Yu et al. | 2506.17212v1 | null |
| 2025-06-20 | DreamCube: 3D Panorama Generation via Multi-plane Synchronization | Yukun Huang et al. | 2506.17206v1 | null |
| 2025-06-20 | UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation | Teng Li et al. | 2506.17202v1 | null |
| 2025-06-20 | Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition | Jiaqi Li et al. | 2506.17201v1 | null |
| 2025-06-20 | Dex1B: Learning with 1B Demonstrations for Dexterous Manipulation | Jianglong Ye et al. | 2506.17198v1 | null |
| 2025-06-20 | Facial Landmark Visualization and Emotion Recognition Through Neural Networks | Israel Juárez-Jiménez et al. | 2506.17191v1 | null |
| 2025-06-20 | YASMOT: Yet another stereo image multi-object tracker | Ketil Malde et al. | 2506.17186v1 | null |
| 2025-06-20 | Fault Tolerance by Construction | Benjamin Rodatz et al. | 2506.17181v1 | null |
| 2025-06-20 | Deep generative models as the probability transformation functions | Vitalii Bondar et al. | 2506.17171v1 | null |
| 2025-06-20 | Scaling limits for sample autocovariance operators of Hilbert space-valued linear processes | Marie-Christine Düker et al. | 2506.17168v1 | null |
| 2025-06-20 | Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network | Mahin Montasir Afif et al. | 2506.17165v1 | null |
| 2025-06-20 | The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making | Abinitha Gourabathina et al. | 2506.17163v1 | null |
| 2025-06-20 | Walking Fingerprinting Using Wrist Accelerometry During Activities of Daily Living in NHANES | Lily Koffman et al. | 2506.17160v1 | null |
| 2025-06-20 | Co-Seg++: Mutual Prompt-Guided Collaborative Learning for Versatile Medical Segmentation | Qing Xu et al. | 2506.17159v1 | null |
| 2025-06-20 | Do We Need Large VLMs for Spotting Soccer Actions? | Ritabrata Chakraborty et al. | 2506.17144v1 | null |
| 2025-06-20 | MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification | David Jacob Drexlin et al. | 2506.17140v1 | null |
| 2025-06-20 | On the Theory of Conditional Feature Alignment for Unsupervised Domain-Adaptive Counting | Zhuonan Liang et al. | 2506.17137v1 | null |
| 2025-06-20 | Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations | Dongdong Meng et al. | 2506.17136v1 | null |
| 2025-06-20 | Dynamic Watermark Generation for Digital Images using Perimeter Gated SPAD Imager PUFs | Md Sakibur Sajal et al. | 2506.17134v1 | null |
| 2025-06-20 | Robust Training with Data Augmentation for Medical Imaging Classification | Josué Martínez-Martínez et al. | 2506.17133v1 | null |
| 2025-06-20 | Reassessing Code Authorship Attribution in the Era of Language Models | Atish Kumar Dipongkor et al. | 2506.17120v1 | null |
| 2025-06-20 | RGBTrack: Fast, Robust Depth-Free 6D Pose Estimation and Tracking | Teng Guo et al. | 2506.17119v1 | null |
| 2025-06-20 | A Vision for Trustworthy, Fair, and Efficient Socio-Technical Control using Karma Economies | Ezzat Elokda et al. | 2506.17115v1 | null |
| 2025-06-20 | MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation | Shoubin Yu et al. | 2506.17113v1 | null |
| 2025-06-20 | Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping | Teng Guo et al. | 2506.17110v1 | null |
| 2025-06-20 | TransDreamerV3: Implanting Transformer In DreamerV3 | Shruti Sadanand Dongare et al. | 2506.17103v1 | null |