2025-06-20 |
Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens |
Zeyuan Yang et al. |
2506.17218v1 |
null |
2025-06-20 |
DreamCube: 3D Panorama Generation via Multi-plane Synchronization |
Yukun Huang et al. |
2506.17206v1 |
null |
2025-06-20 |
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation |
Teng Li et al. |
2506.17202v1 |
null |
2025-06-20 |
Gravitational lensing observables in stationary and axisymmetric solutions in general relativity |
Matteo Luca Ruggiero et al. |
2506.17192v1 |
null |
2025-06-20 |
Facial Landmark Visualization and Emotion Recognition Through Neural Networks |
Israel Juárez-Jiménez et al. |
2506.17191v1 |
null |
2025-06-20 |
YASMOT: Yet another stereo image multi-object tracker |
Ketil Malde et al. |
2506.17186v1 |
null |
2025-06-20 |
Variational Learning of Disentangled Representations |
Yuli Slavutsky et al. |
2506.17182v1 |
null |
2025-06-20 |
High-accuracy inference using HfO$_x$S$_y$/HfS$_2$ Memristors |
Aferdita Xhameni et al. |
2506.17174v1 |
null |
2025-06-20 |
Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network |
Mahin Montasir Afif et al. |
2506.17165v1 |
null |
2025-06-20 |
Walking Fingerprinting Using Wrist Accelerometry During Activities of Daily Living in NHANES |
Lily Koffman et al. |
2506.17160v1 |
null |
2025-06-20 |
Co-Seg++: Mutual Prompt-Guided Collaborative Learning for Versatile Medical Segmentation |
Qing Xu et al. |
2506.17159v1 |
null |
2025-06-20 |
Affine semigroups without consecutive small elements |
J. C. Rosales et al. |
2506.17152v1 |
null |
2025-06-20 |
MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification |
David Jacob Drexlin et al. |
2506.17140v1 |
null |
2025-06-20 |
Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations |
Dongdong Meng et al. |
2506.17136v1 |
null |
2025-06-20 |
Dynamic Watermark Generation for Digital Images using Perimeter Gated SPAD Imager PUFs |
Md Sakibur Sajal et al. |
2506.17134v1 |
null |
2025-06-20 |
Robust Training with Data Augmentation for Medical Imaging Classification |
Josué Martínez-Martínez et al. |
2506.17133v1 |
null |
2025-06-20 |
Real-time Broadband RFI Excision for the Upgraded GMRT |
Ruta Kale et al. |
2506.17131v1 |
null |
2025-06-20 |
Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping |
Teng Guo et al. |
2506.17110v1 |
null |
2025-06-20 |
Open-Path Methane Sensing via Backscattered Light in a Nonlinear Interferometer |
Jinghan Dong et al. |
2506.17107v1 |
null |
2025-06-20 |
Acquiring and Accumulating Knowledge from Diverse Datasets for Multi-label Driving Scene Classification |
Ke Li et al. |
2506.17101v1 |
null |
2025-06-20 |
Brain-inspired interpretable reservoir computing with resonant recurrent neural networks |
Mark A. Kramer et al. |
2506.17083v1 |
null |
2025-06-20 |
Assembler: Scalable 3D Part Assembly via Anchor Point Diffusion |
Wang Zhao et al. |
2506.17074v1 |
null |
2025-06-20 |
Cross-Modal Epileptic Signal Harmonization: Frequency Domain Mapping Quantization for Pre-training a Unified Neurophysiological Transformer |
Runkai Zhang et al. |
2506.17068v1 |
null |
2025-06-20 |
Client Selection Strategies for Federated Semantic Communications in Heterogeneous IoT Networks |
Samer Lahoud et al. |
2506.17063v1 |
null |
2025-06-20 |
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers |
Jingtong Su et al. |
2506.17052v1 |
null |
2025-06-20 |
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models |
Xiaolong Wang et al. |
2506.17046v1 |
null |
2025-06-20 |
Stretching Beyond the Obvious: A Gradient-Free Framework to Unveil the Hidden Landscape of Visual Invariance |
Lorenzo Tausani et al. |
2506.17040v1 |
null |
2025-06-20 |
Unsupervised Image Super-Resolution Reconstruction Based on Real-World Degradation Patterns |
Yiyang Tie et al. |
2506.17027v1 |
null |
2025-06-20 |
The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation |
Giulia Bertazzini et al. |
2506.17016v1 |
null |
2025-06-20 |
Directional Dark Field for Nanoscale Full-Field Transmission X-Ray Microscopy |
Sami Wirtensohn et al. |
2506.16998v1 |
null |