| 2025-05-23 | REN: Fast and Efficient Region Encodings from Patch-Based Image Encoders | Savya Khosla et al. | 2505.18153v1 | null |
| 2025-05-23 | Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs | Wafa Alghallabi et al. | 2505.18152v1 | null |
| 2025-05-23 | Boosting Open Set Recognition Performance through Modulated Representation Learning | Amit Kumar Kundu et al. | 2505.18137v1 | null |
| 2025-05-23 | Frankentext: Stitching random text fragments into long-form narratives | Chau Minh Pham et al. | 2505.18128v1 | null |
| 2025-05-23 | TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations | Alan Arazi et al. | 2505.18125v1 | null |
| 2025-05-23 | Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models | Jiongran Wu et al. | 2505.18120v1 | null |
| 2025-05-23 | From Temporal to Spatial: Designing Spatialized Interactions with Segmented-audios in Immersive Environments for Active Engagement with Performing Arts Intangible Cultural Heritage | Yuqi Wang et al. | 2505.18112v1 | null |
| 2025-05-23 | Adapting SAM 2 for Visual Object Tracking: 1st Place Solution for MMVPR Challenge Multi-Modal Tracking | Cheng-Yen Yang et al. | 2505.18111v1 | null |
| 2025-05-23 | F-ANcGAN: An Attention-Enhanced Cycle Consistent Generative Adversarial Architecture for Synthetic Image Generation of Nanoparticles | Varun Ajith et al. | 2505.18106v1 | null |
| 2025-05-23 | Dynamic Dual Buffer with Divide-and-Conquer Strategy for Online Continual Learning | Congren Dai et al. | 2505.18101v1 | null |
| 2025-05-23 | CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays | Hyungyung Lee et al. | 2505.18087v1 | null |
| 2025-05-23 | Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding | Xiaoyi Zhang et al. | 2505.18079v1 | null |
| 2025-05-23 | DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation | Junhao Chen et al. | 2505.18078v1 | null |
| 2025-05-23 | Semantic Correspondence: Unified Benchmarking and a Strong Baseline | Kaiyan Zhang et al. | 2505.18060v1 | link |
| 2025-05-23 | FDBPL: Faster Distillation-Based Prompt Learning for Region-Aware Vision-Language Models Adaptation | Zherui Zhang et al. | 2505.18053v1 | null |
| 2025-05-23 | BOTM: Echocardiography Segmentation via Bi-directional Optimal Token Matching | Zhihua Liu et al. | 2505.18052v1 | null |
| 2025-05-23 | LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision | Anthony Fuller et al. | 2505.18051v1 | null |
| 2025-05-23 | RemoteSAM: Towards Segment Anything for Earth Observation | Liang Yao et al. | 2505.18022v1 | null |
| 2025-05-23 | LLM assisted web application functional requirements generation: A case study of four popular LLMs over a Mess Management System | Rashmi Gupta et al. | 2505.18019v1 | null |
| 2025-05-23 | SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification | Shashank Agnihotri et al. | 2505.18015v1 | null |
| 2025-05-23 | Classification of assembly tasks combining multiple primitive actions using Transformers and xLSTMs | Miguel Neves et al. | 2505.18012v1 | null |
| 2025-05-23 | TRACE for Tracking the Emergence of Semantic Representations in Transformers | Nura Aljaafari et al. | 2505.17998v1 | null |
| 2025-05-23 | Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation | Zhihua Liu et al. | 2505.17994v1 | null |
| 2025-05-23 | ADLGen: Synthesizing Symbolic, Event-Triggered Sensor Sequences for Human Activity Modeling | Weihang You et al. | 2505.17987v1 | null |
| 2025-05-23 | Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling | Bryan Wong et al. | 2505.17982v1 | null |
| 2025-05-23 | To Glue or Not to Glue? Classical vs Learned Image Matching for Mobile Mapping Cameras to Textured Semantic 3D Building Models | Simone Gaisbauer et al. | 2505.17973v1 | null |
| 2025-05-23 | MR-EEGWaveNet: Multiresolutional EEGWaveNet for Seizure Detection from Long EEG Recordings | Kazi Mahmudul Hassan et al. | 2505.17972v1 | null |
| 2025-05-23 | Explainable Anatomy-Guided AI for Prostate MRI: Foundation Models and In Silico Clinical Trials for Virtual Biopsy-based Risk Assessment | Danial Khan et al. | 2505.17971v1 | null |
| 2025-05-23 | Mind the Domain Gap: Measuring the Domain Gap Between Real-World and Synthetic Point Clouds for Automated Driving Development | Nguyen Duc et al. | 2505.17959v1 | null |
| 2025-05-23 | AutoMiSeg: Automatic Medical Image Segmentation via Test-Time Adaptation of Foundation Models | Xingjian Li et al. | 2505.17931v1 | null |