Multi-Object Tracking

Publish Date	Title	Authors	PDF	Code
2025-07-03	LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans	Zhening Huang et al.	2507.02861v1	null
2025-07-03	RefTok: Reference-Based Tokenization for Video Generation	Xiang Fan et al.	2507.02862v1	null
2025-07-03	Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching	Xin Zhou et al.	2507.02860v1	null
2025-07-03	Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation	Jiaer Xia et al.	2507.02859v1	null
2025-07-03	Answer Matching Outperforms Multiple Choice for Language Model Evaluation	Nikhil Chandak et al.	2507.02856v1	null
2025-07-03	MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis	Kunyu Zhang et al.	2507.02847v1	null
2025-07-03	Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection	Ziqi Miao et al.	2507.02844v1	null
2025-07-03	USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network	Ying Yu et al.	2507.02827v1	null
2025-07-03	Establishing Best Practices for Building Rigorous Agentic Benchmarks	Yuxuan Zhu et al.	2507.02825v1	null
2025-07-03	Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks	Luke Guerdan et al.	2507.02819v1	null
2025-07-03	Towards Perception-Informed Latent HRTF Representations	You Zhang et al.	2507.02815v1	null
2025-07-03	HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars	Gent Serifi et al.	2507.02803v1	null
2025-07-03	No time to train! Training-Free Reference-Based Instance Segmentation	Miguel Espinosa et al.	2507.02798v1	null
2025-07-03	RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation	Liheng Zhang et al.	2507.02792v1	null
2025-07-03	Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance	Jakob Kienegger et al.	2507.02791v1	null
2025-07-03	From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding	Xiangfeng Wang et al.	2507.02790v1	null
2025-07-03	From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images	Danrong Zhang et al.	2507.02781v1	null
2025-07-03	Discovery and Preliminary Characterization of a Third Interstellar Object: 3I/ATLAS	Darryl Z. Seligman et al.	2507.02757v1	null
2025-07-03	Partial Weakly-Supervised Oriented Object Detection	Mingxin Liu et al.	2507.02751v1	null
2025-07-03	DexVLG: Dexterous Vision-Language-Grasp Model at Scale	Jiawei He et al.	2507.02747v1	null
2025-07-03	Early Signs of Steganographic Capabilities in Frontier LLMs	Artur Zolkowski et al.	2507.02737v1	null
2025-07-03	RIS-Aided Cooperative ISAC Networks for Structural Health Monitoring	Jie Yang et al.	2507.02731v1	null
2025-07-03	A Systematic Search for Spectral Hardening in Blazar Flares with the Fermi-Large Area Telescope	Adithiya Dinesh et al.	2507.02718v1	null
2025-07-03	FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models	Yuxuan Wang et al.	2507.02714v1	null
2025-07-03	UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation	Qin Guo et al.	2507.02713v1	null
2025-07-03	XPPLORE: Import, visualize, and analyze XPPAUT data in MATLAB	Matteo Martin et al.	2507.02709v1	null
2025-07-03	The ESO SupJup Survey VIII. Chemical fingerprints of young L dwarf twins	N. Grasser et al.	2507.02706v1	null
2025-07-03	CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation	Xiangyang Luo et al.	2507.02691v1	null
2025-07-03	On the Convergence of Large Language Model Optimizer for Black-Box Network Management	Hoon Lee et al.	2507.02689v1	null
2025-07-03	A wireless, inexpensive optical tracker for the CAVE	Ehud Sharlin et al.	2507.02682v1	null