May 2nd, 2024: Vision Mamba (Vim) has been accepted by ICML 2024. 🎉 Conference page can be found here. Feb. 10th, 2024: We updated Vim-tiny/small weights and training scripts. By placing the class token at ...
Abstract: Real-time dense mapping with high-fidelity textures in large-scale environments is a challenge in robotics, digital twins, and AR/VR applications. Neural Radiance Field (NeRF) has ...
Abstract: This paper investigates servo commands of the UAV PTZ vision system for target tracking. Comprehensive analyses of the parameters influencing target tracking are developed in the ...
To address the degradation of visual-language (VL) representations during VLA supervised fine-tuning (SFT), we introduce Visual Representation Alignment. During SFT, we pull a VLA’s visual tokens ...