Abstract: Most Vision-and-Language Navigation (VLN) algorithms are prone to making inaccurate decisions due to their lack of visual common sense and limited reasoning capabilities. To address this ...
Abstract: Fine-grained visual categorization is a challenging issue owing to high intra-class and low inter-class variances. Classical approaches rely on pre-trained models or many fine annotations.