The internet is absolutely overflowing with AI-generated videos, making it difficult to tell what's real. We can help you pick out the deepfakes.
Abstract: Vision-and-Language Navigation (VLN) agents are tasked with navigating an unseen environment using natural language instructions. In this work, we study if visual representations of ...