VideoPrism is a general-purpose video encoder designed to handle a wide spectrum of video understanding tasks, including classification, retrieval, localization, captioning, and question answering. It ...
Abstract: In recent years, there has been a growing demand for remote sensing image semantic segmentation in various applications. The key to semantic segmentation lies in the ability to globally ...
We propose an encoder-decoder for open-vocabulary semantic segmentation comprising a hierarchical encoder-based cost map generation and a gradual fusion decoder. We introduce a category early ...
T5Gemma 2 follows the same adaptation idea introduced in T5Gemma, initialize an encoder-decoder model from a decoder-only checkpoint, then adapt with UL2. In the above figure the research team show ...
Abstract: Recent deep learning methods for vessel trajectory prediction are able to learn complex maritime patterns from historical automatic identification system (AIS) data and accurately predict ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results