April 30th, 2025: We have performed comparative experiments with mainstream tracking-by-detection methods on the OVIS dataset (see Fig. 5). Task-Specific SpatioTemporal Context-Aware Decoupling for ...
Abstract: Video-based human-object interaction (HOI) recognition aims at labeling human and object sequences with multiple human-object interaction classes. The efficiency of existing methods still ...
Abstract: Referring Video Object Segmentation (R-VOS) methods face challenges in maintaining consistent object segmentation due to temporal context variability and the presence of other visually ...