Video Captioning Using Python

End-to-End Dense Video Captioning Model Based on Multimodal Feature Fusion

Abstract: The goal of dense video captioning is to generate multiple text descriptions related to different timestamps from a video. Traditional methods adopt a two-stage "localize-then-describe" ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results

Feedback

End-to-End Dense Video Captioning Model Based on Multimodal Feature Fusion

Trending now