Abstract: In this work, we present FoleyGRAM, a novel approach to video-to-audio generation that emphasizes semantic conditioning through the use of aligned multimodal encoders. Building on prior ...