Abstract: Interpretable image classification is crucial for making decisions in high-stakes scenarios. Recent advancements have demonstrated that interpretable models can achieve performance ...
1 University of Science and Technology of China 2 WeChat, Tencent Inc. 1. A Novel Parameter Space Alignment Paradigm Recent MLLMs follow an input space alignment paradigm that aligns visual features ...
In the field of cognitive neuroscience, understanding how humans process and integrate information from different sensory modalities is a crucial topic. Attention mechanisms play a vital role in this ...
Abstract: Text-to-image diffusion models have shown powerful ability on conditional image synthesis. With large-scale vision-language pre-training, diffusion models are able to generate high-quality ...