Abstract: This paper introduces a GPU architecture named Duplo that minimizes redundant memory accesses of convolutions in deep neural networks (DNNs). Convolution is one of the fundamental operations ...