[J17] VWC-SDK: Convolutional Weight Mapping Using Shifted and Duplicated Kernel with Variable Windows and Channels

Johnny Rhe (이존이), Sungmin Moon (문성민), Jong Hwan Ko (고종환)

May 2022

Abstract

With their high energy efficiency, processing-in-memory (PIM) arrays are increasingly used for convolutional neural network (CNN) inference. In PIM-based CNN inference, the computational latency and energy depend on how the CNN weights are mapped to the PIM array. A recent study proposed shifted and duplicated kernel (SDK) mapping, which reuses input feature maps in units of a parallel window that is convolved with duplicated kernels to obtain multiple output elements in parallel. However, the existing SDK-based mapping algorithm does not always yield the minimum number of computing cycles because it only maps a square parallel window with all channels. In this paper, we introduce a novel mapping algorithm called variable-window SDK (VW-SDK), which adaptively determines the shape of the parallel window that leads to the minimum number of computing cycles for a given convolutional layer and PIM array. By allowing rectangular windows with partial channels, VW-SDK utilizes the PIM array more efficiently, thereby further reducing the number of computing cycles. To remove the inefficient computing cycles caused by residual channels, we extend VW-SDK into VWC-SDK (SDK with variable windows and channels), which additionally performs residual channel pruning. Simulation with a 512×512 PIM array and ResNet-20 shows that VW-SDK improves the inference speed by 1.29× compared to the existing SDK-based algorithm. The results also show that residual channel pruning improves the inference speed of ResNet-20 by up to ∼1.38× compared to the original network without pruning.
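The effect of window shape on the cycle count can be illustrated with a toy model. The sketch below is not the algorithm from the paper: it assumes stride 1, no padding, a square input feature map, and a simplified cost in which each parallel window occupies `pw_h * pw_w` array rows per input channel and one array column per output element per output channel. All function and parameter names are hypothetical.

```python
from math import ceil

def cycles(ifm, k, ic, oc, rows, cols, pw_h, pw_w):
    """Computing cycles for one layer under the simplified model above.

    Returns None when the window does not fit the layer or the array.
    """
    out_h = pw_h - k + 1  # output elements per window, vertically
    out_w = pw_w - k + 1  # output elements per window, horizontally
    if out_h < 1 or out_w < 1 or pw_h > ifm or pw_w > ifm:
        return None
    # Channels that fit into the array at once (partial channels allowed).
    ic_t = min(ic, rows // (pw_h * pw_w))
    oc_t = min(oc, cols // (out_h * out_w))
    if ic_t < 1 or oc_t < 1:
        return None
    out = ifm - k + 1  # output feature map size (stride 1, no padding)
    n_windows = ceil(out / out_h) * ceil(out / out_w)
    return n_windows * ceil(ic / ic_t) * ceil(oc / oc_t)

def best_window(ifm, k, ic, oc, rows, cols, square_only=False):
    """Exhaustively search window shapes; return (cycles, pw_h, pw_w)."""
    best = None
    for pw_h in range(k, ifm + 1):
        for pw_w in range(k, ifm + 1):
            if square_only and pw_h != pw_w:
                continue
            c = cycles(ifm, k, ic, oc, rows, cols, pw_h, pw_w)
            if c is not None and (best is None or c < best[0]):
                best = (c, pw_h, pw_w)
    return best

# Hypothetical layer: 32x32 input, 3x3 kernel, 16 -> 16 channels, 512x512 array.
layer = dict(ifm=32, k=3, ic=16, oc=16, rows=512, cols=512)
baseline = cycles(pw_h=3, pw_w=3, **layer)         # window equal to the kernel
square = best_window(square_only=True, **layer)    # square windows only
variable = best_window(square_only=False, **layer) # rectangular windows allowed
print(baseline, square, variable)
```

Because the rectangular search space contains every square window, the variable-shape result can never be worse than the square-only result in this model. Whether it is strictly better depends on the layer dimensions and the array size.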

Type

Journal article

Publication

IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS)