[C62] Row-Efficient Pruning for In-Memory Convolutional Weight Mapping

Abstract

In this paper, we propose a mapping-aware weight pruning method for in-memory computing (IMC) architectures that operate on a row-by-row basis. The proposed method can dynamically skip unnecessary row operations to minimize energy consumption while maintaining the inference accuracy of the pre-trained model. To achieve this, it calculates the importance of each weight element with the weight mapping method taken into account, which helps preserve the critical weights of the pre-trained model and thereby minimizes the loss in inference accuracy. The simulation results demonstrate that our method not only enhances scalability by allowing users to select the number of row operations to be skipped, but also achieves higher inference accuracy than existing pruning methods on both the ResNet-20 and WRN16-4 networks.
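
The general idea of row-level pruning under a weight mapping can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the im2col-style mapping (one array row per input-channel and kernel position, one column per output channel), the L1-norm row score, and the helper names `map_to_rows`, `row_importance`, and `prune_rows` are all hypothetical. The paper's actual importance metric is not described in the abstract.

```python
def map_to_rows(weights):
    """Flatten conv weights indexed [c_out][c_in][kh][kw] into array rows.

    Assumed mapping (hypothetical): one row per (c_in, kh, kw) position,
    one column per output channel.
    """
    c_out = len(weights)
    c_in = len(weights[0])
    kh = len(weights[0][0])
    kw = len(weights[0][0][0])
    rows = []
    for ci in range(c_in):
        for y in range(kh):
            for x in range(kw):
                rows.append([weights[co][ci][y][x] for co in range(c_out)])
    return rows


def row_importance(rows):
    """Score each row by the sum of absolute weights (assumed L1 metric)."""
    return [sum(abs(w) for w in row) for row in rows]


def prune_rows(rows, n_skip):
    """Zero out the n_skip least important rows.

    Returns the pruned rows and the sorted indices of the rows whose
    row operations could then be skipped.
    """
    scores = row_importance(rows)
    order = sorted(range(len(rows)), key=lambda i: scores[i])
    skipped = set(order[:n_skip])
    pruned = [
        [0.0] * len(row) if i in skipped else list(row)
        for i, row in enumerate(rows)
    ]
    return pruned, sorted(skipped)


# Toy layer: 2 output channels, 1 input channel, 2x2 kernel.
weights = [
    [[[0.5, -0.1], [0.0, 0.9]]],
    [[[-0.4, 0.05], [0.2, -0.7]]],
]
rows = map_to_rows(weights)
pruned, skipped = prune_rows(rows, n_skip=2)
print(skipped)  # indices of rows that no longer need to be activated
```

Here `n_skip` plays the role of the user-selected number of row operations to skip; in this toy layer the two rows with the smallest scores are zeroed and the remaining two are kept unchanged.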

Publication
International SoC Design Conference 2024
Johnny Rhe (이존이)
Combined MS-PhD student