[J33] GAROS: Genetic Algorithm-Aided Row-Skipping for Shift and Duplicate Kernel Mapping in Processing-In-Memory Architectures

Abstract

Processing-in-memory (PIM) architecture is becoming a promising candidate for convolutional neural network (CNN) inference. A recent mapping method, shift and duplicate kernel (SDK), reduces latency by improving array utilization through shifting copies of the same kernel into idle columns. Although pattern-based pruning effectively enables row-skipping, traditional pattern designs are suboptimal for SDK mapping because its irregular kernel shifts complicate row-skipping. To address this, we previously proposed pruning-aided row-skipping (PAIRS), which adopts SDK-optimized layer-wise patterns. However, PAIRS has two key limitations: it offers only discrete levels of row-skipping because it uses a single pattern set, which restricts precise control over weight matrix compression across varying layer and array sizes, and it risks accuracy loss by pruning critical weights. To overcome these challenges, we introduce genetic algorithm-aided row-skipping (GAROS), which employs input channel (IC)-wise patterns. GAROS enables finer control over row-skipping by assigning several pattern sets and selecting the optimal pattern for each IC so that critical weights are preserved. Consequently, this approach enables continuous weight matrix compression while balancing the trade-off between row-skipping and accuracy. Simulation results on WRN16-4 demonstrate that GAROS improves accuracy by up to 2.4% compared to PAIRS and achieves up to a 1.74x speedup over the baseline when a 128x128 sub-array is used.
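
As a rough illustration of why fine-grained control over the number of skipped rows matters, the sketch below uses a toy latency model that is an assumption of this example and is not taken from the paper: the rows of a mapped weight matrix are split into tiles of `array_rows` rows, and latency is taken to be proportional to the number of tiles. The function names and the 576-row layer are hypothetical.

```python
import math


def row_tiles(active_rows: int, array_rows: int = 128) -> int:
    """Row tiles needed to hold `active_rows` rows (toy model, assumed)."""
    return math.ceil(active_rows / array_rows)


def speedup(total_rows: int, skipped_rows: int, array_rows: int = 128) -> float:
    """Ratio of row tiles before and after skipping rows."""
    before = row_tiles(total_rows, array_rows)
    after = row_tiles(total_rows - skipped_rows, array_rows)
    return before / after


# Hypothetical layer whose mapped weight matrix has 576 rows.
print(speedup(576, 32))  # 544 rows still need 5 tiles, so no gain
print(speedup(576, 64))  # 512 rows fit in 4 tiles instead of 5
```

Under this assumed model, skipping rows only pays off once the remaining row count crosses a tile boundary, so a scheme that can hit an arbitrary row count has more freedom to land exactly on such a boundary than one limited to a few discrete compression levels.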

Publication
Journal of Systems Architecture 2025 (JCR Q1)
Johnny Rhe (이존이)
Combined MS-PhD student
Kang Eun Jeon (전강은)
Post-doctoral researcher