Compute-in-memory (CIM) architectures are becoming increasingly important for real-time, low-power deep neural network (DNN) inference because they perform matrix-vector multiplications directly within memory. However, conventional CIM systems often struggle to balance efficient data processing with the computational demands of large-scale DNNs, particularly when the data are sparse: they tend to perform unnecessary computations, which reduces throughput and increases energy consumption. In this paper, we propose a novel CIM architecture that eliminates redundant multi-bit operations, avoiding unnecessary calculations and making better use of array resources. By leveraging a new numerical system tailored to CIM array operations, we reduce computing cycles by 50% compared with conventional methods, at an accuracy loss of only 1%.
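To make the intuition of skipping ineffectual work concrete, the sketch below simulates a generic bit-serial CIM-style matrix-vector multiply and counts how many array cycles are issued when all-zero activation bit-planes are skipped versus processed unconditionally. This is a minimal illustration under assumed conventions (unsigned bit-serial activations, one cycle per surviving bit-plane, a hypothetical `skip_zero_slices` flag); it is not the numerical system or hardware model proposed in this paper.

```python
import numpy as np

def bitserial_mvm_cycles(weights, activations, act_bits=8, skip_zero_slices=True):
    """Toy cycle model for a bit-serial CIM matrix-vector multiply.

    Illustrative assumptions only (not the proposed architecture):
      - activations are unsigned integers streamed one bit-plane per cycle;
      - a bit-plane that is entirely zero can be detected and skipped;
      - weights reside in the array, so each surviving bit-plane costs one cycle.
    """
    acts = np.asarray(activations, dtype=np.int64)
    result = np.zeros(weights.shape[0], dtype=np.int64)
    cycles = 0
    for b in range(act_bits):
        bit_plane = (acts >> b) & 1            # one bit-slice of every activation
        if skip_zero_slices and not bit_plane.any():
            continue                           # ineffectual slice: no array access issued
        result += (weights @ bit_plane) << b   # accumulate partial products, shifted by bit weight
        cycles += 1
    return result, cycles

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.integers(-4, 5, size=(16, 64))
    # Sparse, low-magnitude activations leave many bit-planes empty.
    x = rng.integers(0, 4, size=64) * (rng.random(64) < 0.3)
    y_skip, c_skip = bitserial_mvm_cycles(W, x, skip_zero_slices=True)
    y_full, c_full = bitserial_mvm_cycles(W, x, skip_zero_slices=False)
    assert np.array_equal(y_skip, y_full)      # skipping changes cycle count, not the result
    print(f"cycles with skipping: {c_skip}, without: {c_full}")
```

Under these assumptions the skipped and unskipped variants produce identical results, and the cycle savings grow with activation sparsity, which is the general effect the proposed architecture targets.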