Deep Learning Model Compression and Acceleration

Deep neural networks (DNNs) are widely deployed on IoT edge devices to enable on-device intelligence, but their storage and computational demands remain the biggest obstacles. Because DNNs contain a large number of synaptic weights, memory footprint is a key challenge, especially on memory-constrained platforms such as mobile systems. Computation is another bottleneck, driven mostly by the large number of multiplications in convolution layers. Reducing the storage and computation demand of neural networks is therefore critical, particularly for supporting in-field and on-chip training and inference. To enable deep-learning-based intelligent multimedia processing on IoT edge devices with limited hardware resources, this research targets neural network designs with lower storage and computation demand.
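To make the scale of these two demands concrete, the short Python sketch below estimates the weight count (storage) and multiply count (computation) of a single convolution layer. The layer dimensions are hypothetical, chosen only to illustrate typical magnitudes; they are not taken from any specific network in this work.

```python
# A minimal sketch of why convolution layers dominate DNN storage and compute.
# All layer shapes below are hypothetical, chosen only for illustration.

def conv_layer_cost(h_out, w_out, c_in, c_out, k):
    """Parameter and multiply counts for one k x k convolution layer."""
    params = c_out * (c_in * k * k + 1)          # weights plus biases
    macs = h_out * w_out * c_out * c_in * k * k  # one multiply per weight per output pixel
    return params, macs

# Example: a 3x3 convolution mapping 256 channels to 256 channels
# on a 56x56 output feature map (roughly VGG-like dimensions).
params, macs = conv_layer_cost(56, 56, 256, 256, 3)
print(f"parameters: {params:,}")   # ~0.59 million weights -> storage demand
print(f"multiplies: {macs:,}")     # ~1.85 billion MACs    -> computation demand
```

Even this single layer requires roughly 1.85 billion multiply-accumulate operations per input, which is why compression and acceleration techniques that reduce weight storage and multiplication count are essential on resource-limited edge hardware.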