Clustered Network Adaptation Methodology for the Resource Constrained Platform
Verma, Pratibha and Sasmal, Pradip and Pal, Chandrajit and Channappayya, Sumohana and Acharyya, Amit (2022) Clustered Network Adaptation Methodology for the Resource Constrained Platform. In: 20th IEEE International Interregional NEWCAS Conference, NEWCAS 2022, 19 June 2022 through 22 June 2022, Quebec City.
NEWCAS_2022.pdf - Published Version (restricted to registered users)
Abstract
Recently, deep neural networks (DNNs) have come into wide use for many artificial intelligence (AI) applications, including robotics and self-driving vehicles. Although DNNs deliver state-of-the-art accuracy on many AI tasks, they are both computationally and memory intensive. This hinders the deployment of DNNs on mobile and IoT edge devices with limited hardware resources and power budgets. Traditional DNN compression is applied during training to obtain an efficient inference engine. When the inference engine runs on a hardware platform constrained by battery backup, it brings the additional challenge of reducing complexity (e.g., memory requirement, area on hardware). To reduce memory complexity, we propose a new low-complexity methodology, named the "Clustering Algorithm", to eliminate the redundancies present within the filter coefficients (i.e., weights). The algorithm is a three-stage pipeline: quantization, coefficient clustering, and code assignment, which work together to reduce the memory storage of neural networks at the cost of a low run-time memory requirement. To show the efficacy of the proposed "Clustering Algorithm", we executed it on the network obtained after applying the NetAdapt algorithm, a popular inference engine, to pre-trained AlexNet on the CIFAR-10 dataset. We observe that our "Clustering Algorithm" reduces the storage required by five layers of the already pruned, pre-trained AlexNet by 3x (approximately 65%) on FPGA and 2x (approximately 51%) on CPU. This allows the model to fit into on-chip SRAM cache rather than off-chip DRAM memory. © 2022 IEEE.
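The abstract describes a three-stage pipeline (quantization, coefficient clustering, code assignment) but gives no implementation details. The following is a minimal, hypothetical sketch of that idea using a k-means-style clustering of weights in NumPy; the function name, cluster count, and bit widths are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def cluster_weights(weights, n_clusters=16, n_iters=20):
    """Hypothetical weight-clustering sketch in the spirit of the paper's
    quantize -> cluster -> code-assign pipeline (assumed, not the authors' code).

    Each weight is replaced by an index into a small codebook, so per-weight
    storage drops from 32 bits to log2(n_clusters) bits plus the codebook.
    """
    flat = weights.ravel()
    # Stage 1 (quantization): initialize centroids uniformly over the weight range.
    codebook = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iters):
        # Stage 2 (coefficient clustering): assign each weight to its nearest centroid.
        codes = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            members = flat[codes == k]
            if members.size:
                codebook[k] = members.mean()  # move centroid to cluster mean
    # Stage 3 (code assignment): store small integer codes instead of fp32 weights.
    codes = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return codebook, codes.reshape(weights.shape)

# Example: a 64-filter layer of 3x3 coefficients (random stand-in data).
w = np.random.default_rng(1).standard_normal((64, 3, 3)).astype(np.float32)
codebook, codes = cluster_weights(w, n_clusters=16)
original_bits = w.size * 32
compressed_bits = w.size * 4 + codebook.size * 32  # 4-bit codes + fp32 codebook
print(f"compression ratio: {original_bits / compressed_bits:.1f}x")
```

With 16 clusters the codes need only 4 bits each, which is where the storage reduction of the kind reported in the abstract (fitting weights into on-chip SRAM) would come from; the actual ratios depend on the layer sizes and cluster counts the authors chose.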
| IITH Creators: | |
|---|---|
| Item Type: | Conference or Workshop Item (Paper) |
| Additional Information: | VI. ACKNOWLEDGEMENT This work is partially supported by the grant received from Ministry of Electronics and Information Technology (MEITY), Government of India. |
| Uncontrolled Keywords: | Clustering; Convolutional Neural Network; Deep Learning; Memory Complexity; Model Compression |
| Subjects: | Electrical Engineering |
| Divisions: | Department of Electrical Engineering |
| Depositing User: | . LibTrainee 2021 |
| Date Deposited: | 07 Oct 2022 11:53 |
| Last Modified: | 07 Oct 2022 11:53 |
| URI: | http://raiithold.iith.ac.in/id/eprint/10851 |
| Publisher URL: | http://doi.org/10.1109/NEWCAS52662.2022.9842117 |
| Related URLs: | |