Abstract
Long Short-Term Memory (LSTM) networks, one of the most widely used Recurrent Neural Network (RNN) structures, offer high accuracy for sequence learning tasks. However, it is challenging to achieve low latency and high throughput for computationally expensive LSTM operations while simultaneously satisfying low power constraints. This work offers a two-pronged approach to accelerating RNN inference. First, a linear quantization technique is applied to reduce the complexity of operations, power consumption, and required memory resources. Second, a new activation implementation method, called lookupx, is proposed to accelerate sigmoid function computation during inference. It is shown that lowering input precision to 4-bit integers causes only 2% accuracy loss, and that the lookupx activation methodology delivers 1.9x better performance and 50x lower power consumption while reducing the required chip area by 1.2x compared to integer-domain activation functions with the same accuracy.
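To illustrate the two ideas the abstract describes, the sketch below shows symmetric linear quantization of pre-activations to 4-bit integers and a sigmoid evaluated from a precomputed lookup table indexed by the quantized code. This is a minimal NumPy sketch under assumed conventions; the function names (`linear_quantize`, `build_sigmoid_lut`, `lut_sigmoid`) and the symmetric scaling are illustrative assumptions, not the paper's lookupx implementation.

```python
import numpy as np

def linear_quantize(x, num_bits=4):
    """Uniformly quantize a float array to signed integers of the given width.

    Returns the integer codes and the scale needed to map codes back to reals.
    (Illustrative helper; the paper's exact quantization scheme may differ.)
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), qmin, qmax).astype(np.int8)
    return q, scale

def build_sigmoid_lut(num_bits=4, scale=1.0):
    """Precompute sigmoid for every representable quantized input.

    With 4-bit inputs there are only 16 entries, so the table fits in a
    tiny ROM or a handful of registers on an accelerator.
    """
    codes = np.arange(-(2 ** (num_bits - 1)), 2 ** (num_bits - 1))
    return 1.0 / (1.0 + np.exp(-codes * scale))

def lut_sigmoid(q, lut, num_bits=4):
    """Evaluate sigmoid by table lookup: index = quantized code + offset."""
    offset = 2 ** (num_bits - 1)
    return lut[q.astype(np.int32) + offset]

# Example: quantize pre-activations, then apply the lookup-based sigmoid.
x = np.array([-2.5, -0.3, 0.0, 1.7, 3.1])
q, scale = linear_quantize(x, num_bits=4)
lut = build_sigmoid_lut(num_bits=4, scale=scale)
print(q)                   # 4-bit integer codes
print(lut_sigmoid(q, lut)) # approximate sigmoid(x)
```

Because a 4-bit input admits only 16 distinct codes, the entire nonlinearity collapses into a single table read, which is the kind of saving in latency, power, and area that a lookup-based activation targets.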
| Original language | English |
|---|---|
| Title of host publication | ICECS 2023 - 2023 30th IEEE International Conference on Electronics, Circuits and Systems |
| Subtitle of host publication | Technosapiens for Saving Humanity |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9798350326499 |
| DOIs | |
| Publication status | Published - 2023 |
| Event | 30th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2023 - Istanbul, Turkey. Duration: 4 Dec 2023 → 7 Dec 2023 |
Publication series
| Name | ICECS 2023 - 2023 30th IEEE International Conference on Electronics, Circuits and Systems: Technosapiens for Saving Humanity |
|---|---|
Conference
| Conference | 30th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2023 |
|---|---|
| Country/Territory | Turkey |
| City | Istanbul |
| Period | 4/12/23 → 7/12/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- accelerator
- low power
- LSTM
- nonlinear activation functions
- quantization
- RNN