A Many-core Architecture for an Ensemble Ternary Neural Network Toward High-Throughput Inference

Ryota Kayanoma, Akira Jinguji, Hiroki Nakahara

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › peer-reviewed

Abstract

Machine learning is expanding into various applications, such as image processing in data centers. With the spread of deep learning, neural-network-based models have frequently been adopted in recent years. Because machine learning inference on a CPU is slow, high-speed, dedicated hardware accelerators are often used. In particular, demand for hardware accelerators in data centers is increasing, with a need for low power consumption and high-speed processing in a limited space. Here, we propose an implementation method for a ternary neural network that utilizes the rewritable look-up tables (LUTs) of a field-programmable gate array (FPGA). Ternary neural networks (TNNs), quantized to 2 bits, can be realized as LUT-based combinational circuits, allowing inference in a single cycle and thus a very high-speed inference system. Moreover, we reduced the hardware resource usage by 70% by introducing sparsity, i.e., approximating parameters to zero. However, the low-bit representation came at the cost of reduced recognition accuracy. In this paper, we use an ensemble to prevent this accuracy loss, achieving recognition accuracy equivalent to that of the 32-bit float model. We also designed a voting circuit for the ensemble TNN that does not decrease throughput. By implementing it on the AMD Alveo U50 FPGA card, we achieved a high processing speed of 100 Mega Frames Per Second (MFPS). Our FPGA-based system was 1,286 times faster than the CPU and 1,364 times faster than the GPU. Therefore, we achieve a high-speed inference system without compromising recognition accuracy.
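As a rough software sketch of the two ideas the abstract describes (not the authors' hardware implementation; the threshold value, function names, and NumPy formulation are illustrative assumptions), ternarization with a zero band for sparsity and a majority-vote ensemble can be expressed as:

```python
import numpy as np

def ternarize(weights, sparsity_threshold=0.05):
    """Quantize float weights to {-1, 0, +1}.
    Values inside the threshold band are forced to zero, introducing
    the sparsity that lets the corresponding LUT logic be pruned."""
    q = np.zeros_like(weights, dtype=np.int8)
    q[weights > sparsity_threshold] = 1
    q[weights < -sparsity_threshold] = -1
    return q

def ensemble_vote(logits_per_model):
    """Majority vote over per-model class predictions, analogous in
    spirit to a voting circuit combining ensemble TNN outputs."""
    preds = np.stack([np.argmax(l, axis=-1) for l in logits_per_model])
    # preds has shape (n_models, batch); pick the most frequent class
    # per sample across the ensemble members.
    return np.array([np.bincount(col).argmax() for col in preds.T])
```

For example, `ternarize(np.array([0.3, -0.01, -0.4, 0.02]))` yields `[1, 0, -1, 0]`: half the weights become zero and contribute no logic. In hardware, the vote is a small combinational circuit, so it adds latency but not a throughput bottleneck.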

Original language: English
Host publication title: Proceedings - 2023 16th IEEE International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 446-453
Number of pages: 8
ISBN (electronic): 9798350393613
DOI
Publication status: Published - 2023
Event: 16th IEEE International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2023 - Singapore, Singapore
Duration: 18 Dec 2023 - 21 Dec 2023

Publication series

Name: Proceedings - 2023 16th IEEE International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2023

Conference

Conference: 16th IEEE International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2023
Country/Territory: Singapore
City: Singapore
Period: 23/12/18 - 23/12/21
