In-Hardware Training Chip Based on CMOS Invertible Logic for Machine Learning

Naoya Onizawa, Sean C. Smithson, Brett H. Meyer, Warren J. Gross, Takahiro Hanyu

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)


Deep Neural Networks (DNNs) have recently shown state-of-the-art results on various applications, such as computer vision and recognition tasks. DNN inference engines can be implemented in hardware with high energy efficiency, as the computation can be realized using low-precision fixed-point or even binary precision with sufficient cognition accuracy. On the other hand, training DNNs using the well-known back-propagation algorithm requires high-precision floating-point computations on a CPU and/or GPU, causing significant power dissipation (more than hundreds of kW) and long training time (several days or more). In this paper, we demonstrate a training chip for machine learning fabricated using a commercial 65-nm CMOS technology. The chip performs training without back-propagation by using invertible logic with stochastic computing, which can directly obtain weight values from input/output training data at a low precision suitable for inference. When training neurons that compute the weighted sum of all inputs and then apply a non-linear activation function, our chip demonstrates a reduction of power dissipation and latency by 99.98% and 99.95%, respectively, in comparison with a state-of-the-art software implementation.
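As background on the stochastic computing mentioned in the abstract, the following is a minimal software sketch of the standard bipolar encoding, in which a value x in [-1, 1] is represented by a bit stream with P(bit = 1) = (x + 1) / 2 and multiplication reduces to a bitwise XNOR of two independent streams. It is not the chip's circuit or the authors' training method; the function names, stream length, and seed are assumptions chosen for illustration.

```python
import random


def encode_bipolar(x, length, rng):
    """Encode x in [-1, 1] as a bit stream with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode_bipolar(bits):
    """Recover the encoded value from the fraction of ones in the stream."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def sc_multiply(a_bits, b_bits):
    """Bipolar multiplication: bitwise XNOR of two independent streams."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]


# Illustrative values only (hypothetical, not taken from the paper).
rng = random.Random(0)
n = 100_000
a_bits = encode_bipolar(0.5, n, rng)
b_bits = encode_bipolar(-0.4, n, rng)
product = decode_bipolar(sc_multiply(a_bits, b_bits))
print(product)  # an estimate near 0.5 * -0.4 = -0.2
```

The estimate carries sampling noise that shrinks as the stream gets longer, which is the usual trade-off between precision and latency in stochastic computing.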

Original language: English
Article number: 8946714
Pages (from-to): 1541-1550
Number of pages: 10
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers
Issue number: 5
Publication status: Published - May 2020


  • digital circuits
  • neural networks
  • stochastic computing

