Links for fulltext (may require subscription):
- Publisher website (DOI): 10.1109/TCAD.2025.3595830
- Scopus: eid_2-s2.0-105013344940
Citations:
- Scopus: 0
Article: Binary Weight Multi-Bit Activation Quantization for Compute-in-Memory CNN Accelerators
| Title | Binary Weight Multi-Bit Activation Quantization for Compute-in-Memory CNN Accelerators |
|---|---|
| Authors | Zhou, Wenyong; Liu, Zhengwu; Ren, Yuan; Wong, Ngai |
| Keywords | Compute-in-Memory; FeFET; Model Quantization; RRAM; SRAM |
| Issue Date | 1-Jan-2025 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025 |
| Abstract | Compute-in-memory (CIM) accelerators have emerged as a promising way to enhance the energy efficiency of convolutional neural networks (CNNs). Deploying CNNs on CIM platforms generally requires quantization of network weights and activations to meet hardware constraints. However, existing approaches either prioritize hardware efficiency with binary weight and activation quantization at the cost of accuracy, or utilize multi-bit weights and activations for greater accuracy but limited efficiency. In this paper, we introduce a novel binary weight multi-bit activation (BWMA) method for CNNs on CIM-based accelerators. Our contributions include: deriving closed-form solutions for weight quantization in each layer, significantly improving the representational capabilities of binarized weights; and developing a differentiable function for activation quantization, approximating the ideal multi-bit function while bypassing the extensive search for optimal settings. Through comprehensive experiments on the CIFAR-10 and ImageNet datasets, we show that BWMA achieves notable accuracy improvements over existing methods, registering gains of 1.44%-5.46% and 0.35%-5.37% on the respective datasets. Moreover, hardware simulation results indicate that 4-bit activation quantization strikes the optimal balance between hardware cost and model performance. |
| Persistent Identifier | http://hdl.handle.net/10722/362535 |
| ISSN | 0278-0070 (print); 1937-4151 (online). 2023 Impact Factor: 2.7; 2023 SCImago Journal Rankings: 0.957 |
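
The abstract above names two ingredients: a per-layer closed-form solution for binary weight quantization and a differentiable approximation of a multi-bit activation quantizer. The article derives the exact formulations; the sketch below is only a rough illustration of the general idea, using common stand-ins rather than the paper's BWMA equations: an XNOR-Net-style closed-form scale (alpha = E[|W|]) with a straight-through estimator for the weights, and a sum-of-sigmoids soft staircase for the activations. The class names, the temperature parameter, and the [0, 1] activation range are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: binary-weight quantization with a per-layer
# closed-form scale (alpha = E[|W|], XNOR-Net style) and a smooth multi-bit
# activation quantizer built from shifted sigmoids. This is NOT the paper's
# exact BWMA formulation, which derives its own closed-form solutions and
# differentiable activation function.
import torch
import torch.nn as nn


class BinaryWeight(nn.Module):
    """Binarize weights to {-alpha, +alpha} with a closed-form per-layer scale."""

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        alpha = w.abs().mean()              # closed-form scale for the whole layer
        w_bin = alpha * torch.sign(w)
        # Straight-through estimator: forward pass uses w_bin, backward pass
        # routes gradients to the latent full-precision weights w.
        return w + (w_bin - w).detach()


class SoftMultiBitActivation(nn.Module):
    """Differentiable k-bit activation quantizer approximated by a sum of
    shifted sigmoids (an assumed stand-in for the ideal staircase)."""

    def __init__(self, bits: int = 4, temperature: float = 10.0):
        super().__init__()
        self.levels = 2 ** bits - 1         # number of soft steps
        self.t = temperature                # sharpness of each step

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.clamp(x, 0.0, 1.0)        # assume activations already in [0, 1]
        steps = torch.arange(1, self.levels + 1, device=x.device) / (self.levels + 1)
        # The sum of sigmoids approximates the ideal multi-bit staircase while
        # remaining differentiable end to end.
        soft = torch.sigmoid(self.t * (x.unsqueeze(-1) - steps)).sum(-1) / self.levels
        return soft


if __name__ == "__main__":
    w = torch.randn(16, 8, 3, 3)            # dummy conv weights
    x = torch.rand(4, 8, 32, 32)            # dummy activations
    w_q = BinaryWeight()(w)                 # binarized weights (STE in backward)
    a_q = SoftMultiBitActivation(bits=4)(x) # soft 4-bit activations
    print(w_q.shape, a_q.shape)
```

In this kind of setup, the 4-bit activation setting mirrors the abstract's observation that 4-bit activations balance hardware cost against model accuracy; the temperature would typically be annealed during training so the soft quantizer approaches a hard staircase.
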
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zhou, Wenyong | - |
| dc.contributor.author | Liu, Zhengwu | - |
| dc.contributor.author | Ren, Yuan | - |
| dc.contributor.author | Wong, Ngai | - |
| dc.date.accessioned | 2025-09-26T00:35:59Z | - |
| dc.date.available | 2025-09-26T00:35:59Z | - |
| dc.date.issued | 2025-01-01 | - |
| dc.identifier.citation | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025 | - |
| dc.identifier.issn | 0278-0070 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362535 | - |
| dc.description.abstract | Compute-in-memory (CIM) accelerators have emerged as a promising way to enhance the energy efficiency of convolutional neural networks (CNNs). Deploying CNNs on CIM platforms generally requires quantization of network weights and activations to meet hardware constraints. However, existing approaches either prioritize hardware efficiency with binary weight and activation quantization at the cost of accuracy, or utilize multi-bit weights and activations for greater accuracy but limited efficiency. In this paper, we introduce a novel binary weight multi-bit activation (BWMA) method for CNNs on CIM-based accelerators. Our contributions include: deriving closed-form solutions for weight quantization in each layer, significantly improving the representational capabilities of binarized weights; and developing a differentiable function for activation quantization, approximating the ideal multi-bit function while bypassing the extensive search for optimal settings. Through comprehensive experiments on the CIFAR-10 and ImageNet datasets, we show that BWMA achieves notable accuracy improvements over existing methods, registering gains of 1.44%-5.46% and 0.35%-5.37% on the respective datasets. Moreover, hardware simulation results indicate that 4-bit activation quantization strikes the optimal balance between hardware cost and model performance. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | - |
| dc.subject | Compute-in-Memory | - |
| dc.subject | FeFET | - |
| dc.subject | Model Quantization | - |
| dc.subject | RRAM | - |
| dc.subject | SRAM | - |
| dc.title | Binary Weight Multi-Bit Activation Quantization for Compute-in-Memory CNN Accelerators | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TCAD.2025.3595830 | - |
| dc.identifier.scopus | eid_2-s2.0-105013344940 | - |
| dc.identifier.eissn | 1937-4151 | - |
| dc.identifier.issnl | 0278-0070 | - |
