postgraduate thesis: Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search
Title | Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search |
---|---|
Authors | Loong, Kam Chi (龍錦賜) |
Advisors | Wong, N; So, HKH |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed.
This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration.
The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology.
Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices. |
Degree | Master of Philosophy |
Subject | Edge computing; Random access memory |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/354726 |
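The abstract's compilation ideas — partitioning a layer's weight matrix across fixed-size crossbar arrays and duplicating weights of bottleneck layers to raise throughput — can be illustrated with a toy sketch. The 128×128 crossbar size, the im2col flattening, the example layer shapes, and the greedy duplication policy below are illustrative assumptions, not the thesis's actual algorithm.

```python
import math

XBAR = 128  # assumed crossbar dimension (rows x columns)

def crossbars_for_layer(out_ch, in_ch, k=1, xbar=XBAR):
    """Number of xbar-by-xbar arrays needed to hold one layer's weights.

    A k x k conv layer's weights are flattened (im2col style) into a
    (k*k*in_ch) x out_ch matrix, then tiled over fixed-size crossbars.
    """
    rows = k * k * in_ch   # input dimension after flattening
    cols = out_ch          # one column per output channel
    return math.ceil(rows / xbar) * math.ceil(cols / xbar)

def duplicate_bottleneck(layer_shapes, total_arrays, xbar=XBAR):
    """Greedy weight duplication: spend leftover arrays on the layer with
    the largest per-copy workload (in_ch * k^2 as a crude latency proxy),
    doubling that layer's throughput with each extra copy."""
    counts = [crossbars_for_layer(o, i, k, xbar) for (o, i, k) in layer_shapes]
    spare = total_arrays - sum(counts)
    dup = [1] * len(layer_shapes)
    while spare > 0:
        # pick the slowest layer: highest workload per existing copy
        idx = max(range(len(layer_shapes)),
                  key=lambda j: layer_shapes[j][1] * layer_shapes[j][2] ** 2 / dup[j])
        if counts[idx] > spare:
            break               # not enough arrays left for another full copy
        spare -= counts[idx]
        dup[idx] += 1
    return counts, dup
```

For a small 3-layer chain with 48 arrays available, the sketch first maps every layer once, then spends the spare arrays on the heaviest layers — the same utilization-versus-latency trade the abstract describes, in miniature.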
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wong, N | - |
dc.contributor.advisor | So, HKH | - |
dc.contributor.author | Loong, Kam Chi | - |
dc.contributor.author | 龍錦賜 | - |
dc.date.accessioned | 2025-03-04T09:30:55Z | - |
dc.date.available | 2025-03-04T09:30:55Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/354726 | - |
dc.description.abstract | Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed. This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration. The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology. Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Edge computing | - |
dc.subject.lcsh | Random access memory | - |
dc.title | Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2025 | - |
dc.identifier.mmsid | 991044911108003414 | - |
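The genetic-algorithm NAS described in the abstract — searching network designs under a hardware constraint such as the number of available crossbar arrays — can be sketched as follows. Everything here is a hypothetical stand-in: the search space (per-layer channel widths of a 4-layer conv chain), the accuracy proxy (sum of widths), the array budget, and the GA operators are illustrative assumptions, not the thesis's framework.

```python
import math
import random

XBAR, BUDGET = 128, 20   # assumed crossbar dimension and total-array budget

def arrays_needed(widths, xbar=XBAR):
    """Crossbar arrays needed to map a chain of 3x3 conv layers (RGB input)."""
    total, in_ch = 0, 3
    for out_ch in widths:
        rows, cols = 9 * in_ch, out_ch   # im2col-flattened weight matrix
        total += math.ceil(rows / xbar) * math.ceil(cols / xbar)
        in_ch = out_ch
    return total

def fitness(widths):
    """Proxy objective: wider nets score higher; over-budget nets are penalized."""
    score, cost = sum(widths), arrays_needed(widths)
    return score if cost <= BUDGET else score - 100 * (cost - BUDGET)

def evolve(generations=30, pop_size=16, seed=0):
    """Tiny GA: truncation selection, one-point crossover, point mutation."""
    rng = random.Random(seed)
    choices = [16, 32, 64, 128]
    pop = [[rng.choice(choices) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]            # one-point crossover
            if rng.random() < 0.3:               # mutation: resize one layer
                child[rng.randrange(len(child))] = rng.choice(choices)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, arrays_needed(best)
```

The key design point mirrored from the abstract is that the hardware metric (array count) enters the fitness function directly, so infeasible architectures are driven out of the population rather than filtered after the search.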