postgraduate thesis: Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search
Title | Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search |
---|---|
Authors | Loong, Kam Chi (龍錦賜) |
Advisors | Wong, N; So, HKH |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed.
This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration.
The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology.
Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices. |
Degree | Master of Philosophy |
Subject | Edge computing; Random access memory |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/354726 |
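The abstract's compilation ideas — partitioning a layer's weight matrix across fixed-size crossbar arrays and duplicating weights of bottleneck layers to raise throughput — can be illustrated with a toy sketch. The 128×128 crossbar size, the im2col flattening, the example layer shapes, and the greedy duplication policy below are illustrative assumptions, not the thesis's actual algorithm.

```python
import math

XBAR = 128  # assumed crossbar dimension (rows x columns)

def crossbars_for_layer(out_ch, in_ch, k=1, xbar=XBAR):
    """Number of xbar-by-xbar arrays needed to hold one layer's weights.

    A k x k conv layer's weights are flattened (im2col style) into a
    (k*k*in_ch) x out_ch matrix, then tiled over fixed-size crossbars.
    """
    rows = k * k * in_ch   # input dimension after flattening
    cols = out_ch          # one column per output channel
    return math.ceil(rows / xbar) * math.ceil(cols / xbar)

def duplicate_bottleneck(layer_shapes, total_arrays, xbar=XBAR):
    """Greedy weight duplication: spend leftover arrays on the layer with
    the largest per-copy workload (in_ch * k^2 as a crude latency proxy),
    doubling that layer's throughput with each extra copy."""
    counts = [crossbars_for_layer(o, i, k, xbar) for (o, i, k) in layer_shapes]
    spare = total_arrays - sum(counts)
    dup = [1] * len(layer_shapes)
    while spare > 0:
        # pick the slowest layer: highest workload per existing copy
        idx = max(range(len(layer_shapes)),
                  key=lambda j: layer_shapes[j][1] * layer_shapes[j][2] ** 2 / dup[j])
        if counts[idx] > spare:
            break               # not enough arrays left for another full copy
        spare -= counts[idx]
        dup[idx] += 1
    return counts, dup
```

For a small 3-layer chain with 48 arrays available, the sketch first maps every layer once, then spends the spare arrays on the heaviest layers — the same utilization-versus-latency trade the abstract describes, in miniature.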
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Wong, N | - |
dc.contributor.advisor | So, HKH | - |
dc.contributor.author | Loong, Kam Chi | - |
dc.contributor.author | 龍錦賜 | - |
dc.date.accessioned | 2025-03-04T09:30:55Z | - |
dc.date.available | 2025-03-04T09:30:55Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/354726 | - |
dc.description.abstract | Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed. This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration. The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology. Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Edge computing | - |
dc.subject.lcsh | Random access memory | - |
dc.title | Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2025 | - |
dc.identifier.mmsid | 991044911108003414 | - |
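The genetic-algorithm NAS described in the abstract — searching network designs under a hardware constraint such as the number of available crossbar arrays — can be sketched as follows. Everything here is a hypothetical stand-in: the search space (per-layer channel widths of a 4-layer conv chain), the accuracy proxy (sum of widths), the array budget, and the GA operators are illustrative assumptions, not the thesis's framework.

```python
import math
import random

XBAR, BUDGET = 128, 20   # assumed crossbar dimension and total-array budget

def arrays_needed(widths, xbar=XBAR):
    """Crossbar arrays needed to map a chain of 3x3 conv layers (RGB input)."""
    total, in_ch = 0, 3
    for out_ch in widths:
        rows, cols = 9 * in_ch, out_ch   # im2col-flattened weight matrix
        total += math.ceil(rows / xbar) * math.ceil(cols / xbar)
        in_ch = out_ch
    return total

def fitness(widths):
    """Proxy objective: wider nets score higher; over-budget nets are penalized."""
    score, cost = sum(widths), arrays_needed(widths)
    return score if cost <= BUDGET else score - 100 * (cost - BUDGET)

def evolve(generations=30, pop_size=16, seed=0):
    """Tiny GA: truncation selection, one-point crossover, point mutation."""
    rng = random.Random(seed)
    choices = [16, 32, 64, 128]
    pop = [[rng.choice(choices) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]            # one-point crossover
            if rng.random() < 0.3:               # mutation: resize one layer
                child[rng.randrange(len(child))] = rng.choice(choices)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, arrays_needed(best)
```

The key design point mirrored from the abstract is that the hardware metric (array count) enters the fitness function directly, so infeasible architectures are driven out of the population rather than filtered after the search.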