Postgraduate thesis: Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search

Title: Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search
Authors: Loong, Kam Chi (龍錦賜)
Advisor(s): Wong, N; So, HKH
Issue Date: 2024
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed. This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration. The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology. Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices.
Degree: Master of Philosophy
Subject: Edge computing; Random access memory
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/354726
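The abstract describes partitioning DNN layers across fixed-size ReRAM crossbar arrays and duplicating weights to raise utilization and parallelism. A minimal sketch of the underlying accounting, assuming 128x128 arrays and a greedy duplication heuristic (the function names, array size, and duplication policy are illustrative assumptions, not the thesis's actual compilation framework):

```python
import math

def crossbars_needed(rows, cols, xbar=128):
    """Number of xbar-by-xbar arrays needed to tile a rows x cols weight matrix."""
    return math.ceil(rows / xbar) * math.ceil(cols / xbar)

def utilization(rows, cols, xbar=128):
    """Fraction of allocated crossbar cells that actually hold weights."""
    return (rows * cols) / (crossbars_needed(rows, cols, xbar) * xbar * xbar)

def duplicate_to_budget(layers, budget, xbar=128):
    """Greedily duplicate the layer with the most work per copy while arrays
    remain, mimicking weight duplication to boost parallelism.
    layers: list of (name, rows, cols) tuples."""
    copies = {name: 1 for name, _, _ in layers}
    used = sum(crossbars_needed(r, c, xbar) for _, r, c in layers)
    while True:
        # pick the layer with the highest per-copy workload (rows*cols / copies)
        name, r, c = max(layers, key=lambda l: l[1] * l[2] / copies[l[0]])
        cost = crossbars_needed(r, c, xbar)
        if used + cost > budget:
            return copies, used
        copies[name] += 1
        used += cost
```

For example, a 300x300 layer on 128x128 arrays needs ceil(300/128)^2 = 9 arrays, at 90000 / (9 * 128 * 128), roughly 61% utilization, which is the kind of fragmentation that layer partitioning and network packing aim to reduce.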


DC Field: Value
dc.contributor.advisor: Wong, N
dc.contributor.advisor: So, HKH
dc.contributor.author: Loong, Kam Chi
dc.contributor.author: 龍錦賜
dc.date.accessioned: 2025-03-04T09:30:55Z
dc.date.available: 2025-03-04T09:30:55Z
dc.date.issued: 2024
dc.identifier.citation: Loong, K. C. [龍錦賜]. (2024). Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
dc.identifier.uri: http://hdl.handle.net/10722/354726
dc.description.abstract: Deep neural networks (DNNs) underpin many advanced applications, with performance often scaling with size and complexity. Conventional computing architectures struggle with the demands of DNNs, leading to inefficiencies and higher costs. Edge computing mitigates latency, enhances privacy, and conserves bandwidth by processing data near the data sources, despite having less computational power than cloud systems. Recent advancements in AI accelerators, particularly Resistive Random-Access Memory (ReRAM) arrays, have shown promise for in-memory computing (IMC), significantly boosting AI computation speed. This thesis presents a novel edge compilation framework and ReRAM-aware neural architecture search (NAS) to optimize DNN deployment on resource-constrained edge devices. This research explores how layer partitioning, weight duplication, and network packing can optimize ReRAM crossbar array utilization. The goal is to improve computational efficiency and resource management for a given network and hardware configuration. The ReRAM-aware NAS framework utilizes genetic algorithms to discover optimal neural network designs while considering hardware constraints, such as the number of available crossbar arrays. This method facilitates high-performance DNN implementation on ReRAM-based accelerators, achieving a delicate balance between accuracy and efficiency. By incorporating ReRAM-specific metrics into the NAS process, our approach ensures that the resulting neural architectures are optimized for edge ReRAM technology. Comprehensive evaluations demonstrate that our solutions surpass existing methods in crossbar utilization, latency reduction, and energy efficiency. This work represents a significant advance in practical and efficient edge AI deployment, harnessing ReRAM technology to accelerate DNN computations on resource-limited edge devices.
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Edge computing
dc.subject.lcsh: Random access memory
dc.title: Edge AI with ReRAM-based in-memory computing and ReRAM-aware neural architecture search
dc.type: PG_Thesis
dc.description.thesisname: Master of Philosophy
dc.description.thesislevel: Master
dc.description.thesisdiscipline: Electrical and Electronic Engineering
dc.description.nature: published_or_final_version
dc.date.hkucongregation: 2025
dc.identifier.mmsid: 991044911108003414
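The record's abstract also describes a genetic-algorithm NAS that searches architectures under a crossbar-count constraint. A toy sketch under stated assumptions: the gene encoding (per-layer widths of a 784-input MLP), the fitness proxy, and the penalty term are all hypothetical; the actual framework would score candidates by trained accuracy and ReRAM-specific metrics.

```python
import math
import random

XBAR, BUDGET = 128, 16          # assumed 128x128 arrays, 16-array budget
WIDTHS = [32, 64, 128, 256]     # candidate layer widths (the "genes")

def cost(widths):
    """Crossbars consumed by a chain of fully connected layers."""
    rows = [784] + widths[:-1]  # assume a 784-input MLP
    return sum(math.ceil(r / XBAR) * math.ceil(w / XBAR)
               for r, w in zip(rows, widths))

def fitness(widths):
    """Proxy score: wider networks score higher; infeasible ones are penalized."""
    score = sum(widths)                 # stand-in for validation accuracy
    over = max(0, cost(widths) - BUDGET)
    return score - 1000 * over          # hard penalty for exceeding the budget

def evolve(pop_size=20, genes=3, generations=30, seed=0):
    """Elitist GA: keep the top half, refill with one-point crossover + mutation."""
    rng = random.Random(seed)
    pop = [[rng.choice(WIDTHS) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genes)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:              # mutation
                child[rng.randrange(genes)] = rng.choice(WIDTHS)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return best, cost(best)
```

The hard penalty steers the search toward architectures that fit the array budget, which is the same accuracy-versus-hardware-feasibility trade-off the abstract describes.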
