
Article: Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network

Title: Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network
Authors: Wang, M; Lee, KCM; Chung, BMF; Bogaraju, SV; Ng, HC; Wong, JS; Shum, HC; Tsia, KK; So, HKH
Keywords: Cell image classification; convolutional neural network (CNN); field-programmable gate array (FPGA); hardware architecture; low-latency inference
Issue Date: 2021
Publisher: Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=72
Citation: IEEE Transactions on Neural Networks and Learning Systems, 2021, Epub 2021-01-12
Abstract: Real-time in situ image analytics impose stringent latency requirements on intelligent neural network inference operations. While conventional software-based implementations on graphics processing unit (GPU)-accelerated platforms are flexible and have achieved very high inference throughput, they are not suitable for latency-sensitive applications where real-time feedback is needed. Here, we demonstrate that high-performance reconfigurable computing platforms based on field-programmable gate array (FPGA) processing can successfully bridge the gap between low-level hardware processing and high-level intelligent image analytics algorithm deployment within a unified system. The proposed design performs inference operations on a stream of individual images as they are produced and has a deeply pipelined hardware design that allows all layers of a quantized convolutional neural network (QCNN) to compute concurrently with partial image inputs. Using the case of label-free classification of human peripheral blood mononuclear cell (PBMC) subtypes as a proof-of-concept illustration, our system achieves an ultralow classification latency of 34.2 μs with over 95% end-to-end accuracy by using a QCNN, while the cells are imaged at a throughput exceeding 29,200 cells/s. Our QCNN design is modular and is readily adaptable to other QCNNs with different latency and resource requirements.
Description: Hybrid open access
Persistent Identifier: http://hdl.handle.net/10722/296324
ISSN: 2162-237X
2020 Impact Factor: 10.451
2015 SCImago Journal Rankings: 3.181


DC Field | Value | Language
dc.contributor.author | WANG, M | -
dc.contributor.author | LEE, KCM | -
dc.contributor.author | CHUNG, BMF | -
dc.contributor.author | Bogaraju, SV | -
dc.contributor.author | Ng, HC | -
dc.contributor.author | Wong, JS | -
dc.contributor.author | Shum, HC | -
dc.contributor.author | Tsia, KK | -
dc.contributor.author | So, HKH | -
dc.date.accessioned | 2021-02-22T04:53:41Z | -
dc.date.available | 2021-02-22T04:53:41Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | IEEE Transactions on Neural Networks and Learning Systems, 2021, Epub 2021-01-12 | -
dc.identifier.issn | 2162-237X | -
dc.identifier.uri | http://hdl.handle.net/10722/296324 | -
dc.description | Hybrid open access | -
dc.description.abstract | Real-time in situ image analytics impose stringent latency requirements on intelligent neural network inference operations. While conventional software-based implementations on graphics processing unit (GPU)-accelerated platforms are flexible and have achieved very high inference throughput, they are not suitable for latency-sensitive applications where real-time feedback is needed. Here, we demonstrate that high-performance reconfigurable computing platforms based on field-programmable gate array (FPGA) processing can successfully bridge the gap between low-level hardware processing and high-level intelligent image analytics algorithm deployment within a unified system. The proposed design performs inference operations on a stream of individual images as they are produced and has a deeply pipelined hardware design that allows all layers of a quantized convolutional neural network (QCNN) to compute concurrently with partial image inputs. Using the case of label-free classification of human peripheral blood mononuclear cell (PBMC) subtypes as a proof-of-concept illustration, our system achieves an ultralow classification latency of 34.2 μs with over 95% end-to-end accuracy by using a QCNN, while the cells are imaged at a throughput exceeding 29,200 cells/s. Our QCNN design is modular and is readily adaptable to other QCNNs with different latency and resource requirements. | -
dc.language | eng | -
dc.publisher | Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=72 | -
dc.relation.ispartof | IEEE Transactions on Neural Networks and Learning Systems | -
dc.rights | IEEE Transactions on Neural Networks and Learning Systems. Copyright © Institute of Electrical and Electronics Engineers. | -
dc.rights | ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | Cell image classification | -
dc.subject | convolutional neural network (CNN) | -
dc.subject | field-programmable gate array (FPGA) | -
dc.subject | hardware architecture | -
dc.subject | low-latency inference | -
dc.title | Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network | -
dc.type | Article | -
dc.identifier.email | Wong, JS: jsjwong@hku.hk | -
dc.identifier.email | Shum, HC: ashum@hku.hk | -
dc.identifier.email | Tsia, KK: tsia@hku.hk | -
dc.identifier.email | So, HKH: hso@eee.hku.hk | -
dc.identifier.authority | Shum, HC=rp01439 | -
dc.identifier.authority | Tsia, KK=rp01389 | -
dc.identifier.authority | So, HKH=rp00169 | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/TNNLS.2020.3046452 | -
dc.identifier.pmid | 33434136 | -
dc.identifier.scopus | eid_2-s2.0-85099543895 | -
dc.identifier.hkuros | 321305 | -
dc.identifier.volume | Epub 2021-01-12 | -
dc.publisher.place | United States | -
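The abstract above centers on inference with a quantized CNN (QCNN), where weights and activations are mapped to low-bit integers so that FPGA logic can replace floating-point arithmetic. As a rough illustration of the underlying idea only, here is a minimal sketch of uniform symmetric quantization; the scheme, bit width, and function names below are assumptions for exposition and are not the paper's actual design:

```python
# Hypothetical uniform symmetric quantization sketch (NOT the paper's scheme).
# Maps floats onto signed `num_bits`-bit integers plus a shared scale factor,
# so multiply-accumulate can run in integer hardware.

def quantize(values, num_bits=8):
    """Quantize a list of floats to signed integers; return (ints, scale)."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8-bit
    largest = max(abs(v) for v in values)
    scale = (largest / qmax) if largest else 1.0
    return [round(v / scale) for v in values], scale

def dequantize(ints, scale):
    """Recover approximate floats from quantized integers."""
    return [v * scale for v in ints]

weights = [0.51, -1.27, 0.0, 0.99]
q, s = quantize(weights)
approx = dequantize(q, s)   # each entry within scale/2 of the original
```

The quantization error per value is bounded by half the scale step, which is why narrow-bit-width networks can retain high accuracy (over 95% end-to-end in the paper's PBMC case) while drastically reducing FPGA resource usage.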
