
Conference Paper: FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability

Title: FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability
Authors: Shi, R; Ding, Y; Wei, X; Li, H; Liu, H; So, HKH; Ding, C
Keywords: Field programmable gate arrays; Computer architecture; Random access memory; Machine learning; System-on-chip
Issue Date: 2020
Publisher: IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1000196/all-proceedings
Citation: Proceedings of 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20-24 July 2020, p. 1-6
Abstract: Fast inference is of paramount value to a wide range of deep learning applications. This work presents FTDL, a highly scalable FPGA overlay framework for deep learning applications, to address the architecture and hardware mismatch faced by traditional efforts. The FTDL overlay is specifically optimized for the tiled structure of FPGAs, thereby achieving post-place-and-route operating frequencies exceeding 88% of the theoretical maximum across different devices and design scales. A flexible compilation framework efficiently schedules matrix multiply and convolution operations of large neural network inference on the overlay and achieves over 80% hardware efficiency on average. Taking advantage of both high operating frequency and hardware efficiency, FTDL achieves 402.6 and 151.2 FPS with GoogLeNet and ResNet50 on ImageNet, respectively, while operating at a power efficiency of 27.6 GOPS/W, making it up to 7.7× higher performance and 1.9× more power-efficient than the state-of-the-art.
Persistent Identifier: http://hdl.handle.net/10722/289185
ISSN: 0738-100X
2020 SCImago Journal Rankings: 0.518

 

DC Field | Value | Language
dc.contributor.author | Shi, R | -
dc.contributor.author | Ding, Y | -
dc.contributor.author | Wei, X | -
dc.contributor.author | Li, H | -
dc.contributor.author | Liu, H | -
dc.contributor.author | So, HKH | -
dc.contributor.author | Ding, C | -
dc.date.accessioned | 2020-10-22T08:09:03Z | -
dc.date.available | 2020-10-22T08:09:03Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | Proceedings of 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20-24 July 2020, p. 1-6 | -
dc.identifier.issn | 0738-100X | -
dc.identifier.uri | http://hdl.handle.net/10722/289185 | -
dc.description.abstract | Fast inference is of paramount value to a wide range of deep learning applications. This work presents FTDL, a highly scalable FPGA overlay framework for deep learning applications, to address the architecture and hardware mismatch faced by traditional efforts. The FTDL overlay is specifically optimized for the tiled structure of FPGAs, thereby achieving post-place-and-route operating frequencies exceeding 88% of the theoretical maximum across different devices and design scales. A flexible compilation framework efficiently schedules matrix multiply and convolution operations of large neural network inference on the overlay and achieves over 80% hardware efficiency on average. Taking advantage of both high operating frequency and hardware efficiency, FTDL achieves 402.6 and 151.2 FPS with GoogLeNet and ResNet50 on ImageNet, respectively, while operating at a power efficiency of 27.6 GOPS/W, making it up to 7.7× higher performance and 1.9× more power-efficient than the state-of-the-art. | -
dc.language | eng | -
dc.publisher | IEEE, Computer Society. The Journal's web site is located at https://ieeexplore.ieee.org/xpl/conhome/1000196/all-proceedings | -
dc.relation.ispartof | ACM/IEEE Design Automation Conference Proceedings | -
dc.rights | ACM/IEEE Design Automation Conference Proceedings. Copyright © IEEE, Computer Society. | -
dc.rights | ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | -
dc.subject | Field programmable gate arrays | -
dc.subject | Computer architecture | -
dc.subject | Random access memory | -
dc.subject | Machine learning | -
dc.subject | System-on-chip | -
dc.title | FTDL: A Tailored FPGA-Overlay for Deep Learning with High Scalability | -
dc.type | Conference_Paper | -
dc.identifier.email | So, HKH: hso@eee.hku.hk | -
dc.identifier.authority | So, HKH=rp00169 | -
dc.description.nature | postprint | -
dc.identifier.doi | 10.1109/DAC18072.2020.9218581 | -
dc.identifier.scopus | eid_2-s2.0-85093973629 | -
dc.identifier.hkuros | 316791 | -
dc.identifier.spage | 1 | -
dc.identifier.epage | 6 | -
dc.publisher.place | United States | -
dc.identifier.issnl | 0738-100X | -
