Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1145/3357596
- Scopus: eid_2-s2.0-85077790628
- WOS: WOS:000535725900005
Article: GraVF-M: Graph Processing System Generation for Multi-FPGA Platforms
Title | GraVF-M: Graph Processing System Generation for Multi-FPGA Platforms |
---|---|
Authors | Engelhardt, N; So, HKH |
Keywords | FPGA; Graph processing; GraVF-M; Multi-FPGA architecture; Performance modelling; Vertex centric |
Issue Date | 2019 |
Publisher | The Association for Computing Machinery. The Journal's web site is located at http://trets.cse.sc.edu/index.html |
Citation | ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2019, v. 12 n. 4, article no. 21 |
Abstract | Due to the irregular nature of connections in most graph datasets, partitioning graph analysis algorithms across multiple computational nodes that do not share a common memory inevitably leads to large amounts of interconnect traffic. Previous research has shown that FPGAs can outcompete software-based graph processing in shared memory contexts, but it remains an open question if this advantage can be maintained in distributed systems. In this work, we present GraVF-M, a framework designed to ease the implementation of FPGA-based graph processing accelerators for multi-FPGA platforms with distributed memory. Based on a lightweight description of the algorithm kernel, the framework automatically generates optimized RTL code for the whole multi-FPGA design. We exploit an aspect of the programming model to present a familiar message-passing paradigm to the user, while under the hood implementing a more efficient architecture that can reduce the necessary inter-FPGA network traffic by a factor equal to the average degree of the input graph. A performance model based on a theoretical analysis of the factors influencing performance serves to evaluate the efficiency of our implementation. With a throughput of up to 5.8 GTEPS (billions of traversed edges per second) on a 4-FPGA system, the designs generated by GraVF-M compare favorably to state-of-the-art frameworks from the literature and reach 94% of the projected performance limit of the system. |
Persistent Identifier | http://hdl.handle.net/10722/288080 |
ISSN | 1936-7406 (2023 Impact Factor: 3.1; 2023 SCImago Journal Rankings: 0.802) |
ISI Accession Number ID | WOS:000535725900005 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Engelhardt, N | - |
dc.contributor.author | So, HKH | - |
dc.date.accessioned | 2020-10-05T12:07:35Z | - |
dc.date.available | 2020-10-05T12:07:35Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2019, v. 12 n. 4, article no. 21 | - |
dc.identifier.issn | 1936-7406 | - |
dc.identifier.uri | http://hdl.handle.net/10722/288080 | - |
dc.description.abstract | Due to the irregular nature of connections in most graph datasets, partitioning graph analysis algorithms across multiple computational nodes that do not share a common memory inevitably leads to large amounts of interconnect traffic. Previous research has shown that FPGAs can outcompete software-based graph processing in shared memory contexts, but it remains an open question if this advantage can be maintained in distributed systems. In this work, we present GraVF-M, a framework designed to ease the implementation of FPGA-based graph processing accelerators for multi-FPGA platforms with distributed memory. Based on a lightweight description of the algorithm kernel, the framework automatically generates optimized RTL code for the whole multi-FPGA design. We exploit an aspect of the programming model to present a familiar message-passing paradigm to the user, while under the hood implementing a more efficient architecture that can reduce the necessary inter-FPGA network traffic by a factor equal to the average degree of the input graph. A performance model based on a theoretical analysis of the factors influencing performance serves to evaluate the efficiency of our implementation. With a throughput of up to 5.8 GTEPS (billions of traversed edges per second) on a 4-FPGA system, the designs generated by GraVF-M compare favorably to state-of-the-art frameworks from the literature and reach 94% of the projected performance limit of the system. | - |
dc.language | eng | - |
dc.publisher | The Association for Computing Machinery. The Journal's web site is located at http://trets.cse.sc.edu/index.html | - |
dc.relation.ispartof | ACM Transactions on Reconfigurable Technology and Systems (TRETS) | - |
dc.rights | ACM Transactions on Reconfigurable Technology and Systems (TRETS). Copyright © The Association for Computing Machinery. | - |
dc.rights | ©ACM, 2019. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2019, v. 12 n. 4, article no. 21 (November 2019). http://doi.acm.org/10.1145/3357596 | - |
dc.subject | FPGA | - |
dc.subject | Graph processing | - |
dc.subject | GraVF-M | - |
dc.subject | Multi-FPGA architecture | - |
dc.subject | Performance modelling | - |
dc.subject | Vertex centric | - |
dc.title | GraVF-M: Graph Processing System Generation for Multi-FPGA Platforms | - |
dc.type | Article | - |
dc.identifier.email | So, HKH: hso@eee.hku.hk | - |
dc.identifier.authority | So, HKH=rp00169 | - |
dc.description.nature | postprint | - |
dc.identifier.doi | 10.1145/3357596 | - |
dc.identifier.scopus | eid_2-s2.0-85077790628 | - |
dc.identifier.hkuros | 315334 | - |
dc.identifier.volume | 12 | - |
dc.identifier.issue | 4 | - |
dc.identifier.spage | article no. 21 | - |
dc.identifier.epage | article no. 21 | - |
dc.identifier.isi | WOS:000535725900005 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 1936-7406 | - |