Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.jpdc.2019.06.009
- Scopus: eid_2-s2.0-85067975117
- WOS: WOS:000488138800005
Article: Efficient low-latency packet processing using On-GPU Thread-Data Remapping
Title | Efficient low-latency packet processing using On-GPU Thread-Data Remapping |
---|---|
Authors | LIN, H; Wang, CL |
Keywords | Packet processing; Software router; GPU control flow divergence; SIMD |
Issue Date | 2019 |
Publisher | Academic Press. The Journal's web site is located at http://www.elsevier.com/locate/jpdc |
Citation | Journal of Parallel and Distributed Computing, 2019, v. 133, p. 51-62 |
Abstract | Graphics processing units (GPUs) are widely used to accelerate packet processing in both physical and virtual networks. However, real-life packets come in highly divergent sizes, causing severe GPU control flow divergence. Previous solutions rely on CPU preprocessing to reduce divergence, but this precludes the more efficient NIC–GPU packet streaming, because packet batches have to stop completely at the host machine. To fully utilize both GPU and PCIe resources, we propose Blink, a modular GPU software router. Instead of CPU preprocessing, the Blink router uses On-GPU Thread-Data Remapping to reduce divergence, and our novel Cross-Iteration Thread Event Signaling mechanism filters out unnecessary inter-thread synchronization, doubling the performance gain achieved by the traditional solution. Serving as a TCP/IP router with a Deep Packet Inspection (DPI) firewall, Blink can sustain a processing throughput of 31.5 Gbit/s over a PCIe bandwidth of 32 Gbit/s. For a given bandwidth, Blink reduces processing latency by at least half compared with other works. |
Persistent Identifier | http://hdl.handle.net/10722/283322 |
ISSN | 0743-7315 (2023 Impact Factor: 3.4; 2023 SCImago Journal Rankings: 1.187) |
ISI Accession Number ID | WOS:000488138800005 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | LIN, H | - |
dc.contributor.author | Wang, CL | - |
dc.date.accessioned | 2020-06-22T02:55:00Z | - |
dc.date.available | 2020-06-22T02:55:00Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | Journal of Parallel and Distributed Computing, 2019, v. 133, p. 51-62 | - |
dc.identifier.issn | 0743-7315 | - |
dc.identifier.uri | http://hdl.handle.net/10722/283322 | - |
dc.language | eng | - |
dc.publisher | Academic Press. The Journal's web site is located at http://www.elsevier.com/locate/jpdc | - |
dc.relation.ispartof | Journal of Parallel and Distributed Computing | - |
dc.subject | Packet processing | - |
dc.subject | Software router | - |
dc.subject | GPU control flow divergence | - |
dc.subject | SIMD | - |
dc.title | Efficient low-latency packet processing using On-GPU Thread-Data Remapping | - |
dc.type | Article | - |
dc.identifier.email | Wang, CL: clwang@cs.hku.hk | - |
dc.identifier.authority | Wang, CL=rp00183 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1016/j.jpdc.2019.06.009 | - |
dc.identifier.scopus | eid_2-s2.0-85067975117 | - |
dc.identifier.hkuros | 310354 | - |
dc.identifier.volume | 133 | - |
dc.identifier.spage | 51 | - |
dc.identifier.epage | 62 | - |
dc.identifier.isi | WOS:000488138800005 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 0743-7315 | - |