
Conference Paper: BAShuffler: maximizing network bandwidth utilization in the shuffle of YARN

Title: BAShuffler: maximizing network bandwidth utilization in the shuffle of YARN
Authors: Liang, F; Lau, FCM
Keywords: YARN; MapReduce; Shuffle; Network Scheduling
Issue Date: 2016
Publisher: ACM
Citation: The 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2016), Kyoto, Japan, 31 May-4 June 2016. In Conference Proceedings, 2016, p. 281-284
Abstract: YARN is a popular cluster resource management platform. It does not, however, manage the network bandwidth resources which can significantly affect the execution performance of those tasks having large volumes of data to transfer within the cluster. The shuffle phase of MapReduce jobs features many such tasks. The impact of underutilization of the network bandwidth in shuffle tasks is more pronounced if the network bandwidth capacities of the nodes in the cluster are varied. We present BAShuffler, a bandwidth-aware shuffle scheduler, that can maximize the overall network bandwidth utilization by scheduling the source nodes of the fetch flows at the application level. BAShuffler can fully utilize the network bandwidth capacity in a max-min fair network. The experimental results for a variety of realistic benchmarks show that BAShuffler can substantially improve the cluster's shuffle throughput and reduce the execution time of shuffle tasks as compared to the original YARN, especially in heterogeneous network bandwidth environments.
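The abstract's idea of choosing fetch sources in a max-min fair network can be illustrated with a small sketch. This is not the paper's algorithm: it assumes each node has a single uplink and downlink capacity, that the network core is non-blocking, and that a reducer picks, among candidate source nodes, the one that yields the highest total max-min fair throughput. The names `max_min_rates` and `pick_source` are hypothetical.

```python
def max_min_rates(flows, up, down):
    """Max-min fair rates by progressive filling.

    flows: list of (src, dst) pairs; up/down: dict node -> link capacity.
    Assumption: a flow is limited only by its source uplink and
    destination downlink.
    """
    rates = [0.0] * len(flows)
    active = set(range(len(flows)))
    # Remaining capacity per link; a link is ("up", node) or ("down", node).
    cap = {("up", n): float(c) for n, c in up.items()}
    cap.update({("down", n): float(c) for n, c in down.items()})
    while active:
        # Count the active flows crossing each link.
        load = {}
        for i in active:
            src, dst = flows[i]
            for link in (("up", src), ("down", dst)):
                load[link] = load.get(link, 0) + 1
        # Raise all active flows equally until the tightest link fills.
        inc = min(cap[link] / n for link, n in load.items())
        for i in active:
            rates[i] += inc
        for link, n in load.items():
            cap[link] -= inc * n
        # Freeze every flow that crosses a saturated link.
        full = {link for link in load if cap[link] <= 1e-9}
        active = {
            i for i in active
            if ("up", flows[i][0]) not in full
            and ("down", flows[i][1]) not in full
        }
    return rates


def pick_source(current_flows, dst, candidates, up, down):
    """Greedy choice: the candidate source maximizing total fair throughput."""
    def total(src):
        return sum(max_min_rates(current_flows + [(src, dst)], up, down))
    return max(candidates, key=total)


# Hypothetical heterogeneous cluster: node B has half of A's uplink capacity.
up = {"A": 10, "B": 5}
down = {"R1": 10, "R2": 10}
busy = [("A", "R1")]  # A is already serving reducer R1
choice = pick_source(busy, "R2", ["A", "B"], up, down)
```

In this toy setup, fetching from A would split A's uplink between two flows, whereas fetching from B leaves A's flow untouched and adds B's uplink to the total, so the sketch selects B.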
Description: Session 8: Potpourri (Short Paper)
Persistent Identifier: http://hdl.handle.net/10722/232186
ISBN: 978-1-4503-4314-5

 

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Liang, F | - |
| dc.contributor.author | Lau, FCM | - |
| dc.creator | sml 161107 | - |
| dc.date.accessioned | 2016-09-20T05:28:19Z | - |
| dc.date.available | 2016-09-20T05:28:19Z | - |
| dc.date.issued | 2016 | - |
| dc.identifier.citation | The 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2016), Kyoto, Japan, 31 May-4 June 2016. In Conference Proceedings, 2016, p. 281-284 | - |
| dc.identifier.isbn | 978-1-4503-4314-5 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/232186 | - |
| dc.description | Session 8: Potpourri (Short Paper) | - |
| dc.description.abstract | YARN is a popular cluster resource management platform. It does not, however, manage the network bandwidth resources which can significantly affect the execution performance of those tasks having large volumes of data to transfer within the cluster. The shuffle phase of MapReduce jobs features many such tasks. The impact of underutilization of the network bandwidth in shuffle tasks is more pronounced if the network bandwidth capacities of the nodes in the cluster are varied. We present BAShuffler, a bandwidth-aware shuffle scheduler, that can maximize the overall network bandwidth utilization by scheduling the source nodes of the fetch flows at the application level. BAShuffler can fully utilize the network bandwidth capacity in a max-min fair network. The experimental results for a variety of realistic benchmarks show that BAShuffler can substantially improve the cluster's shuffle throughput and reduce the execution time of shuffle tasks as compared to the original YARN, especially in heterogeneous network bandwidth environments. | - |
| dc.language | eng | - |
| dc.publisher | ACM. | - |
| dc.relation.ispartof | Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2016 | - |
| dc.subject | YARN | - |
| dc.subject | MapReduce | - |
| dc.subject | Shuffle | - |
| dc.subject | Network Scheduling | - |
| dc.title | BAShuffler: maximizing network bandwidth utilization in the shuffle of YARN | - |
| dc.type | Conference_Paper | - |
| dc.identifier.email | Lau, FCM: fcmlau@cs.hku.hk | - |
| dc.identifier.authority | Lau, FCM=rp00221 | - |
| dc.description.nature | link_to_OA_fulltext | - |
| dc.identifier.doi | 10.1145/2907294.2907296 | - |
| dc.identifier.scopus | eid_2-s2.0-84978512005 | - |
| dc.identifier.hkuros | 267167 | - |
| dc.identifier.spage | 281 | - |
| dc.identifier.epage | 284 | - |
| dc.publisher.place | United States | - |
