Article: Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage

Title: Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage
Authors: Li, X; Zhou, R; Jiao, L; Wu, C; Deng, Y; Li, Z
Keywords: Geo-distributed machine learning; online placement; volume discount brokerage
Issue Date: 2020
Publisher: Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71
Citation: IEEE Transactions on Parallel and Distributed Systems, 2020, v. 31 n. 4, p. 948-966
Abstract: Geo-distributed machine learning (ML) often uses large geo-dispersed data collections produced over time to train global models, without consolidating the data at a central site. In the parameter server architecture, “workers” and “parameter servers” for a geo-distributed ML job should be strategically deployed and adjusted on the fly, to allow easy access to the datasets and fast exchange of the model parameters at any time. Although many cloud platforms now provide volume discounts to encourage the usage of their ML resources, different geo-distributed ML jobs running in the clouds typically rent cloud resources separately, and thus rarely enjoy the benefit of such discounts. We study an ML broker service that aggregates geo-distributed ML jobs into cloud data centers to obtain volume discounts, via dynamic online placement and scaling of the workers and parameter servers of individual jobs for long-term cost minimization. To decide the number and placement of workers and parameter servers, we propose an efficient online algorithm that first decomposes the online problem into a series of one-shot optimization problems, each solvable at an individual time slot via the technique of regularization, and then rounds the fractional decisions to integers with a carefully designed dependent rounding method. We prove a parameterized-constant competitive ratio for our online algorithm as its theoretical performance guarantee, and conduct extensive simulation studies that show close-to-offline-optimum performance in realistic settings. (An illustrative sketch of the dependent rounding step is included below, after the journal metrics.)
Persistent Identifier: http://hdl.handle.net/10722/301455
ISSN: 1045-9219
2021 Impact Factor: 3.757
2020 SCImago Journal Rankings: 0.760
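The abstract describes a two-step online algorithm: a regularized one-shot problem is solved fractionally at each time slot, and the fractional worker and parameter-server counts are then rounded by a carefully designed dependent rounding method. The paper's exact procedure is not reproduced here; the following is only a minimal sketch of generic pairwise dependent rounding, with a hypothetical function name and toy inputs, to illustrate the idea that the sum of the decisions is preserved at every step while each individual decision is preserved in expectation.

import math
import random

def dependent_round(values, rng=None):
    """Round non-negative fractional values to integers by pairwise dependent rounding.

    At every step, two fractional entries are picked and mass is shifted
    between them so that (i) their sum stays unchanged and (ii) at least one
    of them becomes integral; each entry is preserved in expectation.  If a
    single fractional entry is left at the end, it is rounded on its own.
    """
    rng = rng or random.Random(0)
    x = [float(v) for v in values]

    def fractional(v):
        return abs(v - round(v)) > 1e-9

    while True:
        idx = [i for i, v in enumerate(x) if fractional(v)]
        if len(idx) < 2:
            break
        i, j = idx[0], idx[1]
        up = min(math.ceil(x[i]) - x[i], x[j] - math.floor(x[j]))    # raise x[i], lower x[j]
        down = min(x[i] - math.floor(x[i]), math.ceil(x[j]) - x[j])  # lower x[i], raise x[j]
        if rng.random() < down / (up + down):
            x[i] += up
            x[j] -= up
        else:
            x[i] -= down
            x[j] += down

    # At most one fractional entry remains; round it according to its fractional part.
    x = [v if not fractional(v)
         else math.floor(v) + (1 if rng.random() < v - math.floor(v) else 0)
         for v in x]
    return [int(round(v)) for v in x]

# Toy example: fractional worker counts per data center from a one-shot relaxation.
if __name__ == "__main__":
    fractional_workers = [2.4, 0.6, 1.3, 3.7]        # total = 8.0
    print(dependent_round(fractional_workers))        # e.g. [3, 0, 1, 4]; total stays 8

Because every pairwise step keeps the pair's sum unchanged, the total number of provisioned instances matches the fractional solution whenever that total is integral.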

 

DC Field: Value
dc.contributor.author: Li, X
dc.contributor.author: Zhou, R
dc.contributor.author: Jiao, L
dc.contributor.author: Wu, C
dc.contributor.author: Deng, Y
dc.contributor.author: Li, Z
dc.date.accessioned: 2021-07-27T08:11:20Z
dc.date.available: 2021-07-27T08:11:20Z
dc.date.issued: 2020
dc.identifier.citation: IEEE Transactions on Parallel and Distributed Systems, 2020, v. 31 n. 4, p. 948-966
dc.identifier.issn: 1045-9219
dc.identifier.uri: http://hdl.handle.net/10722/301455
dc.description.abstract: Geo-distributed machine learning (ML) often uses large geo-dispersed data collections produced over time to train global models, without consolidating the data at a central site. In the parameter server architecture, “workers” and “parameter servers” for a geo-distributed ML job should be strategically deployed and adjusted on the fly, to allow easy access to the datasets and fast exchange of the model parameters at any time. Although many cloud platforms now provide volume discounts to encourage the usage of their ML resources, different geo-distributed ML jobs running in the clouds typically rent cloud resources separately, and thus rarely enjoy the benefit of such discounts. We study an ML broker service that aggregates geo-distributed ML jobs into cloud data centers to obtain volume discounts, via dynamic online placement and scaling of the workers and parameter servers of individual jobs for long-term cost minimization. To decide the number and placement of workers and parameter servers, we propose an efficient online algorithm that first decomposes the online problem into a series of one-shot optimization problems, each solvable at an individual time slot via the technique of regularization, and then rounds the fractional decisions to integers with a carefully designed dependent rounding method. We prove a parameterized-constant competitive ratio for our online algorithm as its theoretical performance guarantee, and conduct extensive simulation studies that show close-to-offline-optimum performance in realistic settings.
dc.language: eng
dc.publisher: Institute of Electrical and Electronics Engineers. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=71
dc.relation.ispartof: IEEE Transactions on Parallel and Distributed Systems
dc.rights: IEEE Transactions on Parallel and Distributed Systems. Copyright © Institute of Electrical and Electronics Engineers.
dc.rights: ©20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subject: Geo-distributed machine learning
dc.subject: online placement
dc.subject: volume discount brokerage
dc.title: Online Placement and Scaling of Geo-Distributed Machine Learning Jobs via Volume-Discounting Brokerage
dc.type: Article
dc.identifier.email: Wu, C: cwu@cs.hku.hk
dc.identifier.authority: Wu, C=rp01397
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/TPDS.2019.2955935
dc.identifier.scopus: eid_2-s2.0-85075918980
dc.identifier.hkuros: 323507
dc.identifier.volume: 31
dc.identifier.issue: 4
dc.identifier.spage: 948
dc.identifier.epage: 966
dc.publisher.place: United States
