Article: Generalization performance of multi-pass stochastic gradient descent with convex loss functions
Field | Value
---|---
Title | Generalization performance of multi-pass stochastic gradient descent with convex loss functions
Authors | Lei, Y; Hu, T; Tang, K
Issue Date | 31-Jan-2021
Publisher | Journal of Machine Learning Research
Citation | Journal of Machine Learning Research, 2021, v. 22, n. 25, p. 1-41
Abstract | Stochastic gradient descent (SGD) has become the method of choice to tackle large-scale datasets due to its low computational cost and good practical performance. Learning rate analysis, either capacity-independent or capacity-dependent, provides a unifying viewpoint to study the computational and statistical properties of SGD, as well as the implicit regularization by tuning the number of passes. Existing capacity-independent learning rates require a nontrivial bounded subgradient assumption and a smoothness assumption to be optimal. Furthermore, existing capacity-dependent learning rates are only established for the specific least squares loss with a special structure. In this paper, we provide both optimal capacity-independent and capacity-dependent learning rates for SGD with general convex loss functions. Our results require neither bounded subgradient assumptions nor smoothness assumptions, and are stated with high probability. We achieve this improvement by a refined estimate on the norm of SGD iterates based on a careful martingale analysis and concentration inequalities on empirical processes.
Persistent Identifier | http://hdl.handle.net/10722/337191
ISSN | 1532-4435
2023 Impact Factor | 4.3
2023 SCImago Journal Rankings | 2.796
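
The abstract describes multi-pass SGD with a decaying step size, where the number of passes acts as an implicit regularizer. Below is a minimal, self-contained sketch of that setup for a convex (hinge) loss with step size eta_t = eta0 / sqrt(t); the loss, schedule, and all names are illustrative assumptions, not the paper's exact algorithm or its learning-rate analysis.

```python
import numpy as np

def multi_pass_sgd(X, y, passes=5, eta0=0.1, seed=0):
    """Multi-pass SGD sketch for a convex loss (hinge loss, labels in {-1, +1}).

    The step size eta_t = eta0 / sqrt(t) decays with the *total* iteration
    count t, so later passes take smaller steps. Tuning `passes` plays the
    role of the implicit regularization mentioned in the abstract.
    """
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    rng = np.random.default_rng(seed)
    for _ in range(passes):
        for i in rng.permutation(n):  # one pass = one shuffled sweep over the data
            t += 1
            margin = y[i] * (X[i] @ w)
            # Subgradient of the hinge loss max(0, 1 - y * <w, x>)
            grad = -y[i] * X[i] if margin < 1 else np.zeros(d)
            w -= (eta0 / np.sqrt(t)) * grad
    return w

# Toy usage on linearly separable 2-D data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.sign(X @ np.array([1.0, -2.0]))
w = multi_pass_sgd(X, y, passes=3)
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

Fewer passes stop the iterates closer to the initialization, while more passes fit the sample more closely; this trade-off is what the paper's capacity-independent and capacity-dependent learning rates quantify.
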
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lei, Y | - |
dc.contributor.author | Hu, T | - |
dc.contributor.author | Tang, K | - |
dc.date.accessioned | 2024-03-11T10:18:48Z | - |
dc.date.available | 2024-03-11T10:18:48Z | - |
dc.date.issued | 2021-01-31 | - |
dc.identifier.citation | Journal of Machine Learning Research, 2021, v. 22, n. 25, p. 1-41 | - |
dc.identifier.issn | 1532-4435 | - |
dc.identifier.uri | http://hdl.handle.net/10722/337191 | - |
dc.description.abstract | Stochastic gradient descent (SGD) has become the method of choice to tackle large-scale datasets due to its low computational cost and good practical performance. Learning rate analysis, either capacity-independent or capacity-dependent, provides a unifying viewpoint to study the computational and statistical properties of SGD, as well as the implicit regularization by tuning the number of passes. Existing capacity-independent learning rates require a nontrivial bounded subgradient assumption and a smoothness assumption to be optimal. Furthermore, existing capacity-dependent learning rates are only established for the specific least squares loss with a special structure. In this paper, we provide both optimal capacity-independent and capacity-dependent learning rates for SGD with general convex loss functions. Our results require neither bounded subgradient assumptions nor smoothness assumptions, and are stated with high probability. We achieve this improvement by a refined estimate on the norm of SGD iterates based on a careful martingale analysis and concentration inequalities on empirical processes. | - |
dc.language | eng | - |
dc.publisher | Journal of Machine Learning Research | - |
dc.relation.ispartof | Journal of Machine Learning Research | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.title | Generalization performance of multi-pass stochastic gradient descent with convex loss functions | - |
dc.type | Article | - |
dc.identifier.volume | 22 | - |
dc.identifier.issue | 25 | - |
dc.identifier.spage | 1 | - |
dc.identifier.epage | 41 | - |
dc.identifier.eissn | 1533-7928 | - |
dc.identifier.issnl | 1532-4435 | - |