File Download
There are no files associated with this item.
Links for fulltext (may require subscription)
- Publisher Website: 10.1109/TPAMI.2025.3621591
- Scopus: eid_2-s2.0-105019626552

Citations:
- Scopus: 0
Article: Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning
| Title | Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning |
|---|---|
| Authors | Lei, Yunwen |
| Keywords | Generalization Analysis; Learning Theory; Stochastic Gradient Descent; Stochastic Optimization |
| Issue Date | 1-Jan-2025 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14 |
| Abstract | Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of the population risk for stochastic gradient descent (SGD), which significantly improves existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis also shows that the computational cost can be further decreased by applying a variance-reduction technique. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, demonstrating the benefit of computing gradients in a distributed manner. |
| Persistent Identifier | http://hdl.handle.net/10722/366758 |
| ISSN | 0162-8828 (eISSN: 1939-3539); 2023 Impact Factor: 20.8; 2023 SCImago Journal Rankings: 6.158 |
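
For readers skimming this record, the main objects mentioned in the abstract can be stated with standard definitions from the nonconvex learning literature. The notation below is illustrative and not necessarily the paper's: the population and empirical risks, the uniform convergence of gradients that the upper and lower bounds concern, and the Polyak-Łojasiewicz (PL) condition.

```latex
% Standard definitions (illustrative notation, not necessarily the paper's):
% population risk, empirical risk, uniform convergence of gradients, PL condition.
\[
  F(\mathbf{w}) = \mathbb{E}_{z \sim \mathcal{D}}\big[f(\mathbf{w}; z)\big], \qquad
  F_S(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} f(\mathbf{w}; z_i),
\]
\[
  \sup_{\mathbf{w} \in \mathcal{W}} \big\| \nabla F_S(\mathbf{w}) - \nabla F(\mathbf{w}) \big\|
  \quad \text{(uniform convergence of gradients over a model class } \mathcal{W}\text{)},
\]
\[
  \tfrac{1}{2}\,\|\nabla F(\mathbf{w})\|^2 \;\ge\; \mu \big( F(\mathbf{w}) - \inf_{\mathbf{w}'} F(\mathbf{w}') \big)
  \quad \text{(Polyak-Łojasiewicz condition with parameter } \mu > 0\text{)}.
\]
```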
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lei, Yunwen | - |
| dc.date.accessioned | 2025-11-25T04:21:40Z | - |
| dc.date.available | 2025-11-25T04:21:40Z | - |
| dc.date.issued | 2025-01-01 | - |
| dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14 | - |
| dc.identifier.issn | 0162-8828 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/366758 | - |
| dc.description.abstract | Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of the population risk for stochastic gradient descent (SGD), which significantly improves existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis also shows that the computational cost can be further decreased by applying a variance-reduction technique. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, demonstrating the benefit of computing gradients in a distributed manner. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
| dc.subject | Generalization Analysis | - |
| dc.subject | Learning Theory | - |
| dc.subject | Stochastic Gradient Descent | - |
| dc.subject | Stochastic Optimization | - |
| dc.title | Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TPAMI.2025.3621591 | - |
| dc.identifier.scopus | eid_2-s2.0-105019626552 | - |
| dc.identifier.spage | 1 | - |
| dc.identifier.epage | 14 | - |
| dc.identifier.eissn | 1939-3539 | - |
| dc.identifier.issnl | 0162-8828 | - |
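
As an informal companion to the abstract's last point (SGD under a privacy constraint and the linear speedup in batch size), the sketch below shows a generic mini-batch SGD loop with per-example gradient clipping and Gaussian noise. It is an illustrative, DP-SGD-style template rather than the paper's algorithm; the toy model, step size, clipping threshold, and noise scale are placeholders chosen only for the demo.

```python
# Minimal mini-batch SGD sketch with per-example gradient clipping and Gaussian noise.
# Illustrative only: model, hyperparameters, and noise calibration are placeholders,
# not the analyzed algorithm or its theoretical parameter choices.
import numpy as np

rng = np.random.default_rng(0)

def per_example_grad(w, x, y):
    """Gradient of the per-example squared loss (tanh(w.x) - y)^2, a nonconvex toy model."""
    pred = np.tanh(x @ w)
    return 2.0 * (pred - y) * (1.0 - pred ** 2) * x

def private_minibatch_sgd(X, Y, steps=200, batch_size=32, lr=0.1,
                          clip_norm=1.0, noise_std=0.5):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Clip each per-example gradient, average, then add Gaussian noise once per batch.
        grads = np.stack([per_example_grad(w, X[i], Y[i]) for i in idx])
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        noisy_mean = grads.mean(axis=0) + rng.normal(
            scale=noise_std * clip_norm / batch_size, size=d)
        w -= lr * noisy_mean
    return w

if __name__ == "__main__":
    n, d = 512, 10
    X = rng.normal(size=(n, d))
    true_w = rng.normal(size=d)
    Y = np.tanh(X @ true_w) + 0.1 * rng.normal(size=n)
    w_hat = private_minibatch_sgd(X, Y)
    avg_grad = np.mean([per_example_grad(w_hat, X[i], Y[i]) for i in range(n)], axis=0)
    print("final empirical gradient norm:", np.linalg.norm(avg_grad))
```

Because the noise is injected once per batch and the per-example gradients are averaged, doubling the batch size roughly halves the noise on each update, which is the intuition behind the batch-size speedup mentioned in the abstract.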
