Article: Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning

Title: Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning
Authors: Lei, Yunwen
Keywords: Generalization Analysis; Learning Theory; Stochastic Gradient Descent; Stochastic Optimization
Issue Date: 1-Jan-2025
Publisher: Institute of Electrical and Electronics Engineers
Citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14
Abstract: Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of population risks for stochastic gradient descent (SGD), which significantly improves the existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis shows that the computational cost can be further decreased by applying the variance-reduction trick. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, which demonstrates the benefit of computing gradients in a distributed manner.
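For readers skimming this record, a minimal sketch of the central quantities may help; the notation below is standard in generalization analysis and is assumed here, not quoted from the paper itself:

% A hedged sketch: standard notation assumed, not taken verbatim from the paper.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Let $S=\{z_1,\ldots,z_n\}$ be drawn i.i.d.\ from a distribution $\rho$ and let
$f(w;z)$ be a loss function. The population and empirical risks are
\[
  F(w)=\mathbb{E}_{z\sim\rho}\,[f(w;z)], \qquad
  F_S(w)=\frac{1}{n}\sum_{i=1}^{n} f(w;z_i).
\]
Uniform convergence of gradients concerns the quantity
$\sup_{w\in\mathcal{W}}\big\|\nabla F(w)-\nabla F_S(w)\big\|$, and $F$
satisfies the Polyak--{\L}ojasiewicz condition with parameter $\mu>0$ if
\[
  \|\nabla F(w)\|^{2} \;\ge\; 2\mu\,\big(F(w)-\inf_{w'}F(w')\big)
  \quad \text{for all } w.
\]
\end{document}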
Persistent Identifier: http://hdl.handle.net/10722/366758
ISSN: 0162-8828
2023 Impact Factor: 20.8
2023 SCImago Journal Rankings: 6.158
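The abstract's final claim, a linear speedup in the batch size under a privacy constraint, can be illustrated with generic noisy mini-batch SGD (a hypothetical toy sketch in Python assuming a fixed per-batch Gaussian noise scale; it is not the paper's exact algorithm):

import numpy as np

def noisy_minibatch_sgd(grad, w0, data, lr=0.1, batch_size=32,
                        noise_std=1.0, steps=100, rng=None):
    """Generic noisy mini-batch SGD sketch (assumed setup, not the
    paper's algorithm): Gaussian noise of a fixed per-batch scale is
    added for privacy, so its contribution to the averaged gradient
    shrinks by a factor of 1/batch_size as the batch grows."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w0, dtype=float).copy()
    n = len(data)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Average the per-example gradients over the mini-batch.
        g = np.mean([grad(w, data[i]) for i in idx], axis=0)
        # Privacy noise is injected once per batch; after averaging it
        # enters the update with magnitude noise_std / batch_size.
        g += rng.normal(0.0, noise_std, size=w.shape) / batch_size
        w -= lr * g
    return w

# Toy usage: one-parameter least squares on synthetic data.
rng = np.random.default_rng(0)
data = [(x, 3.0 * x + rng.normal(0, 0.1)) for x in rng.normal(size=200)]
grad = lambda w, z: np.array([2 * (w[0] * z[0] - z[1]) * z[0]])
print(noisy_minibatch_sgd(grad, [0.0], data, batch_size=64, rng=rng))

Because the injected noise enters the averaged gradient with magnitude noise_std / batch_size, doubling the batch halves the effective noise, which is the intuition behind the stated linear speedup.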

 

DC Field: Value
dc.contributor.author: Lei, Yunwen
dc.date.accessioned: 2025-11-25T04:21:40Z
dc.date.available: 2025-11-25T04:21:40Z
dc.date.issued: 2025-01-01
dc.identifier.citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14
dc.identifier.issn: 0162-8828
dc.identifier.uri: http://hdl.handle.net/10722/366758
dc.description.abstract: Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of population risks for stochastic gradient descent (SGD), which significantly improves the existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis shows that the computational cost can be further decreased by applying the variance-reduction trick. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, which demonstrates the benefit of computing gradients in a distributed manner.
dc.language: eng
dc.publisher: Institute of Electrical and Electronics Engineers
dc.relation.ispartof: IEEE Transactions on Pattern Analysis and Machine Intelligence
dc.subject: Generalization Analysis
dc.subject: Learning Theory
dc.subject: Stochastic Gradient Descent
dc.subject: Stochastic Optimization
dc.title: Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning
dc.type: Article
dc.identifier.doi: 10.1109/TPAMI.2025.3621591
dc.identifier.scopus: eid_2-s2.0-105019626552
dc.identifier.spage: 1
dc.identifier.epage: 14
dc.identifier.eissn: 1939-3539
dc.identifier.issnl: 0162-8828
