File Download
There are no files associated with this item.
Links for fulltext (may require subscription)
- Publisher Website: 10.1109/TPAMI.2025.3621591
- Scopus: eid_2-s2.0-105019626552

Citations:
- Scopus: 0
Article: Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning
| Title | Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning |
|---|---|
| Authors | Lei, Yunwen |
| Keywords | Generalization Analysis; Learning Theory; Stochastic Gradient Descent; Stochastic Optimization |
| Issue Date | 1-Jan-2025 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14 |
| Abstract | Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of the population risk for stochastic gradient descent (SGD), which significantly improves existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis also shows that the computational cost can be further decreased by applying a variance-reduction technique. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, demonstrating the benefit of computing gradients in a distributed manner. |
| Persistent Identifier | http://hdl.handle.net/10722/366758 |
| ISSN | 0162-8828 (eISSN: 1939-3539); 2023 Impact Factor: 20.8; 2023 SCImago Journal Rankings: 6.158 |
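
For readers skimming this record, the main objects mentioned in the abstract can be stated with standard definitions from the nonconvex learning literature. The notation below is illustrative and not necessarily the paper's: the population and empirical risks, the uniform convergence of gradients that the upper and lower bounds concern, and the Polyak-Łojasiewicz (PL) condition.

```latex
% Standard definitions (illustrative notation, not necessarily the paper's):
% population risk, empirical risk, uniform convergence of gradients, PL condition.
\[
  F(\mathbf{w}) = \mathbb{E}_{z \sim \mathcal{D}}\big[f(\mathbf{w}; z)\big], \qquad
  F_S(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} f(\mathbf{w}; z_i),
\]
\[
  \sup_{\mathbf{w} \in \mathcal{W}} \big\| \nabla F_S(\mathbf{w}) - \nabla F(\mathbf{w}) \big\|
  \quad \text{(uniform convergence of gradients over a model class } \mathcal{W}\text{)},
\]
\[
  \tfrac{1}{2}\,\|\nabla F(\mathbf{w})\|^2 \;\ge\; \mu \big( F(\mathbf{w}) - \inf_{\mathbf{w}'} F(\mathbf{w}') \big)
  \quad \text{(Polyak-Łojasiewicz condition with parameter } \mu > 0\text{)}.
\]
```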
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Lei, Yunwen | - |
| dc.date.accessioned | 2025-11-25T04:21:40Z | - |
| dc.date.available | 2025-11-25T04:21:40Z | - |
| dc.date.issued | 2025-01-01 | - |
| dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, p. 1-14 | - |
| dc.identifier.issn | 0162-8828 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/366758 | - |
| dc.description.abstract | Stochastic optimization is the workhorse behind the success of many machine learning algorithms. Existing theoretical analyses of stochastic optimization mainly focus on behavior on the training dataset or require a convexity assumption. In this paper, we provide a comprehensive analysis of the generalization behavior of stochastic optimization for nonconvex problems. We first present both upper and lower bounds on the uniform convergence of gradients. Our analysis improves on existing results by incorporating the second moment of the gradient at a single model into the upper bound. Based on this uniform convergence, we provide a high-probability bound on the gradient norm of the population risk for stochastic gradient descent (SGD), which significantly improves existing results. We show that better bounds can be achieved under further assumptions such as quasi-convexity or the Polyak-Łojasiewicz condition. Our analysis also shows that the computational cost can be further decreased by applying a variance-reduction technique. Finally, we study the utility guarantee of SGD under a privacy constraint. Our results show a linear speedup with respect to the batch size, demonstrating the benefit of computing gradients in a distributed manner. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
| dc.subject | Generalization Analysis | - |
| dc.subject | Learning Theory | - |
| dc.subject | Stochastic Gradient Descent | - |
| dc.subject | Stochastic Optimization | - |
| dc.title | Towards Better Generalization Bounds of Stochastic Optimization for Nonconvex Learning | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TPAMI.2025.3621591 | - |
| dc.identifier.scopus | eid_2-s2.0-105019626552 | - |
| dc.identifier.spage | 1 | - |
| dc.identifier.epage | 14 | - |
| dc.identifier.eissn | 1939-3539 | - |
| dc.identifier.issnl | 0162-8828 | - |
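
As an informal companion to the abstract's last point (SGD under a privacy constraint and the linear speedup in batch size), the sketch below shows a generic mini-batch SGD loop with per-example gradient clipping and Gaussian noise. It is an illustrative, DP-SGD-style template rather than the paper's algorithm; the toy model, step size, clipping threshold, and noise scale are placeholders chosen only for the demo.

```python
# Minimal mini-batch SGD sketch with per-example gradient clipping and Gaussian noise.
# Illustrative only: model, hyperparameters, and noise calibration are placeholders,
# not the analyzed algorithm or its theoretical parameter choices.
import numpy as np

rng = np.random.default_rng(0)

def per_example_grad(w, x, y):
    """Gradient of the per-example squared loss (tanh(w.x) - y)^2, a nonconvex toy model."""
    pred = np.tanh(x @ w)
    return 2.0 * (pred - y) * (1.0 - pred ** 2) * x

def private_minibatch_sgd(X, Y, steps=200, batch_size=32, lr=0.1,
                          clip_norm=1.0, noise_std=0.5):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Clip each per-example gradient, average, then add Gaussian noise once per batch.
        grads = np.stack([per_example_grad(w, X[i], Y[i]) for i in idx])
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        noisy_mean = grads.mean(axis=0) + rng.normal(
            scale=noise_std * clip_norm / batch_size, size=d)
        w -= lr * noisy_mean
    return w

if __name__ == "__main__":
    n, d = 512, 10
    X = rng.normal(size=(n, d))
    true_w = rng.normal(size=d)
    Y = np.tanh(X @ true_w) + 0.1 * rng.normal(size=n)
    w_hat = private_minibatch_sgd(X, Y)
    avg_grad = np.mean([per_example_grad(w_hat, X[i], Y[i]) for i in range(n)], axis=0)
    print("final empirical gradient norm:", np.linalg.norm(avg_grad))
```

Because the noise is injected once per batch and the per-example gradients are averaged, doubling the batch size roughly halves the noise on each update, which is the intuition behind the batch-size speedup mentioned in the abstract.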
