Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1109/TNNLS.2019.2952219
- Scopus: eid_2-s2.0-85092680343
- PMID: 31831449
Article: Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
Title | Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions |
---|---|
Authors | Lei, Yunwen; Hu, Ting; Li, Guiying; Tang, Ke |
Keywords | Learning theory; nonconvex optimization; Polyak-Łojasiewicz condition; stochastic gradient descent (SGD) |
Issue Date | 2020 |
Citation | IEEE Transactions on Neural Networks and Learning Systems, 2020, v. 31, n. 10, p. 4394-4400 |
Abstract | Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing theoretical results for SGD applied to nonconvex objective functions are far from mature. For example, existing results require imposing a nontrivial assumption on the uniform boundedness of gradients for all iterates encountered in the learning process, which is hard to verify in practical implementations. In this article, we establish a rigorous theoretical foundation for SGD in nonconvex learning by showing that this boundedness assumption can be removed without affecting convergence rates, and that the standard smoothness assumption can be relaxed to Hölder continuity of gradients. In particular, we establish sufficient conditions for almost sure convergence as well as optimal convergence rates for SGD applied to both general nonconvex and gradient-dominated objective functions. Linear convergence is further derived in the case of zero variance. |
Persistent Identifier | http://hdl.handle.net/10722/329652 |
ISSN | 2162-237X |
Journal Metrics | 2021 Impact Factor: 14.255; 2020 SCImago Journal Rankings: 2.882 |
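The abstract's setting can be illustrated with a minimal sketch (not the paper's own code): plain SGD on `f(x) = x² + 3·sin²(x)`, a standard nonconvex one-dimensional example that satisfies a Polyak-Łojasiewicz inequality with a unique minimizer at `x = 0`. With zero gradient noise, the iterates contract geometrically toward the minimizer, matching the abstract's linear-convergence claim for the zero-variance case; the step size and noise level below are illustrative choices, not values from the paper.

```python
import math
import random

def grad_f(x):
    # Gradient of f(x) = x**2 + 3 * sin(x)**2, a nonconvex function
    # satisfying a Polyak-Lojasiewicz condition; unique minimizer x = 0.
    return 2.0 * x + 3.0 * math.sin(2.0 * x)

def sgd(x0, n_steps, step=0.1, noise_std=0.0, seed=0):
    """Plain SGD: x_{t+1} = x_t - step * (grad f(x_t) + noise).

    With noise_std = 0 this is deterministic gradient descent, and the
    PL condition yields linear (geometric) convergence toward x = 0.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        g = grad_f(x) + rng.gauss(0.0, noise_std)  # unbiased gradient estimate
        x -= step * g
    return x

# Zero variance: iterates contract geometrically to the minimizer.
x_det = sgd(2.0, 200)
# Nonzero variance: SGD settles into a noise-dominated neighborhood of 0.
x_noisy = sgd(2.0, 200, noise_std=0.5, seed=1)
```

The noisy run does not converge to the exact minimizer under a constant step size; decaying step sizes of the kind analyzed in the paper are what drive the stochastic iterates all the way to a stationary point.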
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lei, Yunwen | - |
dc.contributor.author | Hu, Ting | - |
dc.contributor.author | Li, Guiying | - |
dc.contributor.author | Tang, Ke | - |
dc.date.accessioned | 2023-08-09T03:34:21Z | - |
dc.date.available | 2023-08-09T03:34:21Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | IEEE Transactions on Neural Networks and Learning Systems, 2020, v. 31, n. 10, p. 4394-4400 | - |
dc.identifier.issn | 2162-237X | - |
dc.identifier.uri | http://hdl.handle.net/10722/329652 | - |
dc.description.abstract | Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing theoretical results for SGD applied to nonconvex objective functions are far from mature. For example, existing results require imposing a nontrivial assumption on the uniform boundedness of gradients for all iterates encountered in the learning process, which is hard to verify in practical implementations. In this article, we establish a rigorous theoretical foundation for SGD in nonconvex learning by showing that this boundedness assumption can be removed without affecting convergence rates, and that the standard smoothness assumption can be relaxed to Hölder continuity of gradients. In particular, we establish sufficient conditions for almost sure convergence as well as optimal convergence rates for SGD applied to both general nonconvex and gradient-dominated objective functions. Linear convergence is further derived in the case of zero variance. | - |
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Neural Networks and Learning Systems | - |
dc.subject | Learning theory | - |
dc.subject | nonconvex optimization | - |
dc.subject | Polyak-Łojasiewicz condition | - |
dc.subject | stochastic gradient descent (SGD) | - |
dc.title | Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TNNLS.2019.2952219 | - |
dc.identifier.pmid | 31831449 | - |
dc.identifier.scopus | eid_2-s2.0-85092680343 | - |
dc.identifier.volume | 31 | - |
dc.identifier.issue | 10 | - |
dc.identifier.spage | 4394 | - |
dc.identifier.epage | 4400 | - |
dc.identifier.eissn | 2162-2388 | - |