Conference Paper: A generalized neural tangent kernel analysis for two-layer neural networks
Title | A generalized neural tangent kernel analysis for two-layer neural networks |
---|---|
Authors | Chen, Zixiang; Cao, Yuan; Gu, Quanquan; Zhang, Tong |
Issue Date | 2020 |
Citation | 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Conference, 6-12 December 2020. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 |
Abstract | A recent breakthrough in deep learning theory shows that the training of overparameterized deep neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, it is known that this type of result does not perfectly match practice, as NTK-based analysis requires the network weights to stay very close to their initialization throughout training, and cannot handle regularizers or gradient noise. In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit a “kernel-like” behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay. |
Persistent Identifier | http://hdl.handle.net/10722/303785 |
ISSN | 1049-5258 (2020 SCImago Journal Rankings: 1.399) |
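
The training scheme analyzed in the abstract is noisy gradient descent with weight decay on a two-layer network. The following is a minimal NumPy sketch of that scheme, not the authors' code; the network width, step size, weight-decay coefficient, noise scale, and toy data are illustrative assumptions only.

```python
# Minimal sketch (assumed hyperparameters and data, not from the paper):
# noisy gradient descent with weight decay on a two-layer ReLU network
# f(x) = (1/sqrt(m)) * a^T ReLU(W x), with the second-layer signs a fixed.
import numpy as np

rng = np.random.default_rng(0)

n, d, m = 64, 10, 512                      # samples, input dim, hidden width (assumed)
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sign(X[:, 0])                        # toy binary labels

W = rng.standard_normal((m, d))             # first-layer weights (trained)
a = rng.choice([-1.0, 1.0], size=m)         # fixed second-layer signs

def forward(W, X):
    """Two-layer ReLU network output."""
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)

def loss(W):
    """Average logistic loss on the toy data."""
    return np.mean(np.log1p(np.exp(-y * forward(W, X))))

eta, lam, tau = 0.5, 1e-3, 1e-3             # step size, weight decay, noise scale (assumed)

for step in range(200):
    pred = forward(W, X)
    # Gradient of the logistic loss w.r.t. W for the fixed-a network.
    g = -y / (1.0 + np.exp(y * pred))        # dloss/dprediction, shape (n,)
    act = (X @ W.T > 0).astype(float)        # ReLU activation pattern, shape (n, m)
    grad = (act * g[:, None]).T @ X * (a[:, None] / np.sqrt(m)) / n
    # Noisy gradient descent step with weight decay (Langevin-style update).
    noise = tau * rng.standard_normal(W.shape)
    W = W - eta * (grad + lam * W) + np.sqrt(2 * eta) * noise

print("final training loss:", loss(W))
```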
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, Zixiang | - |
dc.contributor.author | Cao, Yuan | - |
dc.contributor.author | Gu, Quanquan | - |
dc.contributor.author | Zhang, Tong | - |
dc.date.accessioned | 2021-09-15T08:26:01Z | - |
dc.date.available | 2021-09-15T08:26:01Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Conference, 6-12 December 2020. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020 | -
dc.identifier.issn | 1049-5258 | - |
dc.identifier.uri | http://hdl.handle.net/10722/303785 | - |
dc.description.abstract | A recent breakthrough in deep learning theory shows that the training of overparameterized deep neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, it is known that this type of result does not perfectly match practice, as NTK-based analysis requires the network weights to stay very close to their initialization throughout training, and cannot handle regularizers or gradient noise. In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit a “kernel-like” behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay. | -
dc.language | eng | - |
dc.relation.ispartof | Advances in Neural Information Processing Systems 33 (NeurIPS 2020) | - |
dc.title | A generalized neural tangent kernel analysis for two-layer neural networks | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.scopus | eid_2-s2.0-85108454055 | - |