Conference Paper: A generalized neural tangent kernel analysis for two-layer neural networks

Title: A generalized neural tangent kernel analysis for two-layer neural networks
Authors: Chen, Zixiang; Cao, Yuan; Gu, Quanquan; Zhang, Tong
Issue Date: 2020
Citation: 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Conference, 6-12 December 2020. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
Abstract: A recent breakthrough in deep learning theory shows that the training of overparameterized deep neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, this type of result is known not to perfectly match practice, as NTK-based analysis requires the network weights to stay very close to their initialization throughout training and cannot handle regularizers or gradient noise. In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit “kernel-like” behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
Persistent Identifier: http://hdl.handle.net/10722/303785
ISSN: 1049-5258
2020 SCImago Journal Rankings: 1.399

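The abstract describes training a two-layer network with noisy gradient descent plus weight decay and analyzing it through the neural tangent kernel. The listing below is a minimal illustrative sketch of those two ingredients on a toy regression problem, not the authors' implementation; the data, network width, and hyperparameters (n, d, m, eta, lam, tau) are arbitrary choices made for the example.

    # Illustrative sketch (not the paper's code): a two-layer ReLU network trained
    # with noisy gradient descent and weight decay, plus its empirical NTK at
    # initialization. All sizes and hyperparameters are arbitrary toy values.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression data: n inputs in d dimensions with synthetic targets.
    n, d, m = 64, 10, 256                       # m is the network width
    X = rng.standard_normal((n, d))
    y = np.sign(X[:, 0])

    # Two-layer network f(x) = (1/sqrt(m)) * a^T relu(W x), second layer fixed.
    W = rng.standard_normal((m, d))             # trainable first-layer weights
    a = rng.choice([-1.0, 1.0], size=m)         # fixed +/-1 output weights

    def forward(W, X):
        pre = X @ W.T                           # (n, m) pre-activations
        return np.maximum(pre, 0.0) @ a / np.sqrt(m), pre

    # Empirical neural tangent kernel at initialization:
    # K[i, k] = <grad_W f(x_i), grad_W f(x_k)>.
    act0 = (X @ W.T > 0).astype(float)
    K = (X @ X.T) * (act0 @ act0.T) / m

    eta, lam, tau = 0.05, 1e-3, 1e-3            # step size, weight decay, noise level
    for t in range(201):
        pred, pre = forward(W, X)
        resid = pred - y                        # residual of the squared loss
        # Gradient of (1/2n) * ||f(X) - y||^2 with respect to W.
        grad_out = resid[:, None] * (pre > 0) * a / np.sqrt(m)   # (n, m)
        grad_W = grad_out.T @ X / n                              # (m, d)
        # Noisy gradient step with weight decay (Langevin-style Gaussian noise).
        noise = np.sqrt(2.0 * eta * tau) * rng.standard_normal(W.shape)
        W = W - eta * (grad_W + lam * W) + noise
        if t % 50 == 0:
            print(f"step {t:3d}   train loss {np.mean(resid ** 2) / 2:.4f}")

The weight-decay term lam * W and the Gaussian perturbation are the two features that plain NTK-based analyses do not handle, which is the gap the paper's generalized analysis addresses.
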
Dublin Core metadata (field: value)
dc.contributor.author: Chen, Zixiang
dc.contributor.author: Cao, Yuan
dc.contributor.author: Gu, Quanquan
dc.contributor.author: Zhang, Tong
dc.date.accessioned: 2021-09-15T08:26:01Z
dc.date.available: 2021-09-15T08:26:01Z
dc.date.issued: 2020
dc.identifier.citation: 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Conference, 6-12 December 2020. In Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
dc.identifier.issn: 1049-5258
dc.identifier.uri: http://hdl.handle.net/10722/303785
dc.description.abstract: A recent breakthrough in deep learning theory shows that the training of overparameterized deep neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, this type of result is known not to perfectly match practice, as NTK-based analysis requires the network weights to stay very close to their initialization throughout training and cannot handle regularizers or gradient noise. In this paper, we provide a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit “kernel-like” behavior. This implies that the training loss converges linearly up to a certain accuracy. We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
dc.language: eng
dc.relation.ispartof: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
dc.title: A generalized neural tangent kernel analysis for two-layer neural networks
dc.type: Conference_Paper
dc.description.nature: link_to_OA_fulltext
dc.identifier.scopus: eid_2-s2.0-85108454055
