Conference Paper: Learning overparameterized neural networks via stochastic gradient descent on structured data

Title: Learning overparameterized neural networks via stochastic gradient descent on structured data
Authors: Li, Yuanzhi; Liang, Yingyu
Issue Date: 2018
Citation: Advances in Neural Information Processing Systems, 2018, v. 2018-December, p. 8157-8166
Abstract: Neural networks have many successful applications, while much less theoretical understanding has been gained. Towards bridging this gap, we study the problem of learning a two-layer overparameterized ReLU neural network for multi-class classification via stochastic gradient descent (SGD) from random initialization. In the overparameterized setting, when the data comes from mixtures of well-separated distributions, we prove that SGD learns a network with a small generalization error, albeit the network has enough capacity to fit arbitrary labels. Furthermore, the analysis provides interesting insights into several aspects of learning neural networks and can be verified based on empirical studies on synthetic data and on the MNIST dataset.
Persistent Identifier: http://hdl.handle.net/10722/341245
ISSN: 1049-5258
2020 SCImago Journal Rankings: 1.399
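
The abstract above describes training a two-layer overparameterized ReLU network with SGD from random initialization on data drawn from mixtures of well-separated distributions. The following is a minimal illustrative sketch of that setting, not the authors' code: the synthetic data generator, hidden width, learning rate, and the choice to keep the output layer fixed are all assumptions made purely for illustration.

# Illustrative sketch only (not the paper's implementation): a two-layer
# overparameterized ReLU network for multi-class classification, trained with
# plain SGD from random initialization on synthetic data from well-separated
# Gaussian clusters. All hyperparameters below are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "structured" data: k well-separated clusters, one per class.
d, k, n = 20, 4, 400                       # input dim, classes, samples
centers = rng.normal(size=(k, d)) * 5.0    # well-separated cluster means
labels = rng.integers(0, k, size=n)
X = centers[labels] + rng.normal(scale=0.1, size=(n, d))

# Overparameterized two-layer ReLU network; only the first layer is trained,
# the output layer is fixed at random initialization (a simplification
# assumed here for the sketch).
m = 1024                                   # hidden width, much larger than needed
W = rng.normal(scale=1.0 / np.sqrt(d), size=(m, d))    # trained first layer
A = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(m)  # fixed output layer

def forward(x):
    h = np.maximum(W @ x, 0.0)             # ReLU hidden activations
    return A @ h, h                        # logits, hidden layer

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

lr, epochs = 0.05, 5
for _ in range(epochs):
    for i in rng.permutation(n):           # one SGD step per example
        x, y = X[i], labels[i]
        logits, h = forward(x)
        p = softmax(logits)
        p[y] -= 1.0                        # gradient of cross-entropy w.r.t. logits
        grad_h = A.T @ p                   # backprop through fixed output layer
        grad_h[h <= 0.0] = 0.0             # ReLU subgradient
        W -= lr * np.outer(grad_h, x)

preds = np.array([np.argmax(forward(x)[0]) for x in X])
print("training accuracy:", (preds == labels).mean())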


DC Field: Value
dc.contributor.author: Li, Yuanzhi
dc.contributor.author: Liang, Yingyu
dc.date.accessioned: 2024-03-13T08:41:18Z
dc.date.available: 2024-03-13T08:41:18Z
dc.date.issued: 2018
dc.identifier.citation: Advances in Neural Information Processing Systems, 2018, v. 2018-December, p. 8157-8166
dc.identifier.issn: 1049-5258
dc.identifier.uri: http://hdl.handle.net/10722/341245
dc.description.abstract: Neural networks have many successful applications, while much less theoretical understanding has been gained. Towards bridging this gap, we study the problem of learning a two-layer overparameterized ReLU neural network for multi-class classification via stochastic gradient descent (SGD) from random initialization. In the overparameterized setting, when the data comes from mixtures of well-separated distributions, we prove that SGD learns a network with a small generalization error, albeit the network has enough capacity to fit arbitrary labels. Furthermore, the analysis provides interesting insights into several aspects of learning neural networks and can be verified based on empirical studies on synthetic data and on the MNIST dataset.
dc.language: eng
dc.relation.ispartof: Advances in Neural Information Processing Systems
dc.title: Learning overparameterized neural networks via stochastic gradient descent on structured data
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.scopus: eid_2-s2.0-85064818888
dc.identifier.volume: 2018-December
dc.identifier.spage: 8157
dc.identifier.epage: 8166

Export: via the OAI-PMH interface (XML formats) or to other non-XML formats.
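
The record above can be harvested through the repository's OAI-PMH interface mentioned in the export options. Below is a rough sketch of such a request in Python; the endpoint URL and OAI identifier are hypothetical placeholders (assumptions), and only the request parameters (verb, metadataPrefix, identifier) follow the standard OAI-PMH 2.0 protocol.

# Hedged sketch: fetch a record's Dublin Core metadata via OAI-PMH.
# The endpoint and identifier below are HYPOTHETICAL placeholders; consult the
# repository's documentation for its real OAI-PMH base URL and record id.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://example-repository.org/oai/request"   # placeholder endpoint
params = {
    "verb": "GetRecord",                                   # standard OAI-PMH verb
    "metadataPrefix": "oai_dc",                            # unqualified Dublin Core
    "identifier": "oai:example-repository.org:10722/341245",  # placeholder id
}

url = BASE_URL + "?" + urllib.parse.urlencode(params)
with urllib.request.urlopen(url) as resp:
    tree = ET.parse(resp)

# Print every Dublin Core element (dc:title, dc:contributor, dc:date, ...).
DC = "{http://purl.org/dc/elements/1.1/}"
for elem in tree.iter():
    if elem.tag.startswith(DC):
        print(elem.tag[len(DC):] + ":", (elem.text or "").strip())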