Conference Paper: Tight sample complexity of learning one-hidden-layer convolutional neural networks
Title | Tight sample complexity of learning one-hidden-layer convolutional neural networks |
---|---|
Authors | Cao, Yuan; Gu, Quanquan |
Issue Date | 2019 |
Citation | 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, 8-14 December 2019. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2020 |
Abstract | We study the sample complexity of learning one-hidden-layer convolutional neural networks (CNNs) with non-overlapping filters. We propose a novel algorithm called approximate gradient descent for training CNNs, and show that, with high probability, the proposed algorithm with random initialization achieves linear convergence to the ground-truth parameters up to statistical precision. Compared with existing work, our result applies to general non-trivial, monotonic, and Lipschitz continuous activation functions, including ReLU, Leaky ReLU, Sigmoid, and Softplus. Moreover, our sample complexity improves on existing results in its dependence on the number of hidden nodes and the filter size. In fact, our result matches the information-theoretic lower bound for learning one-hidden-layer CNNs with linear activation functions, suggesting that our sample complexity is tight. Our theoretical analysis is backed up by numerical experiments. |
Persistent Identifier | http://hdl.handle.net/10722/303694 |
ISSN | 1049-5258; 2020 SCImago Journal Rankings: 1.399 |
ISI Accession Number ID | WOS:000535866902026 |
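The model class in the abstract — a one-hidden-layer CNN whose single shared filter is applied to k non-overlapping patches of size r, combined by second-layer weights — can be sketched in a few lines of NumPy. The sketch below is illustrative only: it uses plain full-batch gradient descent on the squared loss as a stand-in for the paper's approximate gradient descent, and the step size, patch sizes, and near-truth initialization are assumptions, not the authors' settings.

```python
import numpy as np

# Hypothetical sketch of the model class from the abstract: a one-hidden-layer
# CNN with a single shared filter w applied to k non-overlapping patches of
# size r, combined by second-layer weights v. Trained here with plain
# gradient descent (NOT the paper's approximate gradient descent).
rng = np.random.default_rng(0)
k, r, n = 4, 5, 2000               # patches, filter size, samples

def leaky(z):                      # Leaky ReLU, one of the covered activations
    return np.where(z > 0, z, 0.1 * z)

def dleaky(z):                     # its (sub)derivative
    return np.where(z > 0, 1.0, 0.1)

def forward(X, w, v):
    patches = X.reshape(-1, k, r)  # split each input into non-overlapping patches
    return leaky(patches @ w) @ v  # shared filter, then second-layer combination

# Planted ground-truth parameters and Gaussian inputs.
w_star, v_star = rng.standard_normal(r), rng.standard_normal(k)
X = rng.standard_normal((n, k * r))
y = forward(X, w_star, v_star)

# Initialize near the truth (illustrative; the paper analyzes random init).
w = w_star + 0.1 * rng.standard_normal(r)
v = v_star + 0.1 * rng.standard_normal(k)
mse0 = np.mean((forward(X, w, v) - y) ** 2)

lr = 0.05
for _ in range(1000):
    patches = X.reshape(-1, k, r)
    pre = patches @ w              # (n, k) pre-activations
    resid = leaky(pre) @ v - y     # (n,) residuals
    grad_v = leaky(pre).T @ resid / n
    grad_w = np.einsum('n,nk,nkr->r', resid, dleaky(pre) * v, patches) / n
    w -= lr * grad_w
    v -= lr * grad_v

mse = np.mean((forward(X, w, v) - y) ** 2)
print(mse0, mse)                   # training error before and after descent
```

Leaky ReLU is used here because it is one of the monotonic, Lipschitz activations the result covers; with a planted ground truth and noiseless labels, gradient descent drives the training error toward zero, loosely mirroring the convergence-to-ground-truth behavior the abstract describes.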
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cao, Yuan | - |
dc.contributor.author | Gu, Quanquan | - |
dc.date.accessioned | 2021-09-15T08:25:50Z | - |
dc.date.available | 2021-09-15T08:25:50Z | - |
dc.date.issued | 2019 | - |
dc.identifier.citation | 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, 8-14 December 2019. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2020 | - |
dc.identifier.issn | 1049-5258 | - |
dc.identifier.uri | http://hdl.handle.net/10722/303694 | - |
dc.description.abstract | We study the sample complexity of learning one-hidden-layer convolutional neural networks (CNNs) with non-overlapping filters. We propose a novel algorithm called approximate gradient descent for training CNNs, and show that, with high probability, the proposed algorithm with random initialization achieves linear convergence to the ground-truth parameters up to statistical precision. Compared with existing work, our result applies to general non-trivial, monotonic, and Lipschitz continuous activation functions, including ReLU, Leaky ReLU, Sigmoid, and Softplus. Moreover, our sample complexity improves on existing results in its dependence on the number of hidden nodes and the filter size. In fact, our result matches the information-theoretic lower bound for learning one-hidden-layer CNNs with linear activation functions, suggesting that our sample complexity is tight. Our theoretical analysis is backed up by numerical experiments. | - |
dc.language | eng | - |
dc.relation.ispartof | Advances in Neural Information Processing Systems 32 (NeurIPS 2019) | - |
dc.title | Tight sample complexity of learning one-hidden-layer convolutional neural networks | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.scopus | eid_2-s2.0-85090172205 | - |
dc.identifier.isi | WOS:000535866902026 | - |