Online Knowledge Distillation via Collaborative Learning

Guo, Q; Wang, X; Wu, Y; Yu, Z; Liang, D; Hu, X; Luo, P

Publisher Website: 10.1109/CVPR42600.2020.01103
Scopus: eid_2-s2.0-85094605661
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Online Knowledge Distillation via Collaborative Learning

Title	Online Knowledge Distillation via Collaborative Learning
Authors	Guo, Q; Wang, X; Wu, Y; Yu, Z; Liang, D; Hu, X; Luo, P
Keywords	Knowledge engineering; Collaborative work; Perturbation methods; Learning (artificial intelligence); Neural networks
Issue Date	2020
Publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147
Citation	Proceedings of IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), Seattle, USA, 13-19 June 2020, p. 11017-11026. DOI: http://dx.doi.org/10.1109/CVPR42600.2020.01103
Abstract	This work presents an efficient yet effective online Knowledge Distillation method via Collaborative Learning, termed KDCL, which is able to consistently improve the generalization ability of deep neural networks (DNNs) that have different learning capacities. Unlike existing two-stage knowledge distillation approaches that pre-train a DNN with large capacity as the “teacher” and then transfer the teacher’s knowledge to another “student” DNN unidirectionally (i.e. one-way), KDCL treats all DNNs as “students” and collaboratively trains them in a single stage (knowledge is transferred among arbitrary students during collaborative training), enabling parallel computing, fast computations, and appealing generalization ability. Specifically, we carefully design multiple methods to generate soft targets as supervision by effectively ensembling predictions of students and distorting the input images. Extensive experiments show that KDCL consistently improves all the “students” on different datasets, including CIFAR100 and ImageNet. For example, when trained together by using KDCL, ResNet-50 and MobileNetV2 achieve 78.2% and 74.0% top-1 accuracy on ImageNet, outperforming the original results by 1.4% and 2.0% respectively. We also verify that models pre-trained with KDCL transfer well to object detection and semantic segmentation on the MS COCO dataset. For instance, the FPN detector is improved by 0.9% mAP.
Description	Session: Oral 3.2A — Recognition (Detection, Categorization) (2) - Poster no. 5; Paper ID 6687; CVPR 2020 held virtually due to COVID-19
Persistent Identifier	http://hdl.handle.net/10722/284162
ISSN	1063-6919 (2020 SCImago Journal Rankings: 4.658)
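
The abstract describes generating soft targets by ensembling the students' predictions and using them as supervision for every student. The sketch below illustrates that general idea only. It assumes the simplest possible ensembling rule, a plain average of temperature-softened predictions, followed by a KL-divergence loss per student; the record does not say which ensembling methods the paper actually uses, and the input distortion step is omitted. All function names, the temperature value, and the example logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_target(student_logits, temperature=2.0):
    """Average the students' softened predictions.

    Assumption: plain averaging stands in for the paper's ensembling methods.
    """
    probs = [softmax(logits, temperature) for logits in student_logits]
    n_students = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_students for c in range(n_classes)]


def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred) for two discrete distributions."""
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))


def distillation_losses(student_logits, temperature=2.0):
    """One distillation loss per student, measured against the shared soft target."""
    target = ensemble_soft_target(student_logits, temperature)
    return [
        kl_divergence(target, softmax(logits, temperature))
        for logits in student_logits
    ]


# Two hypothetical students scoring one image over three classes.
logits = [[2.0, 0.5, -1.0], [1.0, 1.5, -0.5]]
soft_target = ensemble_soft_target(logits)
losses = distillation_losses(logits)
```

In a training loop, each student's loss of this kind would typically be combined with the usual cross-entropy on the ground-truth label, so every student both contributes to and learns from the shared target in a single stage.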

DC Field	Value	Language
dc.contributor.author	Guo, Q	-
dc.contributor.author	Wang, X	-
dc.contributor.author	Wu, Y	-
dc.contributor.author	Yu, Z	-
dc.contributor.author	Liang, D	-
dc.contributor.author	Hu, X	-
dc.contributor.author	Luo, P	-
dc.date.accessioned	2020-07-20T05:56:34Z	-
dc.date.available	2020-07-20T05:56:34Z	-
dc.date.issued	2020	-
dc.identifier.citation	Proceedings of IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), Seattle, USA, 13-19 June 2020, p. 11017-11026	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/284162	-
dc.description	Session: Oral 3.2A — Recognition (Detection, Categorization) (2) - Poster no. 5; Paper ID 6687	-
dc.description	CVPR 2020 held virtually due to COVID-19	-
dc.description.abstract	This work presents an efficient yet effective online Knowledge Distillation method via Collaborative Learning, termed KDCL, which is able to consistently improve the generalization ability of deep neural networks (DNNs) that have different learning capacities. Unlike existing two-stage knowledge distillation approaches that pre-train a DNN with large capacity as the “teacher” and then transfer the teacher’s knowledge to another “student” DNN unidirectionally (i.e. one-way), KDCL treats all DNNs as “students” and collaboratively trains them in a single stage (knowledge is transferred among arbitrary students during collaborative training), enabling parallel computing, fast computations, and appealing generalization ability. Specifically, we carefully design multiple methods to generate soft targets as supervision by effectively ensembling predictions of students and distorting the input images. Extensive experiments show that KDCL consistently improves all the “students” on different datasets, including CIFAR100 and ImageNet. For example, when trained together by using KDCL, ResNet-50 and MobileNetV2 achieve 78.2% and 74.0% top-1 accuracy on ImageNet, outperforming the original results by 1.4% and 2.0% respectively. We also verify that models pre-trained with KDCL transfer well to object detection and semantic segmentation on the MS COCO dataset. For instance, the FPN detector is improved by 0.9% mAP.	-
dc.language	eng	-
dc.publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147	-
dc.relation.ispartof	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings	-
dc.rights	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings. Copyright © IEEE Computer Society.	-
dc.rights	©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	-
dc.subject	Knowledge engineering	-
dc.subject	Collaborative work	-
dc.subject	Perturbation methods	-
dc.subject	Learning (artificial intelligence)	-
dc.subject	Neural networks	-
dc.title	Online Knowledge Distillation via Collaborative Learning	-
dc.type	Conference_Paper	-
dc.identifier.email	Luo, P: pluo@hku.hk	-
dc.identifier.authority	Luo, P=rp02575	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR42600.2020.01103	-
dc.identifier.scopus	eid_2-s2.0-85094605661	-
dc.identifier.hkuros	311022	-
dc.identifier.spage	11017	-
dc.identifier.epage	11026	-
dc.publisher.place	United States	-
dc.identifier.issnl	1063-6919	-
