File Download
There are no files associated with this item.
Links for fulltext (may require subscription)
- Publisher Website: 10.1109/TPAMI.2022.3159581
- Scopus: eid_2-s2.0-85126511871
- PMID: 35294341
- WOS: WOS:000912386000003
Article: Adaptive Perspective Distillation for Semantic Segmentation
Title | Adaptive Perspective Distillation for Semantic Segmentation |
---|---|
Authors | Tian, Zhuotao; Chen, Pengguang; Lai, Xin; Jiang, Li; Liu, Shu; Zhao, Hengshuang; Yu, Bei; Yang, Ming Chang; Jia, Jiaya |
Keywords | Knowledge distillation; scene understanding; semantic segmentation |
Issue Date | 2023 |
Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, v. 45, n. 2, p. 1372-1387 |
Abstract | Strong semantic segmentation models require large backbones to achieve promising performance, making them hard to adapt to real applications where efficient real-time algorithms are needed. Knowledge distillation tackles this issue by letting the smaller model (student) produce pixel-wise predictions similar to those of a larger model (teacher). However, the classifier, which can be viewed as the perspective through which models perceive the encoded features to yield observations (i.e., predictions), is shared by all training samples and fits a universal feature distribution. Since, at a fixed capacity, good generalization to the entire distribution can come at the cost of poor specialization to individual samples, the shared universal perspective often overlooks details in each sample, degrading knowledge distillation. In this paper, we propose Adaptive Perspective Distillation (APD), which creates an adaptive local perspective for each individual training sample. It extracts detailed contextual information from each training sample specifically, mining more details from the teacher and thus achieving better distillation results on the student. APD places no structural constraints on either the teacher or the student model and therefore generalizes well to different semantic segmentation models. Extensive experiments on Cityscapes, ADE20K, and PASCAL-Context demonstrate the effectiveness of the proposed APD. Moreover, APD yields favorable performance gains in both object detection and instance segmentation without bells and whistles. |
Persistent Identifier | http://hdl.handle.net/10722/332255 |
ISSN | 0162-8828 (2023 Impact Factor: 20.8; 2023 SCImago Journal Rank: 6.158) |
ISI Accession Number ID | WOS:000912386000003 |
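The abstract above builds on the standard pixel-wise knowledge distillation baseline, in which the student is trained to match the teacher's per-pixel class distributions. The PyTorch sketch below illustrates only that generic baseline loss; the adaptive per-sample perspective that APD contributes cannot be reconstructed from this record alone, and the function name `pixelwise_kd_loss` and the temperature value are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def pixelwise_kd_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 4.0) -> torch.Tensor:
    """Per-pixel KL divergence between softened teacher and student predictions.

    Both inputs are raw class scores of shape (N, C, H, W). The temperature
    and the overall formulation follow generic KD practice, not APD itself.
    """
    n, c, h, w = student_logits.shape
    # Treat every pixel as an independent distillation target: (N*H*W, C)
    s = student_logits.permute(0, 2, 3, 1).reshape(-1, c)
    t = teacher_logits.permute(0, 2, 3, 1).reshape(-1, c)
    log_p_student = F.log_softmax(s / temperature, dim=1)
    p_teacher = F.softmax(t / temperature, dim=1)
    # Scale by T^2 so gradients keep comparable magnitude across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Illustrative usage with random tensors standing in for model outputs
student_out = torch.randn(2, 19, 64, 64)   # e.g., 19 Cityscapes classes
teacher_out = torch.randn(2, 19, 64, 64)
loss = pixelwise_kd_loss(student_out, teacher_out)
```

In this baseline, one shared classifier head produces the distributions matched at every pixel of every sample; APD's stated contribution is to replace that single "universal perspective" with a perspective adapted to each individual training sample.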
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Tian, Zhuotao | - |
dc.contributor.author | Chen, Pengguang | - |
dc.contributor.author | Lai, Xin | - |
dc.contributor.author | Jiang, Li | - |
dc.contributor.author | Liu, Shu | - |
dc.contributor.author | Zhao, Hengshuang | - |
dc.contributor.author | Yu, Bei | - |
dc.contributor.author | Yang, Ming Chang | - |
dc.contributor.author | Jia, Jiaya | - |
dc.date.accessioned | 2023-10-06T05:10:04Z | - |
dc.date.available | 2023-10-06T05:10:04Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, v. 45, n. 2, p. 1372-1387 | - |
dc.identifier.issn | 0162-8828 | - |
dc.identifier.uri | http://hdl.handle.net/10722/332255 | - |
dc.description.abstract | Strong semantic segmentation models require large backbones to achieve promising performance, making them hard to adapt to real applications where efficient real-time algorithms are needed. Knowledge distillation tackles this issue by letting the smaller model (student) produce pixel-wise predictions similar to those of a larger model (teacher). However, the classifier, which can be viewed as the perspective through which models perceive the encoded features to yield observations (i.e., predictions), is shared by all training samples and fits a universal feature distribution. Since, at a fixed capacity, good generalization to the entire distribution can come at the cost of poor specialization to individual samples, the shared universal perspective often overlooks details in each sample, degrading knowledge distillation. In this paper, we propose Adaptive Perspective Distillation (APD), which creates an adaptive local perspective for each individual training sample. It extracts detailed contextual information from each training sample specifically, mining more details from the teacher and thus achieving better distillation results on the student. APD places no structural constraints on either the teacher or the student model and therefore generalizes well to different semantic segmentation models. Extensive experiments on Cityscapes, ADE20K, and PASCAL-Context demonstrate the effectiveness of the proposed APD. Moreover, APD yields favorable performance gains in both object detection and instance segmentation without bells and whistles. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
dc.subject | Knowledge distillation | - |
dc.subject | scene understanding | - |
dc.subject | semantic segmentation | - |
dc.title | Adaptive Perspective Distillation for Semantic Segmentation | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TPAMI.2022.3159581 | - |
dc.identifier.pmid | 35294341 | - |
dc.identifier.scopus | eid_2-s2.0-85126511871 | - |
dc.identifier.volume | 45 | - |
dc.identifier.issue | 2 | - |
dc.identifier.spage | 1372 | - |
dc.identifier.epage | 1387 | - |
dc.identifier.eissn | 1939-3539 | - |
dc.identifier.isi | WOS:000912386000003 | - |