
Conference Paper: ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

Title: ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
Authors: Xu, L; Guan, Y; Jin, S; Liu, W; Qian, C; Luo, P; Ouyang, W; Wang, X
Issue Date: 2021
Citation: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19-25 June 2021, p. 16072-16081
Abstract: Human pose estimation has achieved significant progress in recent years. However, most recent methods focus on improving accuracy with complicated models while ignoring real-time efficiency. To achieve a better trade-off between accuracy and efficiency, we propose a novel neural architecture search (NAS) method, termed ViPNAS, to search networks at both the spatial and temporal levels for fast online video pose estimation. At the spatial level, we carefully design the search space with five different dimensions: network depth, width, kernel size, group number, and attention. At the temporal level, we search over a series of temporal feature fusions to optimize the total accuracy and speed across multiple video frames. To the best of our knowledge, we are the first to search for temporal feature fusion and automatic computation allocation in videos. Extensive experiments demonstrate the effectiveness of our approach on the challenging COCO2017 and PoseTrack2018 datasets. Our discovered model families, S-ViPNAS and T-ViPNAS, achieve significantly higher inference speed (CPU real-time) without sacrificing accuracy compared to previous state-of-the-art methods.
Description: Paper Session Twelve: Paper ID 5887
Persistent Identifier: http://hdl.handle.net/10722/301429
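
As a purely illustrative aid to the abstract's description of the spatial search space, the sketch below shows how its five dimensions (depth, width, kernel size, group number, and attention) could be encoded and sampled in Python. All names and candidate values here are hypothetical assumptions for illustration; this is not the authors' implementation.

    # Hypothetical sketch of a ViPNAS-style spatial search space (not the authors' code).
    # Each network stage samples one choice from each of the five dimensions
    # named in the abstract: depth, width, kernel size, group number, attention.
    import random

    # Candidate values per dimension; the concrete ranges are assumptions.
    # A real search space would also constrain groups to divide the channel width.
    SEARCH_SPACE = {
        "depth":       [2, 3, 4, 6],          # number of blocks in the stage
        "width":       [64, 128, 160, 256],   # output channels of the stage
        "kernel_size": [3, 5, 7],             # spatial convolution kernel size
        "groups":      [1, 2, 4, 8],          # grouped-convolution group count
        "attention":   [None, "SE"],          # optional attention module
    }

    def sample_stage(rng):
        """Draw one stage configuration uniformly from the search space."""
        return {dim: rng.choice(choices) for dim, choices in SEARCH_SPACE.items()}

    def sample_network(num_stages=4, seed=0):
        """Sample a full spatial architecture as a list of stage configurations."""
        rng = random.Random(seed)
        return [sample_stage(rng) for _ in range(num_stages)]

    if __name__ == "__main__":
        for i, stage in enumerate(sample_network()):
            print(f"stage {i}: {stage}")

In a typical weight-sharing NAS setup, configurations sampled this way would be evaluated inside a shared supernet rather than trained from scratch; how ViPNAS itself trains and ranks candidates is detailed in the paper, not in this sketch.
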

 

DC Field: Value
dc.contributor.author: Xu, L
dc.contributor.author: Guan, Y
dc.contributor.author: Jin, S
dc.contributor.author: Liu, W
dc.contributor.author: Qian, C
dc.contributor.author: Luo, P
dc.contributor.author: Ouyang, W
dc.contributor.author: Wang, X
dc.date.accessioned: 2021-07-27T08:10:56Z
dc.date.available: 2021-07-27T08:10:56Z
dc.date.issued: 2021
dc.identifier.citation: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19-25 June 2021, p. 16072-16081
dc.identifier.uri: http://hdl.handle.net/10722/301429
dc.description: Paper Session Twelve: Paper ID 5887
dc.description.abstract: Human pose estimation has achieved significant progress in recent years. However, most recent methods focus on improving accuracy with complicated models while ignoring real-time efficiency. To achieve a better trade-off between accuracy and efficiency, we propose a novel neural architecture search (NAS) method, termed ViPNAS, to search networks at both the spatial and temporal levels for fast online video pose estimation. At the spatial level, we carefully design the search space with five different dimensions: network depth, width, kernel size, group number, and attention. At the temporal level, we search over a series of temporal feature fusions to optimize the total accuracy and speed across multiple video frames. To the best of our knowledge, we are the first to search for temporal feature fusion and automatic computation allocation in videos. Extensive experiments demonstrate the effectiveness of our approach on the challenging COCO2017 and PoseTrack2018 datasets. Our discovered model families, S-ViPNAS and T-ViPNAS, achieve significantly higher inference speed (CPU real-time) without sacrificing accuracy compared to previous state-of-the-art methods.
dc.language: eng
dc.relation.ispartof: IEEE Computer Vision and Pattern Recognition (CVPR) Proceedings
dc.title: ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
dc.type: Conference_Paper
dc.identifier.email: Luo, P: pluo@hku.hk
dc.identifier.authority: Luo, P=rp02575
dc.identifier.hkuros: 323747
dc.identifier.spage: 16072
dc.identifier.epage: 16081
