Intra- and inter-action understanding via temporal action parsing

Shao, Dian; Zhao, Yue; Dai, Bo; Lin, Dahua

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/CVPR42600.2020.00081
Scopus: eid_2-s2.0-85094159450
WOS: WOS:000620679500074

Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: Intra- and inter-action understanding via temporal action parsing

Title	Intra- and inter-action understanding via temporal action parsing
Authors	Shao, Dian; Zhao, Yue; Dai, Bo; Lin, Dahua
Issue Date	2020
Citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, p. 727-736
DOI	http://dx.doi.org/10.1109/CVPR42600.2020.00081
Abstract	Current methods for action recognition primarily rely on deep convolutional networks to derive feature embeddings of visual and motion features. While these methods have demonstrated remarkable performance on standard benchmarks, we are still in need of a better understanding as to how the videos, in particular their internal structures, relate to high-level semantics, which may lead to benefits in multiple aspects, e.g. interpretable predictions and even new methods that can take the recognition performances to a next level. Towards this goal, we construct TAPOS, a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top. Our study shows that a sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition. We also investigate a number of temporal parsing methods, and thereon devise an improved method that is capable of mining sub-actions from training data without knowing the labels of them. On the constructed TAPOS, the proposed method is shown to reveal intra-action information, i.e. how action instances are made of sub-actions, and inter-action information, i.e. one specific sub-action may commonly appear in various actions.
Persistent Identifier	http://hdl.handle.net/10722/352214
ISSN	1063-6919
2023 SCImago Journal Rankings	10.331
ISI Accession Number ID	WOS:000620679500074

DC Field	Value	Language
dc.contributor.author	Shao, Dian	-
dc.contributor.author	Zhao, Yue	-
dc.contributor.author	Dai, Bo	-
dc.contributor.author	Lin, Dahua	-
dc.date.accessioned	2024-12-16T03:57:21Z	-
dc.date.available	2024-12-16T03:57:21Z	-
dc.date.issued	2020	-
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, p. 727-736	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/352214	-
dc.description.abstract	Current methods for action recognition primarily rely on deep convolutional networks to derive feature embeddings of visual and motion features. While these methods have demonstrated remarkable performance on standard benchmarks, we are still in need of a better understanding as to how the videos, in particular their internal structures, relate to high-level semantics, which may lead to benefits in multiple aspects, e.g. interpretable predictions and even new methods that can take the recognition performances to a next level. Towards this goal, we construct TAPOS, a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top. Our study shows that a sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition. We also investigate a number of temporal parsing methods, and thereon devise an improved method that is capable of mining sub-actions from training data without knowing the labels of them. On the constructed TAPOS, the proposed method is shown to reveal intra-action information, i.e. how action instances are made of sub-actions, and inter-action information, i.e. one specific sub-action may commonly appear in various actions.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.title	Intra- and inter-action understanding via temporal action parsing	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR42600.2020.00081	-
dc.identifier.scopus	eid_2-s2.0-85094159450	-
dc.identifier.spage	727	-
dc.identifier.epage	736	-
dc.identifier.isi	WOS:000620679500074	-
