FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding

Shao, Dian; Zhao, Yue; Dai, Bo; Lin, Dahua

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/CVPR42600.2020.00269
Scopus: eid_2-s2.0-85094130169
WOS: WOS:000620679502088

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding

Title	FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding
Authors	Shao, Dian; Zhao, Yue; Dai, Bo; Lin, Dahua
Issue Date	2020
Citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, p. 2613-2622. DOI: http://dx.doi.org/10.1109/CVPR42600.2020.00269
Abstract	On public benchmarks, current action recognition techniques have achieved great success. However, when used in real-world applications, e.g. sports analysis, which require the capability of parsing an activity into phases and differentiating between subtly different actions, their performance remains far from satisfactory. To take action recognition to a new level, we develop FineGym, a new dataset built on top of gymnasium videos. Compared to existing action recognition datasets, FineGym is distinguished in richness, quality, and diversity. In particular, it provides temporal annotations at both action and sub-action levels with a three-level semantic hierarchy. For example, a 'balance beam' activity will be annotated as a sequence of elementary sub-actions derived from five sets: 'leap-jump-hop', 'beam-turns', 'flight-salto', 'flight-handspring', and 'dismount', where the sub-action in each set will be further annotated with finely defined class labels. This new level of granularity presents significant challenges for action recognition, e.g. how to parse the temporal structures from a coherent action, and how to distinguish between subtly different action classes. We systematically investigate different methods on this dataset and obtain a number of interesting findings. We hope this dataset could advance research towards action understanding.
Persistent Identifier	http://hdl.handle.net/10722/352211
ISSN	1063-6919 (2023 SCImago Journal Rankings: 10.331)
ISI Accession Number ID	WOS:000620679502088
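
The annotation scheme described in the abstract (an activity, annotated as a sequence of sub-actions drawn from named sets, each with a fine-grained class label and temporal boundaries) can be pictured with a small data structure. The following is only an illustrative sketch, not the dataset's actual file format or API: the class names, field names, the use of seconds for timestamps, and the `placeholder_label_*` values are all assumptions made for this example. Only the activity name and the five set names come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List

# The five sub-action sets the abstract lists for a 'balance beam' activity.
BALANCE_BEAM_SETS = (
    "leap-jump-hop",
    "beam-turns",
    "flight-salto",
    "flight-handspring",
    "dismount",
)


@dataclass
class SubAction:
    set_name: str  # middle level of the hierarchy (one of the sets above)
    label: str     # finest level; the values used below are hypothetical
    start: float   # temporal boundary in seconds (assumed unit)
    end: float


@dataclass
class Action:
    activity: str  # top level, e.g. 'balance beam'
    start: float
    end: float
    sub_actions: List[SubAction] = field(default_factory=list)


def validate(action: Action) -> None:
    """Check that every sub-action uses a known set and lies inside the action."""
    for sub in action.sub_actions:
        if sub.set_name not in BALANCE_BEAM_SETS:
            raise ValueError(f"unknown set: {sub.set_name}")
        if not (action.start <= sub.start < sub.end <= action.end):
            raise ValueError(f"bad temporal span for {sub.label}")


def set_sequence(action: Action) -> List[str]:
    """Return the set names of the sub-actions in temporal order."""
    ordered = sorted(action.sub_actions, key=lambda s: s.start)
    return [s.set_name for s in ordered]


# A made-up routine; timestamps and labels are placeholders, not real data.
routine = Action(
    activity="balance beam",
    start=0.0,
    end=30.0,
    sub_actions=[
        SubAction("dismount", "placeholder_label_c", 26.0, 29.5),
        SubAction("leap-jump-hop", "placeholder_label_a", 2.0, 3.5),
        SubAction("flight-salto", "placeholder_label_b", 10.0, 11.5),
    ],
)

validate(routine)
print(set_sequence(routine))
```

Keeping the set name and the class label as separate fields mirrors the two finer levels of the hierarchy, so a model can be evaluated at either granularity from the same annotation.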

DC Field	Value	Language
dc.contributor.author	Shao, Dian	-
dc.contributor.author	Zhao, Yue	-
dc.contributor.author	Dai, Bo	-
dc.contributor.author	Lin, Dahua	-
dc.date.accessioned	2024-12-16T03:57:20Z	-
dc.date.available	2024-12-16T03:57:20Z	-
dc.date.issued	2020	-
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, p. 2613-2622	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/352211	-
dc.description.abstract	On public benchmarks, current action recognition techniques have achieved great success. However, when used in real-world applications, e.g. sports analysis, which require the capability of parsing an activity into phases and differentiating between subtly different actions, their performance remains far from satisfactory. To take action recognition to a new level, we develop FineGym, a new dataset built on top of gymnasium videos. Compared to existing action recognition datasets, FineGym is distinguished in richness, quality, and diversity. In particular, it provides temporal annotations at both action and sub-action levels with a three-level semantic hierarchy. For example, a 'balance beam' activity will be annotated as a sequence of elementary sub-actions derived from five sets: 'leap-jump-hop', 'beam-turns', 'flight-salto', 'flight-handspring', and 'dismount', where the sub-action in each set will be further annotated with finely defined class labels. This new level of granularity presents significant challenges for action recognition, e.g. how to parse the temporal structures from a coherent action, and how to distinguish between subtly different action classes. We systematically investigate different methods on this dataset and obtain a number of interesting findings. We hope this dataset could advance research towards action understanding.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.title	FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR42600.2020.00269	-
dc.identifier.scopus	eid_2-s2.0-85094130169	-
dc.identifier.spage	2613	-
dc.identifier.epage	2622	-
dc.identifier.isi	WOS:000620679502088	-
