Conference Paper: Few-Shot Action Recognition with Permutation-Invariant Attention

Title: Few-Shot Action Recognition with Permutation-Invariant Attention
Authors: Zhang, H; Zhang, L; Qi, X; Li, H; Torr, PHS; Koniusz, P
Issue Date: 2020
Publisher: Springer
Citation: Proceedings of the 16th European Conference on Computer Vision (ECCV), Glasgow, UK (online), 23-28 August 2020, pt V, p. 525-542
Abstract: Many few-shot learning models focus on recognising images. In contrast, we tackle the more challenging task of few-shot action recognition from videos. We build on a C3D encoder for spatio-temporal video blocks to capture short-range action patterns. The encoded blocks are aggregated by permutation-invariant pooling, which makes our approach robust to varying action lengths and to long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class. Subsequently, the pooled representations are combined into simple relation descriptors which encode the so-called query and support clips. Finally, the relation descriptors are fed to a comparator with the goal of learning similarity between query and support clips. Importantly, to re-weight block contributions during pooling, we exploit spatial and temporal attention modules and self-supervision. In naturalistic clips (of the same class) there exists a temporal distribution shift: the locations of discriminative temporal action hotspots vary. Thus, we permute the blocks of a clip and align the resulting attention regions with the identically permuted attention regions of the non-permuted clip, which trains an attention mechanism that is invariant to block (and thus long-term hotspot) permutations. Our method outperforms the state of the art on the HMDB51, UCF101, and miniMIT datasets.
Persistent Identifier: http://hdl.handle.net/10722/294711
ISBN: 9783030585570
Series/Report no.: Lecture Notes in Computer Science (LNCS); v. 12350
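The abstract describes a concrete pipeline: a C3D encoder turns a clip into per-block features, attention re-weights the blocks, a permutation-invariant pooling step aggregates them, and a comparator scores query-support relation descriptors. Below is a minimal PyTorch sketch of the pooling and comparison stages; the layer sizes, the softmax scoring head, and the concatenation-style relation descriptor are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of attention re-weighted, permutation-invariant pooling
# and query-support comparison, assuming per-block features from a
# spatio-temporal encoder such as C3D. Sizes and module shapes are
# illustrative assumptions.
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Pools a variable number of encoded video blocks into one vector via
    an attention-weighted mean, which is invariant to block order."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 1))

    def forward(self, blocks: torch.Tensor) -> torch.Tensor:
        # blocks: (num_blocks, dim); weights sum to 1 across blocks
        w = torch.softmax(self.score(blocks), dim=0)  # (num_blocks, 1)
        return (w * blocks).sum(dim=0)                # (dim,)

class Comparator(nn.Module):
    """Scores the similarity of pooled query and support representations."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query: torch.Tensor, support: torch.Tensor) -> torch.Tensor:
        # Relation descriptor: here simply the concatenation of the two
        # pooled vectors (one plausible reading of "simple" descriptors).
        return self.net(torch.cat([query, support], dim=-1))

# Toy usage: a query clip with 10 blocks vs. a support clip with 8 blocks.
pool, comparator = AttentionPool(512), Comparator(512)
q = pool(torch.randn(10, 512))
s = pool(torch.randn(8, 512))
print(comparator(q, s))  # one similarity logit
```

Because the pooled vector is a weighted mean, reordering the blocks permutes the weights identically and leaves the output unchanged; this is what makes the aggregation robust to varying action lengths and shifted hotspots, as the abstract claims.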

 

DC Field                      Value
dc.contributor.author         Zhang, H
dc.contributor.author         Zhang, L
dc.contributor.author         Qi, X
dc.contributor.author         Li, H
dc.contributor.author         Torr, PHS
dc.contributor.author         Koniusz, P
dc.date.accessioned           2020-12-08T07:40:46Z
dc.date.available             2020-12-08T07:40:46Z
dc.date.issued                2020
dc.identifier.citation        Proceedings of the 16th European Conference on Computer Vision (ECCV), Glasgow, UK (online), 23-28 August 2020, pt V, p. 525-542
dc.identifier.isbn            9783030585570
dc.identifier.uri             http://hdl.handle.net/10722/294711
dc.description.abstract       Many few-shot learning models focus on recognising images. In contrast, we tackle the more challenging task of few-shot action recognition from videos. We build on a C3D encoder for spatio-temporal video blocks to capture short-range action patterns. The encoded blocks are aggregated by permutation-invariant pooling, which makes our approach robust to varying action lengths and to long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class. Subsequently, the pooled representations are combined into simple relation descriptors which encode the so-called query and support clips. Finally, the relation descriptors are fed to a comparator with the goal of learning similarity between query and support clips. Importantly, to re-weight block contributions during pooling, we exploit spatial and temporal attention modules and self-supervision. In naturalistic clips (of the same class) there exists a temporal distribution shift: the locations of discriminative temporal action hotspots vary. Thus, we permute the blocks of a clip and align the resulting attention regions with the identically permuted attention regions of the non-permuted clip, which trains an attention mechanism that is invariant to block (and thus long-term hotspot) permutations. Our method outperforms the state of the art on the HMDB51, UCF101, and miniMIT datasets.
dc.language                   eng
dc.publisher                  Springer
dc.relation.ispartof          European Conference on Computer Vision (ECCV) 2020
dc.relation.ispartofseries    Lecture Notes in Computer Science (LNCS); v. 12350
dc.title                      Few-Shot Action Recognition with Permutation-Invariant Attention
dc.type                       Conference_Paper
dc.identifier.email           Qi, X: xjqi@eee.hku.hk
dc.identifier.authority       Qi, X=rp02666
dc.description.nature         link_to_subscribed_fulltext
dc.identifier.doi             10.1007/978-3-030-58558-7_31
dc.identifier.scopus          eid_2-s2.0-85097379613
dc.identifier.hkuros          320336
dc.identifier.volume          pt V
dc.identifier.spage           525
dc.identifier.epage           542
dc.publisher.place            Cham
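The record's abstract also describes a self-supervised objective: permute the blocks of a clip, recompute attention, and align it with the identically permuted attention of the original clip. The sketch below shows one way to write such a loss; the `TemporalScorer` module, the MSE penalty, and the stop-gradient on the target are all assumptions made for this example, not the authors' exact formulation. Note that the scorer must use cross-block context, since a purely per-block scorer would satisfy the alignment by construction and yield a trivially zero loss.

```python
# Hedged sketch of permutation-alignment self-supervision for attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalScorer(nn.Module):
    """Scores each block using temporal context (a 1D conv over the block
    axis), so the alignment loss below is not zero by construction."""
    def __init__(self, dim: int):
        super().__init__()
        self.conv = nn.Conv1d(dim, 1, kernel_size=3, padding=1)

    def forward(self, blocks: torch.Tensor) -> torch.Tensor:
        # blocks: (n, dim) -> Conv1d expects (batch, channels, length)
        x = blocks.t().unsqueeze(0)         # (1, dim, n)
        return self.conv(x).squeeze(0).t()  # (n, 1)

def permutation_alignment_loss(score_fn, blocks: torch.Tensor) -> torch.Tensor:
    """Attention computed on a permuted clip should match the identically
    permuted attention of the original clip."""
    perm = torch.randperm(blocks.size(0))
    att_orig = torch.softmax(score_fn(blocks), dim=0)        # (n, 1)
    att_perm = torch.softmax(score_fn(blocks[perm]), dim=0)  # (n, 1)
    # Stop-gradient on the target is one plausible choice here.
    return F.mse_loss(att_perm, att_orig[perm].detach())

# Toy usage: 10 blocks of 512-dim features from a hypothetical encoder.
scorer = TemporalScorer(512)
loss = permutation_alignment_loss(scorer, torch.randn(10, 512))
loss.backward()  # pushes the attention toward permutation invariance
```

Training with this auxiliary loss encourages the attention head to track content rather than absolute temporal position, which is the stated goal of making the mechanism invariant to block (and thus hotspot) permutations.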
