Video classification via relational feature encoding networks

Zhou, Yao; Feng, Litong; Ren, Jiamin; Qiu, Shi; Li, Jingyu; Luo, Ping

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1145/3134263.3134265
Scopus: eid_2-s2.0-85035790237

Supplementary

Citations:
- Scopus: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Video classification via relational feature encoding networks

Title	Video classification via relational feature encoding networks
Authors	Zhou, Yao; Feng, Litong; Ren, Jiamin; Qiu, Shi; Li, Jingyu; Luo, Ping
Keywords	Video classification; Relational Feature Encoding; Temporal aggregation; Temporal segment networks
Issue Date	2017
Citation	LSVC 2017 - Proceedings of the Workshop on Large-Scale Video Classification Challenge, co-located with MM 2017, 2017, p. 9-13. DOI: http://dx.doi.org/10.1145/3134263.3134265
Abstract	© 2017 Association for Computing Machinery. In this paper, we propose a novel Relational Feature Encoding Network for video classification. The proposed network uses a set of relational functions wired on top of a backbone convolutional neural network (ConvNet) to generate multiple complementary feature streams on the fly, which are then combined by an aggregation module to form a video-level representation for recognition. The relational functions compute new relational features by applying element-wise operations or a simple projection to pairs of raw ConvNet features, and thus encode the underlying temporal dynamics and relationship of contextual frames which are critical for recognizing video contents. In this work, we explore a number of design choices for both the relational functions and the aggregation functions, and evaluate the resulting deep model on a number of video classification benchmarks, including the extended Fudan-Columbia Video dataset, UCF101, and Kinetics. Experimental results demonstrate that our model is not only well-suited for action recognition, but also exhibits promising performance for general videos.
Persistent Identifier	http://hdl.handle.net/10722/273731
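
Illustrative note: the abstract describes relational functions that apply element-wise operations to pairs of frame-level ConvNet features, followed by an aggregation module that forms a video-level representation. The sketch below is only one possible reading of that description, not the authors' implementation. It assumes that pairs are consecutive frames, that the relational functions are an element-wise difference and product, and that aggregation is temporal average pooling followed by concatenation. All function names are hypothetical, and the "simple projection" variant mentioned in the abstract is omitted.

```python
from typing import Callable, List

Vector = List[float]
Relation = Callable[[Vector, Vector], Vector]


def pairwise_difference(a: Vector, b: Vector) -> Vector:
    """Element-wise difference b - a (assumed relational function)."""
    return [y - x for x, y in zip(a, b)]


def pairwise_product(a: Vector, b: Vector) -> Vector:
    """Element-wise product a * b (assumed relational function)."""
    return [x * y for x, y in zip(a, b)]


def relational_stream(frames: List[Vector], relation: Relation) -> List[Vector]:
    """Apply a relation to every pair of consecutive frame features."""
    return [relation(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]


def average_pool(stream: List[Vector]) -> Vector:
    """Average a stream of feature vectors over the temporal axis."""
    n = len(stream)
    return [sum(column) / n for column in zip(*stream)]


def encode_video(frames: List[Vector], relations: List[Relation]) -> Vector:
    """Concatenate the pooled raw stream with each pooled relational stream."""
    if len(frames) < 2:
        raise ValueError("need at least two frames to form a pair")
    video_repr = average_pool(frames)  # raw ConvNet feature stream
    for relation in relations:
        video_repr += average_pool(relational_stream(frames, relation))
    return video_repr


# Toy per-frame features standing in for backbone ConvNet outputs.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(encode_video(frames, [pairwise_difference, pairwise_product]))
# prints [3.0, 4.0, 2.0, 2.0, 9.0, 16.0]
```

In a trained model the concatenated vector would feed a classifier. Here it only shows how the raw stream and each relational stream contribute their own segment of the video-level representation.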

DC Field	Value	Language
dc.contributor.author	Zhou, Yao	-
dc.contributor.author	Feng, Litong	-
dc.contributor.author	Ren, Jiamin	-
dc.contributor.author	Qiu, Shi	-
dc.contributor.author	Li, Jingyu	-
dc.contributor.author	Luo, Ping	-
dc.date.accessioned	2019-08-12T09:56:30Z	-
dc.date.available	2019-08-12T09:56:30Z	-
dc.date.issued	2017	-
dc.identifier.citation	LSVC 2017 - Proceedings of the Workshop on Large-Scale Video Classification Challenge, co-located with MM 2017, 2017, p. 9-13	-
dc.identifier.uri	http://hdl.handle.net/10722/273731	-
dc.description.abstract	© 2017 Association for Computing Machinery. In this paper, we propose a novel Relational Feature Encoding Network for video classification. The proposed network uses a set of relational functions wired on top of a backbone convolutional neural network (ConvNet) to generate multiple complementary feature streams on the fly, which are then combined by an aggregation module to form a video-level representation for recognition. The relational functions compute new relational features by applying element-wise operations or a simple projection to pairs of raw ConvNet features, and thus encode the underlying temporal dynamics and relationship of contextual frames which are critical for recognizing video contents. In this work, we explore a number of design choices for both the relational functions and the aggregation functions, and evaluate the resulting deep model on a number of video classification benchmarks, including the extended Fudan-Columbia Video dataset, UCF101, and Kinetics. Experimental results demonstrate that our model is not only well-suited for action recognition, but also exhibits promising performance for general videos.	-
dc.language	eng	-
dc.relation.ispartof	LSVC 2017 - Proceedings of the Workshop on Large-Scale Video Classification Challenge, co-located with MM 2017	-
dc.subject	Video classification	-
dc.subject	Relational Feature Encoding	-
dc.subject	Temporal aggregation	-
dc.subject	Temporal segment networks	-
dc.title	Video classification via relational feature encoding networks	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1145/3134263.3134265	-
dc.identifier.scopus	eid_2-s2.0-85035790237	-
dc.identifier.spage	9	-
dc.identifier.epage	13	-
