Conference Paper: Fine-Grained Video Categorization with Redundancy Reduction Attention

Title: Fine-Grained Video Categorization with Redundancy Reduction Attention
Authors: Zhu, Chen; Tan, Xiao; Zhou, Feng; Liu, Xiao; Yue, Kaiyu; Ding, Errui; Ma, Yi
Keywords: Attention mechanism; Fine-grained video categorization
Issue Date: 2018
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, v. 11209 LNCS, p. 139-155
Abstract: For fine-grained categorization tasks, videos can serve as a better source than static images because they have a higher chance of containing discriminative patterns. Nevertheless, a video sequence can also contain many redundant and irrelevant frames, so locating the critical information of interest is a challenging task. In this paper, we propose a new network structure, known as Redundancy Reduction Attention (RRA), which learns to focus on multiple discriminative patterns by suppressing redundant feature channels. Specifically, it first summarizes the video by taking a weighted sum of all feature vectors in the feature maps of selected frames, using a spatio-temporal soft attention, and then predicts which channels to suppress or enhance according to this summary with a learned non-linear transform. Suppression is achieved by modulating the feature maps and thresholding out weak activations. The updated feature maps are then used in the next iteration. Finally, the video is classified based on multiple summaries. The proposed method achieves outstanding performance on multiple video classification datasets. Furthermore, we have collected two large-scale video datasets, YouTube-Birds and YouTube-Cars, for future research on fine-grained video categorization. The datasets are available at http://www.cs.umd.edu/~chenzhu/fgvc.
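
The loop described in the abstract (attend, summarize, predict a channel modulation, threshold, repeat, then classify from several summaries) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the tanh-based additive shift, the ReLU threshold, the averaging of summaries before a linear classifier, and the random weights are all assumptions made for the example, since the abstract does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rra_step(features, w_att, w_gate, b_gate):
    """One illustrative redundancy-reduction step (hypothetical form).

    features: N vectors of length C, one per (frame, location) position.
    Returns (attention, summary, updated_features).
    """
    channels = len(features[0])
    # 1. Soft attention over all positions of all selected frames.
    scores = [sum(w * x for w, x in zip(w_att, f)) for f in features]
    attention = softmax(scores)
    # 2. Summary: attention-weighted sum of the feature vectors.
    summary = [sum(a * f[c] for a, f in zip(attention, features))
               for c in range(channels)]
    # 3. Non-linear transform of the summary gives a per-channel shift
    #    in (-1, 1): negative suppresses, positive enhances (assumed form).
    shift = [math.tanh(sum(w * s for w, s in zip(row, summary)) + b)
             for row, b in zip(w_gate, b_gate)]
    # 4. Modulate every position, then ReLU drops weak activations.
    updated = [[max(0.0, f[c] + shift[c]) for c in range(channels)]
               for f in features]
    return attention, summary, updated


def summarize_video(features, step_params):
    """Run one step per parameter set, feeding updated maps forward."""
    summaries = []
    for w_att, w_gate, b_gate in step_params:
        _, summary, features = rra_step(features, w_att, w_gate, b_gate)
        summaries.append(summary)
    return summaries


def classify(summaries, w_cls):
    """Average the summaries, apply a linear classifier, return the argmax."""
    channels = len(summaries[0])
    mean = [sum(s[c] for s in summaries) / len(summaries)
            for c in range(channels)]
    logits = [sum(w * m for w, m in zip(row, mean)) for row in w_cls]
    return max(range(len(logits)), key=logits.__getitem__)


def random_matrix(rng, rows, cols):
    """Random weights standing in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
C, N, STEPS, CLASSES = 4, 6, 3, 5
# Toy non-negative "feature maps": N positions with C channels each.
video = [[abs(v) for v in row] for row in random_matrix(rng, N, C)]
params = [(random_matrix(rng, 1, C)[0], random_matrix(rng, C, C), [0.0] * C)
          for _ in range(STEPS)]
summaries = summarize_video(video, params)
label = classify(summaries, random_matrix(rng, CLASSES, C))
print(len(summaries), label)
```

Each step sees feature maps that the previous step has already modulated, which is what would let later summaries attend to patterns other than the ones already captured. In a trained model the weights would be learned end to end rather than drawn at random.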
Persistent Identifier: http://hdl.handle.net/10722/327209
ISSN: 0302-9743
2023 SCImago Journal Rankings: 0.606
ISI Accession Number ID: WOS:000594216400009

 

DC Field | Value | Language
dc.contributor.author | Zhu, Chen | -
dc.contributor.author | Tan, Xiao | -
dc.contributor.author | Zhou, Feng | -
dc.contributor.author | Liu, Xiao | -
dc.contributor.author | Yue, Kaiyu | -
dc.contributor.author | Ding, Errui | -
dc.contributor.author | Ma, Yi | -
dc.date.accessioned | 2023-03-31T05:29:44Z | -
dc.date.available | 2023-03-31T05:29:44Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2018, v. 11209 LNCS, p. 139-155 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | http://hdl.handle.net/10722/327209 | -
dc.language | eng | -
dc.relation.ispartof | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | -
dc.subject | Attention mechanism | -
dc.subject | Fine-grained video categorization | -
dc.title | Fine-Grained Video Categorization with Redundancy Reduction Attention | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/978-3-030-01228-1_9 | -
dc.identifier.scopus | eid_2-s2.0-85055104808 | -
dc.identifier.volume | 11209 LNCS | -
dc.identifier.spage | 139 | -
dc.identifier.epage | 155 | -
dc.identifier.eissn | 1611-3349 | -
dc.identifier.isi | WOS:000594216400009 | -
