Article: DEFEATnet - A deep conventional image representation for image classification

Title: DEFEATnet - A deep conventional image representation for image classification
Authors: Gao, Shenghua; Duan, Lixin; Tsang, Ivor W.
Keywords: Conventional Image Representation; Deep Architecture; Feature Encoding; Local Max Pooling
Issue Date: 2016
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2016, v. 26, n. 3, p. 494-505
Abstract: To study underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small/medium size data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness for image presentation.
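The layer structure described in the abstract (feature extraction, then feature encoding, then pooling, stacked so that receptive fields grow with depth) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the random codebooks, the rectified dot product used as the encoder, and the 2x2 patch and pooling sizes are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import random
from typing import List

Vector = List[float]
Grid = List[List[Vector]]  # H x W grid of local feature vectors


def extract_patches(grid: Grid, size: int = 2) -> Grid:
    """Feature extraction (illustrative): concatenate the vectors in each
    size x size window, moving with stride 1."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(h - size + 1):
        row = []
        for j in range(w - size + 1):
            vec: Vector = []
            for di in range(size):
                for dj in range(size):
                    vec.extend(grid[i + di][j + dj])
            row.append(vec)
        out.append(row)
    return out


def encode(grid: Grid, codebook: List[Vector]) -> Grid:
    """Feature encoding (assumed stand-in): rectified dot product of each
    local feature with each codeword."""
    return [
        [[max(0.0, sum(a * b for a, b in zip(v, c))) for c in codebook] for v in row]
        for row in grid
    ]


def local_max_pool(grid: Grid, size: int = 2) -> Grid:
    """Local max pooling: per-dimension max over non-overlapping cells."""
    h, w = len(grid), len(grid[0])
    out = []
    for i in range(0, h - size + 1, size):
        row = []
        for j in range(0, w - size + 1, size):
            cell = [grid[i + di][j + dj] for di in range(size) for dj in range(size)]
            row.append([max(vals) for vals in zip(*cell)])
        out.append(row)
    return out


def defeat_layer(grid: Grid, codebook: List[Vector]) -> Grid:
    """One layer: extraction -> encoding -> pooling."""
    return local_max_pool(encode(extract_patches(grid), codebook))


def random_codebook(k: int, dim: int, rng: random.Random) -> List[Vector]:
    """Hypothetical codebook; a real system would learn it from data."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(k)]


rng = random.Random(0)
# Toy 7x7 single-channel "image": each pixel is a 1-dimensional feature.
image: Grid = [[[((i * 7 + j) % 5) / 4.0] for j in range(7)] for i in range(7)]

codebook1 = random_codebook(3, 4, rng)    # 2x2 patches of 1-dim pixels -> 4 dims
out1 = defeat_layer(image, codebook1)     # 3x3 grid of 3-dim vectors

codebook2 = random_codebook(5, 12, rng)   # 2x2 patches of 3-dim vectors -> 12 dims
out2 = defeat_layer(out1, codebook2)      # 1x1 grid of 5-dim vectors
```

With these toy sizes, each vector in `out1` depends on a 3x3 pixel region, while the single vector in `out2` depends on the whole 7x7 input, which is the growth in local receptive field that the abstract attributes to stacking layers.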
Persistent Identifier: http://hdl.handle.net/10722/345213
ISSN: 1051-8215
2023 Impact Factor: 8.3
2023 SCImago Journal Rankings: 2.299

 

DC Field | Value | Language
dc.contributor.author | Gao, Shenghua | -
dc.contributor.author | Duan, Lixin | -
dc.contributor.author | Tsang, Ivor W. | -
dc.date.accessioned | 2024-08-15T09:25:57Z | -
dc.date.available | 2024-08-15T09:25:57Z | -
dc.date.issued | 2016 | -
dc.identifier.citation | IEEE Transactions on Circuits and Systems for Video Technology, 2016, v. 26, n. 3, p. 494-505 | -
dc.identifier.issn | 1051-8215 | -
dc.identifier.uri | http://hdl.handle.net/10722/345213 | -
dc.description.abstract | To study underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small/medium size data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness for image presentation. | -
dc.language | eng | -
dc.relation.ispartof | IEEE Transactions on Circuits and Systems for Video Technology | -
dc.subject | Conventional Image Representation | -
dc.subject | Deep Architecture | -
dc.subject | Feature Encoding | -
dc.subject | Local Max Pooling | -
dc.title | DEFEATnet - A deep conventional image representation for image classification | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/TCSVT.2015.2389413 | -
dc.identifier.scopus | eid_2-s2.0-84964324864 | -
dc.identifier.volume | 26 | -
dc.identifier.issue | 3 | -
dc.identifier.spage | 494 | -
dc.identifier.epage | 505 | -
