Links for fulltext (May Require Subscription):
- Publisher Website: 10.1109/TCSVT.2015.2389413
- Scopus: eid_2-s2.0-84964324864
Article: DEFEATnet - A deep conventional image representation for image classification
Field | Value
---|---
Title | DEFEATnet - A deep conventional image representation for image classification
Authors | Gao, Shenghua; Duan, Lixin; Tsang, Ivor W.
Keywords | Conventional Image Representation; Deep Architecture; Feature Encoding; Local Max Pooling
Issue Date | 2016
Citation | IEEE Transactions on Circuits and Systems for Video Technology, 2016, v. 26, n. 3, p. 494-505
Abstract | To study the underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small/medium-size data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding, and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness of image representation.
Persistent Identifier | http://hdl.handle.net/10722/345213
ISSN | 1051-8215 (2023 Impact Factor: 8.3; 2023 SCImago Journal Rankings: 2.299)
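The abstract above describes each DEFEATnet layer as a stack of three stages: local feature extraction, feature encoding, and local max pooling, with receptive fields that grow from layer to layer. Below is a minimal, hypothetical NumPy sketch of one such layer. The function names, the raw-patch features, the soft-assignment encoder, and all parameter values are illustrative assumptions, not the authors' implementation; the paper's framework can plug in other handcrafted features and encoders.

```python
# Hypothetical sketch of one DEFEATnet-style layer:
# (1) extract local features, (2) encode them against a codebook,
# (3) max-pool over local spatial cells. Names and defaults are illustrative.
import numpy as np

def extract_patches(image, patch_size, stride):
    """Densely extract raw local patches as the handcrafted features."""
    H, W = image.shape
    feats, coords = [], []
    for y in range(0, H - patch_size + 1, stride):
        for x in range(0, W - patch_size + 1, stride):
            patch = image[y:y + patch_size, x:x + patch_size].ravel()
            feats.append(patch / (np.linalg.norm(patch) + 1e-8))  # L2-normalize
            coords.append((y, x))
    return np.array(feats), coords

def encode(feats, codebook):
    """Soft-assignment encoding: positive similarity to each codeword, normalized."""
    sims = np.maximum(feats @ codebook.T, 0.0)            # (n_feats, n_codewords)
    return sims / (sims.sum(axis=1, keepdims=True) + 1e-8)

def max_pool(codes, coords, cell):
    """Local max pooling over spatial cells, giving translation invariance."""
    pooled = {}
    for code, (y, x) in zip(codes, coords):
        key = (y // cell, x // cell)
        pooled[key] = np.maximum(pooled.get(key, code), code)
    return pooled  # dict: spatial cell -> pooled code vector

def defeat_layer(image, codebook, patch_size=6, stride=2, cell=8):
    """One extraction-encoding-pooling layer; deeper layers would repeat this
    on the pooled maps with larger patches, enlarging the receptive field."""
    feats, coords = extract_patches(image, patch_size, stride)
    codes = encode(feats, codebook)
    return max_pool(codes, coords, cell)

# Example usage with random data (purely illustrative):
rng = np.random.default_rng(0)
image = rng.random((64, 64))
codebook = rng.random((128, 6 * 6))   # 128 codewords over 6x6 patches
pooled = defeat_layer(image, codebook)
```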
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Gao, Shenghua | - |
dc.contributor.author | Duan, Lixin | - |
dc.contributor.author | Tsang, Ivor W. | - |
dc.date.accessioned | 2024-08-15T09:25:57Z | - |
dc.date.available | 2024-08-15T09:25:57Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | IEEE Transactions on Circuits and Systems for Video Technology, 2016, v. 26, n. 3, p. 494-505 | - |
dc.identifier.issn | 1051-8215 | - |
dc.identifier.uri | http://hdl.handle.net/10722/345213 | - |
dc.description.abstract | To study the underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small/medium-size data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding, and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness of image representation. | -
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Circuits and Systems for Video Technology | - |
dc.subject | Conventional Image Representation | - |
dc.subject | Deep Architecture | - |
dc.subject | Feature Encoding | - |
dc.subject | Local Max Pooling | - |
dc.title | DEFEATnet - A deep conventional image representation for image classification | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TCSVT.2015.2389413 | - |
dc.identifier.scopus | eid_2-s2.0-84964324864 | - |
dc.identifier.volume | 26 | - |
dc.identifier.issue | 3 | - |
dc.identifier.spage | 494 | - |
dc.identifier.epage | 505 | - |