Links for fulltext (may require subscription):
- Publisher Website: 10.1016/j.neunet.2024.106350
- Scopus: eid_2-s2.0-85192291180
- PMID: 38723309
Article: Any region can be perceived equally and effectively on rotation pretext task using full rotation and weighted-region mixture
Field | Value |
---|---|
Title | Any region can be perceived equally and effectively on rotation pretext task using full rotation and weighted-region mixture |
Authors | Dai, Wei; Wu, Tianyi; Liu, Rui; Wang, Min; Yin, Jianqin; Liu, Jun |
Keywords | Data mixing; Full rotation; Self-supervised learning; Vision impairment |
Issue Date | 2024 |
Citation | Neural Networks, 2024, v. 176, article no. 106350 |
Abstract | In recent years, self-supervised learning has emerged as a powerful approach to learning visual representations without requiring extensive manual annotation. One popular technique involves using rotation transformations of images, which provide a clear visual signal for learning semantic representation. However, in this work, we revisit the pretext task of predicting image rotation in self-supervised learning and discover that it tends to marginalise the perception of features located near the centre of an image. To address this limitation, we propose a new self-supervised learning method, namely FullRot, which spotlights underrated regions by resizing the randomly selected and cropped regions of images. Moreover, FullRot increases the complexity of the rotation pretext task by applying the degree-free rotation to the region cropped into a circle. To encourage models to learn from different general parts of an image, we introduce a new data mixture technique called WRMix, which merges two random intra-image patches. By combining these innovative crop and rotation methods with the data mixture scheme, our approach, FullRot + WRMix, surpasses the state-of-the-art self-supervision methods in classification, segmentation, and object detection tasks on ten benchmark datasets with an improvement of up to +13.98% accuracy on STL-10, +8.56% accuracy on CIFAR-10, +10.20% accuracy on Sports-100, +15.86% accuracy on Mammals-45, +15.15% accuracy on PAD-UFES-20, +32.44% mIoU on VOC 2012, +7.62% mIoU on ISIC 2018, +9.70% mIoU on FloodArea, +25.16% AP50 on VOC 2007, and +58.69% AP50 on UTDAC 2020. The code is available at https://github.com/anthonyweidai/FullRot_WRMix. |
Persistent Identifier | http://hdl.handle.net/10722/350070 |
ISSN | 0893-6080 (2023 Impact Factor: 6.0; 2023 SCImago Journal Rank: 2.605) |
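The authors' implementation is available at the GitHub link in the abstract. Purely as an illustration of the two ideas the abstract describes — a circular crop rotated by an arbitrary (degree-free) angle as the pretext target, and an intra-image weighted patch mixture — here is a minimal pure-Python sketch. All function names (`fullrot_view`, `wrmix`), parameters, and design details below are assumptions for exposition, not the paper's actual method; images are modeled as nested lists of grayscale values.

```python
import math
import random

def circular_crop(region, fill=0):
    """Mask a square region to its inscribed circle; pixels outside -> fill."""
    n = len(region)
    c = (n - 1) / 2.0          # geometric centre of the pixel grid
    r = n / 2.0                # inscribed-circle radius
    out = [[fill] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            if (x - c) ** 2 + (y - c) ** 2 <= r * r:
                out[y][x] = region[y][x]
    return out

def rotate(region, angle_deg, fill=0):
    """Rotate a square image by an arbitrary angle (inverse nearest-neighbour mapping)."""
    n = len(region)
    c = (n - 1) / 2.0
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[fill] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # pull each target pixel from its inverse-rotated source location
            sx = cos_t * (x - c) + sin_t * (y - c) + c
            sy = -sin_t * (x - c) + cos_t * (y - c) + c
            ix, iy = round(sx), round(sy)
            if 0 <= ix < n and 0 <= iy < n:
                out[y][x] = region[iy][ix]
    return out

def fullrot_view(img, crop_size, rng=None):
    """One hypothetical pretext sample: random crop -> circular mask -> free rotation.
    Returns the rotated view and the continuous angle used as the prediction target."""
    rng = rng or random.Random(0)
    h, w = len(img), len(img[0])
    top = rng.randrange(h - crop_size + 1)
    left = rng.randrange(w - crop_size + 1)
    region = [row[left:left + crop_size] for row in img[top:top + crop_size]]
    angle = rng.uniform(0.0, 360.0)  # degree-free, unlike the usual 0/90/180/270
    return rotate(circular_crop(region), angle), angle

def wrmix(img, size, w=0.5, rng=None):
    """Hypothetical WRMix step: blend one random intra-image patch onto another with weight w."""
    rng = rng or random.Random(1)
    h, wd = len(img), len(img[0])
    y1, x1 = rng.randrange(h - size + 1), rng.randrange(wd - size + 1)
    y2, x2 = rng.randrange(h - size + 1), rng.randrange(wd - size + 1)
    out = [row[:] for row in img]  # leave the input image untouched
    for dy in range(size):
        for dx in range(size):
            out[y2 + dy][x2 + dx] = (1 - w) * img[y2 + dy][x2 + dx] + w * img[y1 + dy][x1 + dx]
    return out
```

The circular mask is what makes arbitrary rotation angles well-posed: a rotated square crop would leak the angle through its corners, whereas a disc looks identical under any rotation except for its content.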
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Dai, Wei | - |
dc.contributor.author | Wu, Tianyi | - |
dc.contributor.author | Liu, Rui | - |
dc.contributor.author | Wang, Min | - |
dc.contributor.author | Yin, Jianqin | - |
dc.contributor.author | Liu, Jun | - |
dc.date.accessioned | 2024-10-17T07:02:52Z | - |
dc.date.available | 2024-10-17T07:02:52Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Neural Networks, 2024, v. 176, article no. 106350 | - |
dc.identifier.issn | 0893-6080 | - |
dc.identifier.uri | http://hdl.handle.net/10722/350070 | - |
dc.description.abstract | In recent years, self-supervised learning has emerged as a powerful approach to learning visual representations without requiring extensive manual annotation. One popular technique involves using rotation transformations of images, which provide a clear visual signal for learning semantic representation. However, in this work, we revisit the pretext task of predicting image rotation in self-supervised learning and discover that it tends to marginalise the perception of features located near the centre of an image. To address this limitation, we propose a new self-supervised learning method, namely FullRot, which spotlights underrated regions by resizing the randomly selected and cropped regions of images. Moreover, FullRot increases the complexity of the rotation pretext task by applying the degree-free rotation to the region cropped into a circle. To encourage models to learn from different general parts of an image, we introduce a new data mixture technique called WRMix, which merges two random intra-image patches. By combining these innovative crop and rotation methods with the data mixture scheme, our approach, FullRot + WRMix, surpasses the state-of-the-art self-supervision methods in classification, segmentation, and object detection tasks on ten benchmark datasets with an improvement of up to +13.98% accuracy on STL-10, +8.56% accuracy on CIFAR-10, +10.20% accuracy on Sports-100, +15.86% accuracy on Mammals-45, +15.15% accuracy on PAD-UFES-20, +32.44% mIoU on VOC 2012, +7.62% mIoU on ISIC 2018, +9.70% mIoU on FloodArea, +25.16% AP50 on VOC 2007, and +58.69% AP50 on UTDAC 2020. The code is available at https://github.com/anthonyweidai/FullRot_WRMix. | - |
dc.language | eng | - |
dc.relation.ispartof | Neural Networks | - |
dc.subject | Data mixing | - |
dc.subject | Full rotation | - |
dc.subject | Self-supervised learning | - |
dc.subject | Vision impairment | - |
dc.title | Any region can be perceived equally and effectively on rotation pretext task using full rotation and weighted-region mixture | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1016/j.neunet.2024.106350 | - |
dc.identifier.pmid | 38723309 | - |
dc.identifier.scopus | eid_2-s2.0-85192291180 | - |
dc.identifier.volume | 176 | - |
dc.identifier.spage | article no. 106350 | - |
dc.identifier.epage | article no. 106350 | - |
dc.identifier.eissn | 1879-2782 | - |