SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning

Wang, Hongjun; Vaze, Sagar; Han, Kai

Appears in Collections:
- Statistics & Actuarial Science: Conference papers

Conference Paper: SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning

Title	SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning
Authors	Wang, Hongjun; Vaze, Sagar; Han, Kai
Issue Date	17-May-2024
Abstract	Generalized Category Discovery (GCD) aims to classify unlabelled images from both ‘seen’ and ‘unseen’ classes by transferring knowledge from a set of labelled ‘seen’ class images. A key theme in existing GCD approaches is adapting large-scale pretrained models for the GCD task. An alternative perspective, however, is to adapt the data representation itself for better alignment with the pretrained model. As such, in this paper, we introduce a two-stage adaptation approach termed SPTNet, which iteratively optimizes model parameters (i.e., model finetuning) and data parameters (i.e., prompt learning). Furthermore, we propose a novel spatial prompt tuning method (SPT) which considers the spatial property of image data, enabling the method to better focus on object parts, which can transfer between seen and unseen classes. We thoroughly evaluate SPTNet on standard benchmarks and demonstrate that it outperforms existing GCD methods. Notably, our method achieves an average accuracy of 61.4% on the SSB, surpassing prior state-of-the-art methods by approximately 10%. The improvement is particularly remarkable as our method introduces extra parameters amounting to only 0.039% of those in the backbone architecture.
Persistent Identifier	http://hdl.handle.net/10722/341723
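
The abstract describes alternating between updating model parameters and data parameters (prompts). The following is a minimal toy sketch of that alternating pattern only, not the authors' implementation: the scalar model `y_hat = w * (x + p)`, the additive "prompt" `p`, the stage order, the learning rate, and every function name are assumptions made for illustration.

```python
# Toy sketch of alternating optimization over two parameter groups.
# NOT the SPTNet implementation: the scalar model, the additive "prompt" p,
# the stage order, and all names below are hypothetical stand-ins.

def loss(w, p, data):
    """Mean squared error of the toy model y_hat = w * (x + p)."""
    return sum((w * (x + p) - y) ** 2 for x, y in data) / len(data)

def grad_w(w, p, data):
    """Gradient of the loss with respect to the model parameter w."""
    return sum(2 * (w * (x + p) - y) * (x + p) for x, y in data) / len(data)

def grad_p(w, p, data):
    """Gradient of the loss with respect to the data parameter p."""
    return sum(2 * (w * (x + p) - y) * w for x, y in data) / len(data)

def alternate(data, w=1.0, p=0.0, rounds=20, steps=10, lr=0.01):
    """Alternate gradient steps on p (w frozen) and on w (p frozen)."""
    history = [loss(w, p, data)]
    for _ in range(rounds):
        # Stage A: model parameter frozen, update the data parameter.
        for _ in range(steps):
            p -= lr * grad_p(w, p, data)
        # Stage B: data parameter frozen, update the model parameter.
        for _ in range(steps):
            w -= lr * grad_w(w, p, data)
        history.append(loss(w, p, data))
    return w, p, history

# Synthetic data generated with w = 2.0 and p = 0.5.
data = [(x, 2.0 * (x + 0.5)) for x in (-1.0, 0.0, 1.0, 2.0)]
w, p, history = alternate(data)
```

Here `history` records the loss after each round, so it can be inspected to see how the two alternating stages jointly reduce the objective. In the paper's setting the model parameters would belong to a pretrained backbone and the data parameters to learned image prompts; none of that detail is reproduced here.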

DC Field	Value	Language
dc.contributor.author	Wang, Hongjun	-
dc.contributor.author	Vaze, Sagar	-
dc.contributor.author	Han, Kai	-
dc.date.accessioned	2024-03-20T06:58:34Z	-
dc.date.available	2024-03-20T06:58:34Z	-
dc.date.issued	2024-05-17	-
dc.identifier.uri	http://hdl.handle.net/10722/341723	-
dc.description.abstract	Generalized Category Discovery (GCD) aims to classify unlabelled images from both ‘seen’ and ‘unseen’ classes by transferring knowledge from a set of labelled ‘seen’ class images. A key theme in existing GCD approaches is adapting large-scale pretrained models for the GCD task. An alternative perspective, however, is to adapt the data representation itself for better alignment with the pretrained model. As such, in this paper, we introduce a two-stage adaptation approach termed SPTNet, which iteratively optimizes model parameters (i.e., model finetuning) and data parameters (i.e., prompt learning). Furthermore, we propose a novel spatial prompt tuning method (SPT) which considers the spatial property of image data, enabling the method to better focus on object parts, which can transfer between seen and unseen classes. We thoroughly evaluate SPTNet on standard benchmarks and demonstrate that it outperforms existing GCD methods. Notably, our method achieves an average accuracy of 61.4% on the SSB, surpassing prior state-of-the-art methods by approximately 10%. The improvement is particularly remarkable as our method introduces extra parameters amounting to only 0.039% of those in the backbone architecture.	-
dc.language	eng	-
dc.relation.ispartof	The Twelfth International Conference on Learning Representations (13/05/2024-17/05/2024, Vienna, Austria)	-
dc.title	SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning	-
dc.type	Conference_Paper	-
