Bowen Chen: SSAT-Adapter: Enhancing Vision-Language Model Few-shot Learning with Auxiliary Tasks

<p dir="ltr">Traditional deep learning models often struggle in few-shot learning scenarios, where limited labeled data is available.</p><p dir="ltr">While the Contrastive Language-Image Pre-training (CLIP) model demonstrates impressive zero-shot capabilities, i...


Bibliographic Details
Main Author: Bowen Chen (12156618) (author)
Other Authors: Yun Sing Koh (1221624) (author), Gill Dobbie (1192893) (author)
Published: 2025
Subjects: Computer vision; Vision-Language Models; Few-shot Learning; Auxiliary Learning
Format: Poster (Image)
Version: Published version
DOI: 10.17608/k6.auckland.30380488.v1
Online Access: https://figshare.com/articles/poster/Bowen_Chen_SSAT-Adapter_Enhancing_Vision-Language_Model_Few-shot_Learning_with_Auxiliary_Tasks/30380488
Rights: CC BY 4.0 (open access)
OAI Identifier: oai:figshare.com:article/30380488
Repository: ManaraRepo
Deposited: 2025-10-16T23:52:30Z