Bowen Chen: SSAT-Adapter: Enhancing Vision-Language Model Few-shot Learning with Auxiliary Tasks
Traditional deep learning models often struggle in few-shot learning scenarios, where limited labeled data is available. While the Contrastive Language-Image Pre-training (CLIP) model demonstrates impressive zero-shot capabilities, i...
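| Main Author: | Bowen Chen |
|---|---|
| Other Authors: | Yun Sing Koh, Gill Dobbie |
| Published: | 2025 |
| Subjects: | Computer vision, Vision-Language Models, Few-shot Learning, Auxiliary Learning |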
| _version_ | 1852015728468688896 |
|---|---|
| author | Bowen Chen (12156618) |
| author2 | Yun Sing Koh (1221624); Gill Dobbie (1192893) |
| author2_role | author; author |
| author_facet | Bowen Chen (12156618); Yun Sing Koh (1221624); Gill Dobbie (1192893) |
| author_role | author |
| dc.creator.none.fl_str_mv | Bowen Chen (12156618); Yun Sing Koh (1221624); Gill Dobbie (1192893) |
| dc.date.none.fl_str_mv | 2025-10-16T23:52:30Z |
| dc.identifier.none.fl_str_mv | 10.17608/k6.auckland.30380488.v1 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/poster/Bowen_Chen_SSAT-Adapter_Enhancing_Vision-Language_Model_Few-shot_Learning_with_Auxiliary_Tasks/30380488 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Computer vision; Vision-Language Models; Few-shot Learning; Auxiliary Learning |
| dc.title.none.fl_str_mv | Bowen Chen: SSAT-Adapter: Enhancing Vision-Language Model Few-shot Learning with Auxiliary Tasks |
| dc.type.none.fl_str_mv | Image; Poster; info:eu-repo/semantics/publishedVersion; image |
| description | Traditional deep learning models often struggle in few-shot learning scenarios, where limited labeled data is available. While the Contrastive Language-Image Pre-training (CLIP) model demonstrates impressive zero-shot capabilities, its performance in few-shot scenarios remains limited. Existing methods primarily aim to leverage the limited labeled dataset, but this offers limited potential for improvement. To overcome the limitations of small datasets in few-shot learning, we introduce a novel framework, SSAT-Adapter, that leverages CLIP's language understanding to generate informative auxiliary tasks and improve CLIP's performance and adaptability in few-shot settings. |
| eu_rights_str_mv | openAccess |
| id | Manara_09fee5aeffaf0510befe8e86a00a34bc |
| identifier_str_mv | 10.17608/k6.auckland.30380488.v1 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/30380488 |
| publishDate | 2025 |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Bowen Chen: SSAT-Adapter: Enhancing Vision-Language Model Few-shot Learning with Auxiliary Tasks |
| topic | Computer vision; Vision-Language Models; Few-shot Learning; Auxiliary Learning |