Validation of module effectiveness.

Current visual question answering (VQA) models have limited multimodal feature fusion capabilities and often give inadequate consideration to local information. To address these limitations, this study proposes a multimodal Transformer VQA network based on local and global i...

Bibliographic Details
Main Author: Cuiyang Huang (21647898) (author)
Other Authors: Zihan Hu (15363084) (author)
Published: 2025