Table 1_SinGAN-CBAM: a multi-scale GAN with attention for few-shot plant disease image generation.docx

Introduction<p>To address the limitation in model performance for tea and coffee disease identification caused by scarce and low-quality image samples, this paper proposes a few-shot multi-scale image generation method named SinGAN-CBAM, aiming to enhance the detail fidelity and semantic usabi...

Full description

Saved in:
Bibliographic Details
Main Author: Mengyao Wu (808514) (author)
Other Authors: Xinrui Wang (488330) (author), Rongbiao Ji (18276976) (author), Yadong Li (317794) (author), Zongyuan Lv (22619738) (author), Li Chen (5749) (author), Jianping Yang (322369) (author), Canyu Wang (22434982) (author)
Published: 2025
Subjects:
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Introduction<p>To address the limitation in model performance for tea and coffee disease identification caused by scarce and low-quality image samples, this paper proposes a few-shot multi-scale image generation method named SinGAN-CBAM, aiming to enhance the detail fidelity and semantic usability of generated images.</p>Methods<p>The research data were collected from Kunming, Baoshan, and Pu’er regions in Yunnan Province, covering seven typical diseases affecting both tea and coffee plants. Based on the SinGAN framework as the baseline, we incorporate the Convolutional Block Attention Module (CBAM), which leverages dual-channel and spatial attention mechanisms to strengthen the model’s ability to capture texture, edges, and spatial distribution features of diseased regions. Additionally, a SinGAN-SE model is constructed for comparative analysis to evaluate the improvement brought by channel-wise attention mechanisms. The generated images are validated through classification using a YOLO v8 model to assess their effectiveness in real-world recognition tasks.</p>Results<p>Experimental results demonstrate that SinGAN-CBAM significantly outperforms GAN, Fast-GAN, and the original SinGAN in metrics such as SSIM, PSNR, and Tenengrad, exhibiting superior structural consistency and edge clarity in generating both tea and coffee disease images. Compared with SinGAN-SE, SinGAN-CBAM further improves the naturalness of texture details and lesion distribution, showing particularly notable advantages in generating complex diseases such as rust and leaf miner infestations. Downstream classification results indicate that the YOLOv8 model trained on data generated by SinGAN-CBAM achieves higher precision, recall, and F1-score than those trained with other models, with key category recognition performance approaching or exceeding 0.98.</p>Discussion<p>This study validates the effectiveness of dual-dimensional attention mechanisms in enhancing the quality of agricultural few-shot image generation, providing a high-quality data augmentation solution for intelligent disease identification with promising practical applications.</p>