Drone-Aided Plants Health Monitoring Using Enhanced Vision Transformer

Bibliographic Details
Main Author: Junaid Ahmad Khan (23739870) (author)
Other Authors: Muhammad Asif Khan (7367468) (author), Imen Filali (23739873) (author)
Published: 2025
_version_ 1864513521710006272
dc.date.none.fl_str_mv 2025-12-04T09:00:00Z
dc.identifier.none.fl_str_mv 10.1109/access.2025.3632545
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Drone-Aided_Plants_Health_Monitoring_Using_Enhanced_Vision_Transformer/32033856
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Artificial intelligence
Computer vision and multimedia computation
CNN
disease classification
drones
plants
ViT
Computer architecture
Transformers
Standards
Computational modeling
Accuracy
Biological system modeling
Training
Real-time systems
Feature extraction
dc.title.none.fl_str_mv Drone-Aided Plants Health Monitoring Using Enhanced Vision Transformer
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Precision agriculture demands robust yet efficient models capable of operating on resource-constrained edge devices for real-time plant health monitoring. Existing Vision Transformer (ViT) models often underperform in data-scarce agricultural settings due to their reliance on large-scale pretraining and limited local feature extraction capabilities. In this study, we propose an enhanced lightweight ViT architecture (EViT) for edge devices used in plant disease classification. The core of our approach is a domain-optimized convolutional stem (ConvStem) that replaces the standard patch embedding to enhance local feature extraction, which is particularly effective in data-scarce scenarios. Unlike standard ViT architectures, which rely on large-scale pretraining, the proposed ConvStem-ViT (i.e., EViT) achieves a notable improvement in accuracy and generalization on two benchmark datasets: PlantVillage (a widely used dataset of leaf disease images) and CCMT (a more challenging, real-world crop disease dataset). Comprehensive experiments, including ablation studies using model variant analysis and supported by attention map visualizations, confirm the superiority of the proposed model over standard ViTs and contemporary CNN-based baselines. Our EViT achieves over 94.8% accuracy on PlantVillage and 78.0% on CCMT, outperforming conventional ViTs by up to 13.6%. Computational analysis shows that the model achieves these gains with minimal overhead (5.55 ms/image latency, 1.30 GFLOPs); these efficiency metrics enable real-time inference on UAVs or low-power edge devices, making the model suitable for practical drone-based scouting in agricultural fields.
This work offers a viable framework for scalable, accurate, and efficient plant disease classification on resource-constrained edge devices such as drones.</p><h2 dir="ltr">Other Information</h2><p dir="ltr">Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/deed.en" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2025.3632545" target="_blank">https://dx.doi.org/10.1109/access.2025.3632545</a></p>
eu_rights_str_mv openAccess
id Manara2_a76caebefc3b588cb8107f7970fc4a7f
identifier_str_mv 10.1109/access.2025.3632545
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/32033856
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Drone-Aided Plants Health Monitoring Using Enhanced Vision Transformer