An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm

Natural scene text classification is considered to be a challenging task because of diversified set of image contents, presence of degradations including noise, low contrast/resolution and the random appearance of foreground (font, style, sizes and orientations) and background properties. A...

Bibliographic Details
Main Author: Ghulam Jillani Ansari (16896342) (author)
Other Authors: Jamal Hussain Shah (16896345) (author), Mylene C. Q. Farias (16896348) (author), Muhammad Sharif (7039565) (author), Nauman Qadeer (16896351) (author), Habib Ullah Khan (12024579) (author)
Published: 2021
_version_ 1864513560366809088
author Ghulam Jillani Ansari (16896342)
author2 Jamal Hussain Shah (16896345)
Mylene C. Q. Farias (16896348)
Muhammad Sharif (7039565)
Nauman Qadeer (16896351)
Habib Ullah Khan (12024579)
author2_role author
author
author
author
author
author_facet Ghulam Jillani Ansari (16896342)
Jamal Hussain Shah (16896345)
Mylene C. Q. Farias (16896348)
Muhammad Sharif (7039565)
Nauman Qadeer (16896351)
Habib Ullah Khan (12024579)
author_role author
dc.creator.none.fl_str_mv Ghulam Jillani Ansari (16896342)
Jamal Hussain Shah (16896345)
Mylene C. Q. Farias (16896348)
Muhammad Sharif (7039565)
Nauman Qadeer (16896351)
Habib Ullah Khan (12024579)
dc.date.none.fl_str_mv 2021-04-05T00:00:00Z
dc.identifier.none.fl_str_mv 10.1109/access.2021.3071169
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/An_Optimized_Feature_Selection_Technique_in_Diversified_Natural_Scene_Text_for_Classification_Using_Genetic_Algorithm/24049242
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Artificial intelligence
Computer vision and multimedia computation
Machine learning
Feature extraction
Genetic algorithms
Classification algorithms
Support vector machines
Optimization
Text categorization
Sociology
Genetic algorithm
Natural scene text
Optimal feature selection
SFS
Feature fusion
Feature space dimensionality reduction
dc.title.none.fl_str_mv An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Natural scene text classification is a challenging task because of the diverse image contents, the presence of degradations such as noise and low contrast or resolution, and the random appearance of foreground properties (font, style, size and orientation) and background properties. The high dimensionality of the input image's feature space is a further major problem in such tasks. This work aims to tackle these problems by removing redundant and irrelevant features to improve the generalization properties of the classifier. In other words, it selects a discriminative, high-quality set of features that reduces dimensionality and helps to achieve successful pattern classification. We use a biologically inspired genetic algorithm (GA) because the crossover employed in such algorithms significantly improves the quality of the multimodal discriminative feature set and hence the classification accuracy for diversified natural scene text images. The Support Vector Machine (SVM) algorithm is used for classification, and the average F-score is used as the fitness function and target condition. First, after preprocessing the input images, the whole feature space (population) is built using a multimodal feature representation technique. Second, a feature-level fusion approach is used to combine the features. Third, to improve the average F-score of the classifier, we apply a meta-heuristic optimization technique using a GA for feature selection. The proposed algorithm is tested on five publicly available datasets, and the results are compared with various state-of-the-art methods.
The obtained results show that the proposed algorithm classifies textual and non-textual regions with better accuracy than benchmark state-of-the-art algorithms.</p><h2>Other Information</h2><p>Published in: IEEE Access<br>License: <a href="https://creativecommons.org/licenses/by/4.0/legalcode" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1109/access.2021.3071169" target="_blank">https://dx.doi.org/10.1109/access.2021.3071169</a></p>
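The description above outlines GA-based feature selection scored by an average F-score. The following is a minimal sketch of that idea, not the authors' implementation. Several parts are assumptions made for illustration: a nearest-centroid classifier stands in for the SVM so that the sketch needs only the Python standard library, and the GA operators (truncation selection, single-point crossover, bit-flip mutation, elitism) and all function and parameter names are hypothetical, since the record does not specify them.

```python
import random

def f1_score(y_true, y_pred, positive):
    """F1 for one class, treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def average_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1_score(y_true, y_pred, c) for c in labels) / len(labels)

def centroid_predict(X_train, y_train, X_test, mask):
    """Nearest-centroid classifier on the selected columns (stand-in for the SVM)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    def dist(x, c):
        return sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
    return [min(centroids, key=lambda c: dist(x, c)) for x in X_test]

def fitness(mask, data):
    """Average F-score on the validation split; an empty mask scores 0."""
    if not any(mask):
        return 0.0
    X_train, y_train, X_val, y_val = data
    return average_f1(y_val, centroid_predict(X_train, y_train, X_val, mask))

def ga_select(data, n_features, pop_size=20, generations=30,
              p_mut=0.05, target=1.0, seed=0):
    """Evolve binary feature masks; requires n_features >= 2."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=lambda m: fitness(m, data))
    for _ in range(generations):
        if fitness(best, data) >= target:  # target condition reached
            break
        scored = sorted(pop, key=lambda m: fitness(m, data), reverse=True)
        parents = scored[: pop_size // 2]  # truncation selection
        children = [scored[0][:]]          # elitism: keep the current best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation: each bit flips with probability p_mut
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = children
        best = max(pop, key=lambda m: fitness(m, data))
    return best, fitness(best, data)
```

Here `data` is a tuple `(X_train, y_train, X_val, y_val)` of fused feature vectors and labels, and `ga_select(data, n_features)` returns the best mask found together with its score. Substituting a real SVM for `centroid_predict` would bring the sketch closer to the setup the abstract describes.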
eu_rights_str_mv openAccess
id Manara2_6e1146629015d202d778595f3a667eeb
identifier_str_mv 10.1109/access.2021.3071169
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/24049242
publishDate 2021
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title An Optimized Feature Selection Technique in Diversified Natural Scene Text for Classification Using Genetic Algorithm
topic Information and computing sciences
Artificial intelligence
Computer vision and multimedia computation
Machine learning
Feature extraction
Genetic algorithms
Classification algorithms
Support vector machines
Optimization
Text categorization
Sociology
Genetic algorithm
Natural scene text
Optimal feature selection
SFS
Feature fusion
Feature space dimensionality reduction