A novel deep learning identifier for promoters and their strength using heterogeneous features

<p>Promoters, which are short (50–1500 base-pair) DNA regions, play a critical role in the regulation of gene transcription. Numerous serious diseases, such as cancer, cardiovascular disease, and inflammatory bowel disease, are caused by genetic variations in promoters. Conseque...

Bibliographic Details
Main Author: Aqsa Amjad (22155247) (author)
Other Authors: Saeed Ahmed (417198) (author), Muhammad Kabir (4582228) (author), Muhammad Arif (769250) (author), Tanvir Alam (638619) (author)
Published: 2024
_version_ 1864513540944035840
author Aqsa Amjad (22155247)
author2 Saeed Ahmed (417198)
Muhammad Kabir (4582228)
Muhammad Arif (769250)
Tanvir Alam (638619)
author2_role author
author
author
author
author_facet Aqsa Amjad (22155247)
Saeed Ahmed (417198)
Muhammad Kabir (4582228)
Muhammad Arif (769250)
Tanvir Alam (638619)
author_role author
dc.creator.none.fl_str_mv Aqsa Amjad (22155247)
Saeed Ahmed (417198)
Muhammad Kabir (4582228)
Muhammad Arif (769250)
Tanvir Alam (638619)
dc.date.none.fl_str_mv 2024-08-23T18:00:00Z
dc.identifier.none.fl_str_mv 10.1016/j.ymeth.2024.08.005
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/A_novel_deep_learning_identifier_for_promoters_and_their_strength_using_heterogeneous_features/30023341
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Biological sciences
Genetics
Biomedical and clinical sciences
Cardiovascular medicine and haematology
Oncology and carcinogenesis
Engineering
Biomedical engineering
Computational intelligence
Bioinformatics
Promoters
Convolutional neural network
Bidirectional long short-term memory
dc.title.none.fl_str_mv A novel deep learning identifier for promoters and their strength using heterogeneous features
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p>Promoters, which are short (50–1500 base-pair) DNA regions, play a critical role in the regulation of gene transcription. Numerous serious diseases, such as cancer, cardiovascular disease, and inflammatory bowel disease, are caused by genetic variations in promoters. Consequently, the correct identification and characterization of promoters are important for drug discovery. However, experimental approaches to recognizing promoters and their strengths are costly in terms of money, time, and resources. Computational techniques are therefore highly desirable for characterizing promoters in unannotated genomic data. Here, we designed a bi-layer deep-learning-based predictor named “PROCABLES”, which classifies DNA samples as promoters or non-promoters in the first phase and as strong or weak promoters in the second phase. The proposed method uses five distinct feature types (word2vec, k-spaced nucleotide pairs, trinucleotide propensity-based features, trinucleotide composition, and electron–ion interaction pseudopotentials) to extract hidden patterns from the DNA sequence. A stacked framework is then formed by integrating a convolutional neural network (CNN) with a bidirectional long short-term memory (BiLSTM) network, and the proposed model is trained on these multi-view attributes. Under ten-fold cross-validation, the PROCABLES model achieved accuracies of 0.971 and 0.920 and MCC values of 0.940 and 0.840 for the first and second layers, respectively. These results indicate that the proposed PROCABLES protocol outperforms existing computational predictors targeting promoters and their types. In summary, this research provides useful guidance for large-scale promoter recognition in particular and for other DNA sequence problems in general.</p><h2>Other Information</h2> <p> Published in: Methods<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.ymeth.2024.08.005" target="_blank">https://dx.doi.org/10.1016/j.ymeth.2024.08.005</a></p>
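Three of the sequence descriptors named in the abstract (trinucleotide composition, k-spaced nucleotide pairs, and electron–ion interaction pseudopotentials) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, normalizing counts to frequencies is an assumption, and the EIIP constants are the values commonly cited in the literature rather than values taken from this record.

```python
from itertools import product

NUCLEOTIDES = "ACGT"

# Commonly cited EIIP values for DNA nucleotides (assumed, not from this record).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def trinucleotide_composition(seq):
    """Return 64 normalized frequencies of overlapping 3-mers (AAA ... TTT)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in kmers]


def k_spaced_pairs(seq, k):
    """Return 16 normalized frequencies of nucleotide pairs with k bases between them."""
    seq = seq.upper()
    pairs = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in pairs]


def eiip_encoding(seq):
    """Map each base to its EIIP value; unknown bases are encoded as 0.0."""
    return [EIIP.get(base, 0.0) for base in seq.upper()]
```

For example, `trinucleotide_composition("ACGTACGT")` yields a 64-element vector and `k_spaced_pairs("ACGTACGT", 1)` a 16-element vector; vectors like these could be concatenated or fed as separate views to a downstream model.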
eu_rights_str_mv openAccess
id Manara2_da9bc9a1cfb98da484ff594e038d3be9
identifier_str_mv 10.1016/j.ymeth.2024.08.005
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/30023341
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle A novel deep learning identifier for promoters and their strength using heterogeneous features
Aqsa Amjad (22155247)
Biological sciences
Genetics
Biomedical and clinical sciences
Cardiovascular medicine and haematology
Oncology and carcinogenesis
Engineering
Biomedical engineering
Computational intelligence
Bioinformatics
Promoters
Convolutional neural network
Bidirectional long short-term memory
status_str publishedVersion
title A novel deep learning identifier for promoters and their strength using heterogeneous features
title_full A novel deep learning identifier for promoters and their strength using heterogeneous features
title_fullStr A novel deep learning identifier for promoters and their strength using heterogeneous features
title_full_unstemmed A novel deep learning identifier for promoters and their strength using heterogeneous features
title_short A novel deep learning identifier for promoters and their strength using heterogeneous features
title_sort A novel deep learning identifier for promoters and their strength using heterogeneous features
topic Biological sciences
Genetics
Biomedical and clinical sciences
Cardiovascular medicine and haematology
Oncology and carcinogenesis
Engineering
Biomedical engineering
Computational intelligence
Bioinformatics
Promoters
Convolutional neural network
Bidirectional long short-term memory