A novel deep learning identifier for promoters and their strength using heterogeneous features
Promoters are short DNA regions (50–1500 base pairs) that play a critical role in regulating gene transcription. Numerous serious diseases, such as cancer, cardiovascular disease, and inflammatory bowel disease, are caused by genetic variations in promoters. Conseque...
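| Main author: | Aqsa Amjad |
|---|---|
| Other authors: | Saeed Ahmed, Muhammad Kabir, Muhammad Arif, Tanvir Alam |
| Published: | 2024 |
| Subjects: | Biological sciences, Genetics, Biomedical and clinical sciences, Cardiovascular medicine and haematology, Oncology and carcinogenesis, Engineering, Biomedical engineering, Computational intelligence, Bioinformatics, Promoters, Convolutional neural network, Bidirectional long short-term memory |
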
| _version_ | 1864513540944035840 |
|---|---|
| author | Aqsa Amjad (22155247) |
| author2 | Saeed Ahmed (417198) Muhammad Kabir (4582228) Muhammad Arif (769250) Tanvir Alam (638619) |
| author2_role | author author author author |
| author_facet | Aqsa Amjad (22155247) Saeed Ahmed (417198) Muhammad Kabir (4582228) Muhammad Arif (769250) Tanvir Alam (638619) |
| author_role | author |
| dc.creator.none.fl_str_mv | Aqsa Amjad (22155247) Saeed Ahmed (417198) Muhammad Kabir (4582228) Muhammad Arif (769250) Tanvir Alam (638619) |
| dc.date.none.fl_str_mv | 2024-08-23T18:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1016/j.ymeth.2024.08.005 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/A_novel_deep_learning_identifier_for_promoters_and_their_strength_using_heterogeneous_features/30023341 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biological sciences Genetics Biomedical and clinical sciences Cardiovascular medicine and haematology Oncology and carcinogenesis Engineering Biomedical engineering Computational intelligence Bioinformatics Promoters Convolutional neural network Bidirectional long short-term memory |
| dc.title.none.fl_str_mv | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p>Promoters are short DNA regions (50–1500 base pairs) that play a critical role in regulating gene transcription. Numerous serious diseases, such as cancer, cardiovascular disease, and inflammatory bowel disease, are caused by genetic variations in promoters. Consequently, the correct identification and characterization of promoters is important for drug discovery. However, experimental approaches to recognizing promoters and their strengths are demanding in terms of cost, time, and resources. Computational techniques are therefore highly desirable for characterizing promoters in unannotated genomic data. Here, we designed a bi-layer deep-learning predictor named “PROCABLES”, which identifies promoters among DNA samples in the first phase and classifies them as strong or weak in the second phase. The proposed method uses five distinct feature types (word2vec, k-spaced nucleotide pairs, trinucleotide propensity-based features, trinucleotide composition, and electron–ion interaction pseudopotentials) to extract hidden patterns from the DNA sequence. A stacked framework is then formed by integrating a convolutional neural network (CNN) with a bidirectional long short-term memory (LSTM) network, and the model is trained on these multi-view attributes. Under ten-fold cross-validation, the PROCABLES model achieved accuracies of 0.971 and 0.920 and MCC values of 0.940 and 0.840 for the first and second layers, respectively. These results indicate that PROCABLES outperforms existing advanced computational predictors of promoters and their types. In summary, this research provides useful guidance for large-scale recognition of promoters in particular and for other DNA sequence problems in general.</p><h2>Other Information</h2> <p> Published in: Methods<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.ymeth.2024.08.005" target="_blank">https://dx.doi.org/10.1016/j.ymeth.2024.08.005</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_da9bc9a1cfb98da484ff594e038d3be9 |
| identifier_str_mv | 10.1016/j.ymeth.2024.08.005 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/30023341 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spellingShingle | A novel deep learning identifier for promoters and their strength using heterogeneous features Aqsa Amjad (22155247) Biological sciences Genetics Biomedical and clinical sciences Cardiovascular medicine and haematology Oncology and carcinogenesis Engineering Biomedical engineering Computational intelligence Bioinformatics Promoters Convolutional neural network Bidirectional long short-term memory |
| status_str | publishedVersion |
| title | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| title_full | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| title_fullStr | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| title_full_unstemmed | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| title_short | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| title_sort | A novel deep learning identifier for promoters and their strength using heterogeneous features |
| topic | Biological sciences Genetics Biomedical and clinical sciences Cardiovascular medicine and haematology Oncology and carcinogenesis Engineering Biomedical engineering Computational intelligence Bioinformatics Promoters Convolutional neural network Bidirectional long short-term memory |
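
Two of the feature types named in the description field, trinucleotide composition and k-spaced nucleotide pairs, are commonly computed as normalized counts over a DNA sequence. The sketch below illustrates those common definitions only. The record does not give the exact formulation used in PROCABLES, so the function names, the normalization, and the handling of non-ACGT symbols are all assumptions made for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def trinucleotide_composition(seq):
    """Normalized frequency of each of the 64 trinucleotides.

    Assumption: overlapping windows of length 3, normalized by the number
    of windows that contain only A, C, G, or T.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product(NUCLEOTIDES, repeat=3)}
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:  # skip windows containing N or other symbols
            counts[tri] += 1
            total += 1
    return {tri: (c / total if total else 0.0) for tri, c in counts.items()}


def k_spaced_pair_composition(seq, k):
    """Normalized frequency of the 16 nucleotide pairs separated by k positions.

    Assumption: k = 0 means adjacent nucleotides; a pair is formed from
    positions i and i + k + 1.
    """
    seq = seq.upper()
    counts = {a + b: 0 for a in NUCLEOTIDES for b in NUCLEOTIDES}
    total = 0
    for i in range(len(seq) - k - 1):
        pair = seq[i] + seq[i + k + 1]
        if pair in counts:  # skip pairs containing N or other symbols
            counts[pair] += 1
            total += 1
    return {pair: (c / total if total else 0.0) for pair, c in counts.items()}


if __name__ == "__main__":
    example = "ACGTACGT"  # toy sequence, not taken from the study's data
    tnc = trinucleotide_composition(example)
    ksp = k_spaced_pair_composition(example, k=1)
    print(len(tnc), len(ksp))  # 64 and 16 feature values
```

Each function returns a fixed-length feature vector (64 or 16 values) regardless of sequence length, which is what makes such encodings convenient inputs for a downstream classifier.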