DPI_CDF: druggable protein identifier using cascade deep forest
Background: Drug targets in living organisms play a pivotal role in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Therefore, computatio...
| Main Author: | Muhammad Arif |
|---|---|
| Other Authors: | Ge Fang, Ali Ghulam, Saleh Musleh, Tanvir Alam |
| Published: | 2024 |
| _version_ | 1864513509787697152 |
|---|---|
| author | Muhammad Arif (769250) |
| author2 | Ge Fang (3104241) Ali Ghulam (18321907) Saleh Musleh (15279190) Tanvir Alam (638619) |
| author2_role | author author author author |
| author_facet | Muhammad Arif (769250) Ge Fang (3104241) Ali Ghulam (18321907) Saleh Musleh (15279190) Tanvir Alam (638619) |
| author_role | author |
| dc.creator.none.fl_str_mv | Muhammad Arif (769250) Ge Fang (3104241) Ali Ghulam (18321907) Saleh Musleh (15279190) Tanvir Alam (638619) |
| dc.date.none.fl_str_mv | 2024-04-05T03:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1186/s12859-024-05744-3 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/DPI_CDF_druggable_protein_identifier_using_cascade_deep_forest/26421580 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Biological sciences; Bioinformatics and computational biology; Biomedical and clinical sciences; Pharmacology and pharmaceutical sciences; Druggable proteins; Bioinformatics; PSSM; Physicochemical features; Cascade deep forest |
| dc.title.none.fl_str_mv | DPI_CDF: druggable protein identifier using cascade deep forest |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <h3>Background</h3><p dir="ltr">Drug targets in living organisms play a pivotal role in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Computational methods are therefore highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory.</p><h3>Methods</h3><p dir="ltr">In this study, we developed a novel deep learning-based model, DPI_CDF, for predicting DPs from protein sequence alone. DPI_CDF generates features from evolutionary (i.e., histograms of oriented gradients for position-specific scoring matrix), physicochemical (i.e., component protein sequence representation), and compositional (i.e., normalized qualitative characteristic) properties of the protein sequence. A hierarchical deep forest model then fuses these three encoding schemes to build the proposed model, DPI_CDF.</p><h3>Results</h3><p dir="ltr">Under 10-fold cross-validation, the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, where it achieved a maximum accuracy of 95.01% and an MCC of 0.900. Compared with current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively. We believe DPI_CDF will help the research community identify druggable proteins and accelerate the drug discovery process.</p><h2>Other Information</h2><p dir="ltr">Published in: BMC Bioinformatics<br>License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1186/s12859-024-05744-3" target="_blank">https://dx.doi.org/10.1186/s12859-024-05744-3</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_ac55321a79ca61b78993d513482aaa65 |
| identifier_str_mv | 10.1186/s12859-024-05744-3 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/26421580 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | DPI_CDF: druggable protein identifier using cascade deep forest |
| topic | Biological sciences; Bioinformatics and computational biology; Biomedical and clinical sciences; Pharmacology and pharmaceutical sciences; Druggable proteins; Bioinformatics; PSSM; Physicochemical features; Cascade deep forest |
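
The abstract in this record reports results as accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics are usually computed for a binary classifier, the snippet below works from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the paper or its code.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions among all predictions."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero (a common convention,
    assumed here), since the ratio is otherwise undefined.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's data).
    tp, tn, fp, fn = 45, 45, 5, 5
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes may be unbalanced.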