DPI_CDF: druggable protein identifier using cascade deep forest

<h3>Background</h3><p dir="ltr">Drug targets in living organisms play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Therefore, computatio...

Bibliographic Details
Main Author: Muhammad Arif (769250) (author)
Other Authors: Ge Fang (3104241) (author), Ali Ghulam (18321907) (author), Saleh Musleh (15279190) (author), Tanvir Alam (638619) (author)
Published: 2024
_version_ 1864513509787697152
author Muhammad Arif (769250)
author2 Ge Fang (3104241)
Ali Ghulam (18321907)
Saleh Musleh (15279190)
Tanvir Alam (638619)
author2_role author
author
author
author
author_facet Muhammad Arif (769250)
Ge Fang (3104241)
Ali Ghulam (18321907)
Saleh Musleh (15279190)
Tanvir Alam (638619)
author_role author
dc.creator.none.fl_str_mv Muhammad Arif (769250)
Ge Fang (3104241)
Ali Ghulam (18321907)
Saleh Musleh (15279190)
Tanvir Alam (638619)
dc.date.none.fl_str_mv 2024-04-05T03:00:00Z
dc.identifier.none.fl_str_mv 10.1186/s12859-024-05744-3
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/DPI_CDF_druggable_protein_identifier_using_cascade_deep_forest/26421580
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Biological sciences
Bioinformatics and computational biology
Biomedical and clinical sciences
Pharmacology and pharmaceutical sciences
Druggable proteins
Bioinformatics
PSSM
Physicochemical features
Cascade deep forest
dc.title.none.fl_str_mv DPI_CDF: druggable protein identifier using cascade deep forest
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <h3>Background</h3><p dir="ltr">Drug targets in living organisms play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource-intensive. Therefore, computational methods are highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory.</p><h3>Methods</h3><p dir="ltr">In this study, we developed a novel deep learning-based model, DPI_CDF, for predicting DPs from protein sequence alone. DPI_CDF uses evolutionary-based (i.e., histograms of oriented gradients for position-specific scoring matrix), physicochemical-based (i.e., component protein sequence representation), and compositional-based (i.e., normalized qualitative characteristic) properties of the protein sequence to generate features. A hierarchical deep forest model then fuses these three encoding schemes to build the proposed DPI_CDF model.</p><h3>Results</h3><p dir="ltr">Empirical results from 10-fold cross-validation show that the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, where it achieved a maximum accuracy of 95.01% and an MCC of 0.900. Compared with current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively.
We believe DPI_CDF will help the research community identify druggable proteins and accelerate the drug discovery process.</p><h2>Other Information</h2><p dir="ltr">Published in: BMC Bioinformatics<br>License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1186/s12859-024-05744-3" target="_blank">https://dx.doi.org/10.1186/s12859-024-05744-3</a></p>
eu_rights_str_mv openAccess
id Manara2_ac55321a79ca61b78993d513482aaa65
identifier_str_mv 10.1186/s12859-024-05744-3
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/26421580
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle DPI_CDF: druggable protein identifier using cascade deep forest
Muhammad Arif (769250)
Biological sciences
Bioinformatics and computational biology
Biomedical and clinical sciences
Pharmacology and pharmaceutical sciences
Druggable proteins
Bioinformatics
PSSM
Physicochemical features
Cascade deep forest
status_str publishedVersion
title DPI_CDF: druggable protein identifier using cascade deep forest
title_full DPI_CDF: druggable protein identifier using cascade deep forest
title_fullStr DPI_CDF: druggable protein identifier using cascade deep forest
title_full_unstemmed DPI_CDF: druggable protein identifier using cascade deep forest
title_short DPI_CDF: druggable protein identifier using cascade deep forest
title_sort DPI_CDF: druggable protein identifier using cascade deep forest
topic Biological sciences
Bioinformatics and computational biology
Biomedical and clinical sciences
Pharmacology and pharmaceutical sciences
Druggable proteins
Bioinformatics
PSSM
Physicochemical features
Cascade deep forest