DASSI: differential architecture search for splice identification from DNA sequences

Bibliographic Details
Main Author: Shabir Moosa (14153316) (author)
Other Authors: Prof. Abbes Amira (14153319) (author), Dr. Sabri Boughorbel (14153322) (author)
Published: 2022
_version_ 1864513566578573312
author Shabir Moosa (14153316)
author2 Prof. Abbes Amira (14153319)
Dr. Sabri Boughorbel (14153322)
author2_role author
author
author_facet Shabir Moosa (14153316)
Prof. Abbes Amira (14153319)
Dr. Sabri Boughorbel (14153322)
author_role author
dc.creator.none.fl_str_mv Shabir Moosa (14153316)
Prof. Abbes Amira (14153319)
Dr. Sabri Boughorbel (14153322)
dc.date.none.fl_str_mv 2022-11-22T21:18:13Z
dc.identifier.none.fl_str_mv 10.1186/s13040-021-00237-y
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/DASSI_differential_architecture_search_for_splice_identification_from_DNA_sequences/21598467
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Numerical and computational mathematics
Computational Mathematics
Computational Theory and Mathematics
Computer Science Applications
Genetics
Molecular Biology
Biochemistry
dc.title.none.fl_str_mv DASSI: differential architecture search for splice identification from DNA sequences
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <h2>Background</h2> <p>The data explosion caused by unprecedented advancements in the field of genomics is constantly challenging the conventional methods used in the interpretation of the human genome. The demand for robust algorithms in recent years has brought huge success in the field of Deep Learning (DL) in solving many difficult tasks in image, speech and natural language processing by automating the manual process of architecture design. This has been fueled by the development of new DL architectures. Yet genomics poses unique challenges that require customization and development of new DL models.</p> <h2>Methods</h2> <p>We proposed a new model, DASSI, by adapting a differential architecture search method and applying it to the Splice Site (SS) recognition task on DNA sequences to discover new high-performance convolutional architectures in an automated manner. We evaluated the discovered model against state-of-the-art tools to classify true and false SS in Homo sapiens (Human), Arabidopsis thaliana (Plant), Caenorhabditis elegans (Worm) and Drosophila melanogaster (Fly).</p> <h2>Results</h2> <p>Our experimental evaluation demonstrated that the discovered architecture outperformed baseline models and fixed architectures and showed competitive results against state-of-the-art models used in the classification of splice sites. The proposed model, DASSI, has a compact architecture and showed very good results on a transfer learning task. 
The benchmarking experiments on execution time and precision for the architecture search and evaluation processes showed better performance on recently available GPUs, making it feasible to adopt architecture-search-based methods on large datasets.</p> <h2>Conclusions</h2> <p>We proposed the use of a differential architecture search method (DASSI) to perform SS classification on raw DNA sequences, and discovered new neural network models with a low number of tunable parameters and competitive performance compared with manually engineered architectures. We have extensively benchmarked the DASSI model against other state-of-the-art models and assessed its computational efficiency. The results have shown the high potential of automated architecture search mechanisms for solving various problems in the field of genomics.</p><h2>Other Information</h2> <p> Published in: BioData Mining<br> License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="http://dx.doi.org/10.1186/s13040-021-00237-y" target="_blank">http://dx.doi.org/10.1186/s13040-021-00237-y</a></p>
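The abstract does not spell out how the architecture search works, so the following is only a minimal sketch of the general idea, assuming the method follows the usual continuous relaxation used in differentiable architecture search: each edge of the network computes a softmax-weighted mixture of candidate operations, and the strongest operation is kept once the search ends. The candidate operations, the function names (`one_hot`, `mixed_op`, `derive_op`) and the toy input are illustrative assumptions and are not taken from the DASSI paper or its code.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def softmax(alphas):
    """Numerically stable softmax over a list of architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def identity(x):
    return list(x)


def _windows3(x):
    """Yield length-3 windows centred on each position, repeating edge values."""
    n = len(x)
    for i in range(n):
        yield [x[max(i - 1, 0)], x[i], x[min(i + 1, n - 1)]]


def avg_pool3(x):
    return [sum(w) / 3.0 for w in _windows3(x)]


def max_pool3(x):
    return [max(w) for w in _windows3(x)]


# Hypothetical candidate set; a real search space would also include convolutions.
CANDIDATE_OPS = {"identity": identity, "avg_pool3": avg_pool3, "max_pool3": max_pool3}


def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate operations applied to one channel."""
    weights = softmax(alphas)
    out = [0.0] * len(x)
    for w, op in zip(weights, CANDIDATE_OPS.values()):
        out = [o + w * v for o, v in zip(out, op(x))]
    return out


def derive_op(alphas):
    """After the search, keep the candidate with the largest architecture parameter."""
    names = list(CANDIDATE_OPS)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]


if __name__ == "__main__":
    encoded = one_hot("GTAAGT")                # toy donor-like motif, illustrative only
    g_channel = [row[2] for row in encoded]    # the "G" channel of the encoding
    print(mixed_op(g_channel, [0.0, 0.0, 0.0]))
    print(derive_op([0.1, 2.0, -1.0]))
```

In a full implementation the architecture parameters would be learned by gradient descent jointly with the network weights; here they are fixed by hand purely to show the mixing and the final selection step.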
eu_rights_str_mv openAccess
id Manara2_d7dd308a83e22429149cc5957ed495af
identifier_str_mv 10.1186/s13040-021-00237-y
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/21598467
publishDate 2022
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title DASSI: differential architecture search for splice identification from DNA sequences
topic Numerical and computational mathematics
Computational Mathematics
Computational Theory and Mathematics
Computer Science Applications
Genetics
Molecular Biology
Biochemistry