Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog inference results in algae.docx

Bibliographic Details
Main Author: Mary-Francis LaPorte (20821568) (author)
Other Authors: Neha Arora (1693135) (author), Struan Clark (18706753) (author), Ambarish Nag (180001) (author)
Published: 2025
_version_ 1852022338289139712
author Mary-Francis LaPorte (20821568)
author2 Neha Arora (1693135)
Struan Clark (18706753)
Ambarish Nag (180001)
author2_role author
author
author
author_facet Mary-Francis LaPorte (20821568)
Neha Arora (1693135)
Struan Clark (18706753)
Ambarish Nag (180001)
author_role author
dc.creator.none.fl_str_mv Mary-Francis LaPorte (20821568)
Neha Arora (1693135)
Struan Clark (18706753)
Ambarish Nag (180001)
dc.date.none.fl_str_mv 2025-03-04T05:17:04Z
dc.identifier.none.fl_str_mv 10.3389/fmicb.2025.1541898.s001
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Data_Sheet_1_AlgaeOrtho_a_bioinformatics_tool_for_processing_ortholog_inference_results_in_algae_docx/28530071
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Microbiology
bioengineering
algae
metabolic engineering
bioinformatics
nutraceuticals
protein orthology
dc.title.none.fl_str_mv Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog inference results in algae.docx
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Introduction: Microalgae are a prominent feedstock for producing biofuels and biochemicals by virtue of their prolific reproduction, high bioproduct accumulation, and ability to grow in brackish and saline water. However, naturally occurring wild-type algal strains are rarely optimal for industrial use; bioengineering of algae is therefore necessary to generate superior-performing strains that can address production challenges in industrial settings, particularly in the bioenergy and bioproduct sectors. One of the crucial steps in this process is deciding on a bioengineering target: namely, which gene/protein to differentially express. These targets are often orthologs, which are defined as genes/proteins originating from a common ancestor in divergent species. Although bioinformatics tools for identifying protein orthologs already exist, processing the output from such tools is nontrivial, especially for a researcher with little or no bioinformatics experience.
Methods: The present study introduces AlgaeOrtho, a user-friendly tool that builds upon the SonicParanoid orthology inference tool (which uses an algorithm that identifies potential protein orthologs from amino acid sequences) and the PhycoCosm database from JGI (Joint Genome Institute) to help researchers identify orthologs of their proteins of interest in multiple diverse algal species.
Results: The output of this application includes a table of the putative orthologs of the user's protein of interest, a heatmap showing sequence similarity (%), and an unrooted tree of the putative protein orthologs. Notably, the tool would be instrumental in identifying novel bioengineering targets in different algal strains, including targets in algal species that are not fully annotated, since it does not depend on existing protein annotations. We tested AlgaeOrtho using three case studies, in which orthologs of proteins relevant to bioengineering targets were identified from diverse algal species, demonstrating its ease of use and utility for bioengineering researchers.
Discussion: This tool is unique in the protein ortholog identification space because it can visualize putative orthologs, as desired by the user, across several algal species.
eu_rights_str_mv openAccess
id Manara_8cc6b9165f75a73cb73fef01ac97483d
identifier_str_mv 10.3389/fmicb.2025.1541898.s001
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/28530071
publishDate 2025
rights_invalid_str_mv CC BY 4.0
spellingShingle Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog inference results in algae.docx
Mary-Francis LaPorte (20821568)
Microbiology
bioengineering
algae
metabolic engineering
bioinformatics
nutraceuticals
protein orthology
status_str publishedVersion
title Data Sheet 1_AlgaeOrtho, a bioinformatics tool for processing ortholog inference results in algae.docx
topic Microbiology
bioengineering
algae
metabolic engineering
bioinformatics
nutraceuticals
protein orthology