Transfer learning for genotype–phenotype prediction using deep learning models

Bibliographic Details
Main Author: Feng, Samuel (author)
Other Authors: Muneeb, Muhammad (author), Henschel, Andreas (author)
Published: 2022
Subjects: Bioinformatics; Genotype-phenotype; Transfer learning; Deep learning; Genetics
Online Access: https://depot.sorbonne.ae/handle/20.500.12458/1344
dc.creator.none.fl_str_mv Feng, Samuel
Muneeb, Muhammad
Henschel, Andreas
dc.date.none.fl_str_mv 2022
2023-01-03T06:37:20Z
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv 10.1186/s12859-022-05036-8
1471-2105
https://depot.sorbonne.ae/handle/20.500.12458/1344
dc.language.none.fl_str_mv en
dc.relation.none.fl_str_mv BMC Bioinformatics
dc.subject.none.fl_str_mv Bioinformatics
Genotype-phenotype
Transfer learning
Deep learning
Genetics
dc.title.none.fl_str_mv Transfer learning for genotype–phenotype prediction using deep learning models
dc.type.none.fl_str_mv Controlled Vocabulary for Resource Type Genres::text::periodical::journal::contribution to journal::journal article
description Background: For some understudied populations, too little genotype data is available for reliable genotype-phenotype prediction. However, data from larger populations can be used to learn about disease-causing SNPs, and that knowledge can then be applied to genotype-phenotype prediction in small populations. This manuscript illustrates that transfer learning is applicable to genotype data and genotype-phenotype prediction. Results: Using HAPGEN2 and PhenotypeSimulator, we generated eight phenotypes for 500 cases/500 controls (CEU, large population) and 100 cases/100 controls (YRI, small population). To evaluate the proposed method, we considered 5 risk SNPs for four of the phenotypes and 10 risk SNPs for the other four. Transfer learning improved accuracy by between 2 and 14.2 percent across the eight phenotypes. The two-tailed p-value comparing classification accuracies without and with transfer learning was 0.0306 for the five-risk-SNP phenotypes and 0.0478 for the ten-risk-SNP phenotypes. Conclusion: The proposed pipeline transfers knowledge for case/control classification of the small population. In addition, we argue that this method can also be used for endangered species and in personalized medicine. If the large-population data is extensive compared with the small-population data, transfer learning results can be expected to improve significantly. We show that transfer learning can create powerful models for genotype-phenotype prediction in large, well-studied populations and fine-tune these models to populations where data is sparse.
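The pretrain-then-fine-tune idea in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: a plain logistic regression stands in for their deep learning models, a toy simulator (`make_population`, hypothetical) stands in for HAPGEN2 and PhenotypeSimulator, and the effect sizes, SNP count, learning rate, and epoch counts are all assumptions chosen for illustration. Only the sample sizes (1000 source individuals, 200 target individuals) echo the record.

```python
import math
import random


def make_population(n, effects, rng):
    """Toy simulator (assumption): genotypes coded 0/1/2, binary case/control label."""
    X, y = [], []
    for _ in range(n):
        genotype = [rng.choice((0, 1, 2)) for _ in effects]
        score = sum(e * (g - 1) for e, g in zip(effects, genotype))
        prob_case = 1.0 / (1.0 + math.exp(-score))
        X.append(genotype)
        y.append(1 if rng.random() < prob_case else 0)
    return X, y


def predict_prob(w, genotype):
    """Probability of being a case; w holds one weight per SNP plus a trailing bias."""
    z = w[-1] + sum(wi * (g - 1) for wi, g in zip(w, genotype))
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, weights=None, epochs=30, lr=0.05):
    """Logistic regression by SGD. Passing `weights` warm-starts from a prior model."""
    w = list(weights) if weights is not None else [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for genotype, label in zip(X, y):
            err = predict_prob(w, genotype) - label
            for i, g in enumerate(genotype):
                w[i] -= lr * err * (g - 1)
            w[-1] -= lr * err
    return w


def accuracy(w, X, y):
    """Fraction of individuals whose case/control label is predicted correctly."""
    hits = sum((predict_prob(w, g) >= 0.5) == bool(label) for g, label in zip(X, y))
    return hits / len(y)


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumption: 5 risk SNPs among 20, with weaker effects in the target population.
    source_effects = [1.2, -1.0, 0.8, 0.9, -0.7] + [0.0] * 15
    target_effects = [0.8 * e for e in source_effects]

    X_src, y_src = make_population(1000, source_effects, rng)  # large population
    X_tgt, y_tgt = make_population(200, target_effects, rng)   # small population
    X_test, y_test = make_population(500, target_effects, rng)

    pretrained = train(X_src, y_src)                                # learn on source
    fine_tuned = train(X_tgt, y_tgt, weights=pretrained, epochs=5)  # adapt to target
    scratch = train(X_tgt, y_tgt, epochs=5)                         # no transfer

    print("without transfer:", accuracy(scratch, X_test, y_test))
    print("with transfer:   ", accuracy(fine_tuned, X_test, y_test))
```

The only difference between the two target models is the starting point: `scratch` begins from zero weights, while `fine_tuned` begins from the weights learned on the larger source population. Whether the warm start helps in practice depends on how similar the risk SNPs and effect sizes are across the two populations.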
id sorbonner_21c0ea08ecdf9500eb76943fce4fdc07
network_acronym_str sorbonner
network_name_str Sorbonne University Abu Dhabi repository
oai_identifier_str oai:depot.sorbonne.ae:20.500.12458/1344