Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees

Genotype imputation is widely used in genome-wide association studies to boost variant density, allowing increased power in association testing. Many studies currently include pedigree data due to increasing interest in rare variants coupled with the availability of appr...

Bibliographic Details
Main Author:	Ehsan Ullah (2698921) (author)
Other Authors:	Raghvendra Mall (581171) (author), Mostafa M. Abbas (17058093) (author), Khalid Kunji (828224) (author), Alejandro Q. Nato (18619228) (author), Halima Bensmail (10400) (author), Ellen M. Wijsman (18619231) (author), Mohamad Saad (214545) (author)
Published:	2018
Subjects:	Biological sciences; Genetics; Variant density; Association testing; Pedigree data; Rare variants; Population-based imputation; Family-based imputation; Ped_Pop method

_version_	1864513520778870784
author	Ehsan Ullah (2698921)
author2	Raghvendra Mall (581171) Mostafa M. Abbas (17058093) Khalid Kunji (828224) Alejandro Q. Nato (18619228) Halima Bensmail (10400) Ellen M. Wijsman (18619231) Mohamad Saad (214545)
author2_role	author author author author author author author
author_facet	Ehsan Ullah (2698921) Raghvendra Mall (581171) Mostafa M. Abbas (17058093) Khalid Kunji (828224) Alejandro Q. Nato (18619228) Halima Bensmail (10400) Ellen M. Wijsman (18619231) Mohamad Saad (214545)
author_role	author
dc.creator.none.fl_str_mv	Ehsan Ullah (2698921) Raghvendra Mall (581171) Mostafa M. Abbas (17058093) Khalid Kunji (828224) Alejandro Q. Nato (18619228) Halima Bensmail (10400) Ellen M. Wijsman (18619231) Mohamad Saad (214545)
dc.date.none.fl_str_mv	2018-12-04T03:00:00Z
dc.identifier.none.fl_str_mv	10.1101/gr.236315.118
dc.relation.none.fl_str_mv	https://figshare.com/articles/journal_contribution/Comparison_and_assessment_of_family-_and_population-based_genotype_imputation_methods_in_large_pedigrees/25908358
dc.rights.none.fl_str_mv	CC BY 4.0 info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv	Biological sciences; Genetics; Variant density; Association testing; Pedigree data; Rare variants; Population-based imputation; Family-based imputation; Ped_Pop method
dc.title.none.fl_str_mv	Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees
dc.type.none.fl_str_mv	Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal
description	Genotype imputation is widely used in genome-wide association studies to boost variant density, allowing increased power in association testing. Many studies currently include pedigree data due to increasing interest in rare variants coupled with the availability of appropriate analysis tools. The performance of population-based (subjects are unrelated) imputation methods is well established. However, the performance of family- and population-based imputation methods on family data has been subject to much less scrutiny. Here, we extensively compare several family- and population-based imputation methods on family data of large pedigrees with both European and African ancestry. Our comparison includes many widely used family- and population-based tools and another method, Ped_Pop, which combines family- and population-based imputation results. We also compare four subject selection strategies for full sequencing to serve as the reference panel for imputation: GIGI-Pick, ExomePicks, PRIMUS, and random selection. Moreover, we compare two imputation accuracy metrics: the Imputation Quality Score and Pearson's correlation R² for predicting power of association analysis using imputation results. Our results show that (1) GIGI outperforms Merlin; (2) family-based imputation outperforms population-based imputation for rare variants but not for common ones; (3) combining family- and population-based imputation outperforms all imputation approaches for all minor allele frequencies; (4) GIGI-Pick gives the best selection strategy based on the R² criterion; and (5) R² is the best measure of imputation accuracy. Our study is the first to extensively evaluate the imputation performance of many available family- and population-based tools on the same family data and provides guidelines for future studies. Other Information: Published in: Genome Research; License: https://creativecommons.org/licenses/by/4.0/; See article on publisher's website: https://dx.doi.org/10.1101/gr.236315.118
eu_rights_str_mv	openAccess
id	Manara2_f7693b94d6da8be2cf4494567748f25c
identifier_str_mv	10.1101/gr.236315.118
network_acronym_str	Manara2
network_name_str	Manara2
oai_identifier_str	oai:figshare.com:article/25908358
publishDate	2018
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv	CC BY 4.0
status_str	publishedVersion
title	Comparison and assessment of family- and population-based genotype imputation methods in large pedigrees
topic	Biological sciences; Genetics; Variant density; Association testing; Pedigree data; Rare variants; Population-based imputation; Family-based imputation; Ped_Pop method


Similar Items