Fast and robust invariant generalized linear models

Statistical integration of diverse data sources is an essential step in the building of generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the...

Bibliographic details
Main author: Parker Knight (22683015) (author)
Other authors: Ndey Isatou Jobe (22683018) (author), Rui Duan (561668) (author)
Published: 2025
_version_ 1849927630053179392
author Parker Knight (22683015)
author2 Ndey Isatou Jobe (22683018)
Rui Duan (561668)
author2_role author
author
author_facet Parker Knight (22683015)
Ndey Isatou Jobe (22683018)
Rui Duan (561668)
author_role author
dc.creator.none.fl_str_mv Parker Knight (22683015)
Ndey Isatou Jobe (22683018)
Rui Duan (561668)
dc.date.none.fl_str_mv 2025-11-25T17:40:16Z
dc.identifier.none.fl_str_mv 10.6084/m9.figshare.30713061.v1
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Fast_and_robust_invariant_generalized_linear_models/30713061
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Medicine
Genetics
Molecular Biology
Physiology
Infectious Diseases
Computational Biology
Biological Sciences not elsewhere classified
Mathematical Sciences not elsewhere classified
Information Systems not elsewhere classified
data integration
optimization
electronic health records
risk prediction
dc.title.none.fl_str_mv Fast and robust invariant generalized linear models
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Statistical integration of diverse data sources is an essential step in building generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the outcome identically across all possible environments. Existing methods for estimating invariant effects either incur immense computational costs or offer good statistical performance only under strict assumptions. In this work, we provide a general framework for estimation under the invariant features model that is computationally efficient and statistically flexible. We also provide a robust extension of our proposed method to protect against possibly corrupted or misspecified data sources. We demonstrate the robustness properties of our method via simulations, and use it to build a transferable prediction model for end-stage renal disease using electronic health records from the All of Us research program.
eu_rights_str_mv openAccess
id Manara_563df5098dab0f153156ba32458013a4
identifier_str_mv 10.6084/m9.figshare.30713061.v1
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/30713061
publishDate 2025
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Fast and robust invariant generalized linear models