Fast and robust invariant generalized linear models
Statistical integration of diverse data sources is an essential step in the building of generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the...
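| Main Author: | Parker Knight |
|---|---|
| Other Authors: | Ndey Isatou Jobe, Rui Duan |
| Published: | 2025 |
| Subjects: | Medicine; Genetics; Molecular Biology; Physiology; Infectious Diseases; Computational Biology; Biological Sciences not elsewhere classified; Mathematical Sciences not elsewhere classified; Information Systems not elsewhere classified; data integration; optimization; electronic health records; risk prediction |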
| _version_ | 1849927630053179392 |
|---|---|
| author | Parker Knight (22683015) |
| author2 | Ndey Isatou Jobe (22683018) Rui Duan (561668) |
| author2_role | author author |
| author_facet | Parker Knight (22683015) Ndey Isatou Jobe (22683018) Rui Duan (561668) |
| author_role | author |
| dc.creator.none.fl_str_mv | Parker Knight (22683015) Ndey Isatou Jobe (22683018) Rui Duan (561668) |
| dc.date.none.fl_str_mv | 2025-11-25T17:40:16Z |
| dc.identifier.none.fl_str_mv | 10.6084/m9.figshare.30713061.v1 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/dataset/Fast_and_robust_invariant_generalized_linear_models/30713061 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Medicine Genetics Molecular Biology Physiology Infectious Diseases Computational Biology Biological Sciences not elsewhere classified Mathematical Sciences not elsewhere classified Information Systems not elsewhere classified data integration optimization electronic health records risk prediction |
| dc.title.none.fl_str_mv | Fast and robust invariant generalized linear models |
| dc.type.none.fl_str_mv | Dataset info:eu-repo/semantics/publishedVersion dataset |
| description | <p>Statistical integration of diverse data sources is an essential step in the building of generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the outcome identically across all possible environments. Existing methods for estimating invariant effects suffer from immense computational costs or only offer good statistical performance under strict assumptions. In this work, we provide a general framework for estimation under the invariant features model that is computationally efficient and statistically flexible. We also provide a robust extension of our proposed method to protect against possibly corrupted or misspecified data sources. We demonstrate the robust properties of our method via simulations, and use it to build a transferable prediction model for end stage renal disease using electronic health records from the All of Us research program.</p> |
| eu_rights_str_mv | openAccess |
| id | Manara_563df5098dab0f153156ba32458013a4 |
| identifier_str_mv | 10.6084/m9.figshare.30713061.v1 |
| network_acronym_str | Manara |
| network_name_str | ManaraRepo |
| oai_identifier_str | oai:figshare.com:article/30713061 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spelling | Fast and robust invariant generalized linear modelsParker Knight (22683015)Ndey Isatou Jobe (22683018)Rui Duan (561668)MedicineGeneticsMolecular BiologyPhysiologyInfectious DiseasesComputational BiologyBiological Sciences not elsewhere classifiedMathematical Sciences not elsewhere classifiedInformation Systems not elsewhere classifieddata integrationoptimizationelectronic health recordsrisk prediction<p>Statistical integration of diverse data sources is an essential step in the building of generalizable prediction tools, especially in precision health. The invariant features model is a new paradigm for multi-source data integration which posits that a small number of covariates affect the outcome identically across all possible environments. Existing methods for estimating invariant effects suffer from immense computational costs or only offer good statistical performance under strict assumptions. In this work, we provide a general framework for estimation under the invariant features model that is computationally efficient and statistically flexible. We also provide a robust extension of our proposed method to protect against possibly corrupted or misspecified data sources. We demonstrate the robust properties of our method via simulations, and use it to build a transferable prediction model for end stage renal disease using electronic health records from the All of Us research program.</p>2025-11-25T17:40:16ZDatasetinfo:eu-repo/semantics/publishedVersiondataset10.6084/m9.figshare.30713061.v1https://figshare.com/articles/dataset/Fast_and_robust_invariant_generalized_linear_models/30713061CC BY 4.0info:eu-repo/semantics/openAccessoai:figshare.com:article/307130612025-11-25T17:40:16Z |
| spellingShingle | Fast and robust invariant generalized linear models Parker Knight (22683015) Medicine Genetics Molecular Biology Physiology Infectious Diseases Computational Biology Biological Sciences not elsewhere classified Mathematical Sciences not elsewhere classified Information Systems not elsewhere classified data integration optimization electronic health records risk prediction |
| status_str | publishedVersion |
| title | Fast and robust invariant generalized linear models |
| title_full | Fast and robust invariant generalized linear models |
| title_fullStr | Fast and robust invariant generalized linear models |
| title_full_unstemmed | Fast and robust invariant generalized linear models |
| title_short | Fast and robust invariant generalized linear models |
| title_sort | Fast and robust invariant generalized linear models |
| topic | Medicine Genetics Molecular Biology Physiology Infectious Diseases Computational Biology Biological Sciences not elsewhere classified Mathematical Sciences not elsewhere classified Information Systems not elsewhere classified data integration optimization electronic health records risk prediction |