Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
Effective modeling of patient representation from electronic health records (EHRs) is increasingly becoming a vital research topic. Yet, modeling the non-stationarity in EHR data has received less attention. Most existing studies follow a strong assumption of stationarit...
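Saved in:
| Main author: | Rawan AlSaad |
|---|---|
| Other authors: | Qutaibah Malluhi, Alaa Abd-alrazaq, Sabri Boughorbel |
| Published in: | 2024 |
| Subjects: | Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Data management and data science; Machine learning; Self-attention; Electronic health records; Time series prediction; Non-stationary kernel; Temporal model |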
| _version_ | 1864513509859000320 |
|---|---|
| author | Rawan AlSaad (14159019) |
| author2 | Qutaibah Malluhi (3158757); Alaa Abd-alrazaq (17058018); Sabri Boughorbel (846228) |
| author2_role | author; author; author |
| author_facet | Rawan AlSaad (14159019); Qutaibah Malluhi (3158757); Alaa Abd-alrazaq (17058018); Sabri Boughorbel (846228) |
| author_role | author |
| dc.creator.none.fl_str_mv | Rawan AlSaad (14159019); Qutaibah Malluhi (3158757); Alaa Abd-alrazaq (17058018); Sabri Boughorbel (846228) |
| dc.date.none.fl_str_mv | 2024-02-13T09:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1016/j.artmed.2024.102802 |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Temporal_self-attention_for_risk_prediction_from_electronic_health_records_using_non-stationary_kernel_approximation/26395630 |
| dc.rights.none.fl_str_mv | CC BY 4.0; info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Data management and data science; Machine learning; Self-attention; Electronic health records; Time series prediction; Non-stationary kernel; Temporal model |
| dc.title.none.fl_str_mv | Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation |
| dc.type.none.fl_str_mv | Text; Journal contribution; info:eu-repo/semantics/publishedVersion; text; contribution to journal |
| description | Effective modeling of patient representation from electronic health records (EHRs) is becoming a vital research topic. Yet modeling the non-stationarity in EHR data has received less attention. Most existing studies make a strong assumption of stationarity in patient representation from EHRs. In practice, however, a patient's visits are irregularly spaced over a relatively long period of time, and disease progression patterns exhibit non-stationarity. Furthermore, the time gaps between patient visits often encapsulate significant domain knowledge, potentially revealing undiscovered patterns that characterize specific medical conditions. To address these challenges, we introduce a new method that combines the self-attention mechanism with non-stationary kernel approximation to capture both contextual information and temporal relationships between patient visits in EHRs. To assess the effectiveness of our proposed approach, we use two real-world EHR datasets, comprising a total of 76,925 patients, for the task of predicting the next diagnosis code for a patient, given their EHR history. The first dataset is a general EHR cohort and consists of 11,451 patients with a total of 3,485 unique diagnosis codes. The second dataset is a disease-specific cohort that includes 65,474 pregnant patients and encompasses a total of 9,782 unique diagnosis codes. Our experimental evaluation involved nine prediction models, categorized into three distinct groups. Group 1 comprises the baselines: the original self-attention model with positional encoding, the RETAIN model, and an LSTM model. Group 2 includes models employing self-attention with stationary kernel approximations, specifically incorporating three variations of Bochner's feature maps. Lastly, Group 3 consists of models utilizing self-attention with non-stationary kernel approximations, including quadratic, cubic, and bi-quadratic polynomials. The experimental results demonstrate that non-stationary kernels significantly outperformed baseline methods on the NDCG@10 and Hit@10 metrics in both datasets. The performance boost was more substantial in dataset 1 for the NDCG@10 metric. Stationary kernels, on the other hand, showed significant but smaller gains over baselines and were nearly as effective as non-stationary kernels for Hit@10 in dataset 2. These findings robustly validate the efficacy of employing non-stationary kernels for temporal modeling of EHR data, and emphasize the importance of modeling non-stationary temporal information in healthcare prediction tasks. Other Information: Published in: Artificial Intelligence in Medicine; License: http://creativecommons.org/licenses/by/4.0/ ; See article on publisher's website: https://dx.doi.org/10.1016/j.artmed.2024.102802 |
| eu_rights_str_mv | openAccess |
| id | Manara2_afe87f2b4587c834b8e82a4e3ef906b6 |
| identifier_str_mv | 10.1016/j.artmed.2024.102802 |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/26395630 |
| publishDate | 2024 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| spellingShingle | Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation; Rawan AlSaad (14159019); Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Data management and data science; Machine learning; Self-attention; Electronic health records; Time series prediction; Non-stationary kernel; Temporal model |
| status_str | publishedVersion |
| title | Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation |
| topic | Health sciences; Health services and systems; Information and computing sciences; Artificial intelligence; Data management and data science; Machine learning; Self-attention; Electronic health records; Time series prediction; Non-stationary kernel; Temporal model |
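
The description above reports results as Hit@10 and NDCG@10 for next-diagnosis-code prediction. The record does not give the exact formulas the authors used, so the following is only a minimal sketch of the commonly used binary-relevance definitions. The function names `hit_at_k` and `ndcg_at_k` and the placeholder code strings are assumptions made for illustration, not taken from the article.

```python
import math


def hit_at_k(ranked_codes, true_codes, k=10):
    """Return 1.0 if any true code appears among the top-k ranked codes, else 0.0."""
    return 1.0 if any(code in true_codes for code in ranked_codes[:k]) else 0.0


def ndcg_at_k(ranked_codes, true_codes, k=10):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG.

    Assumption: each true code has relevance 1 and every other code has relevance 0.
    """
    # Discounted gain of each relevant code found in the top k (rank is 0-based).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, code in enumerate(ranked_codes[:k])
        if code in true_codes
    )
    # Ideal ranking places all true codes (up to k of them) at the top.
    ideal_hits = min(len(true_codes), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical placeholder codes, not real diagnosis codes from either dataset.
ranked = ["code_a", "code_b", "code_c", "code_d"]
truth = {"code_b"}

hit = hit_at_k(ranked, truth)
ndcg = ndcg_at_k(ranked, truth)
print(f"Hit@10 = {hit:.1f}, NDCG@10 = {ndcg:.4f}")
```

Hit@10 only checks whether a correct code is in the top ten, while NDCG@10 also rewards placing it nearer the top, which is one plausible reason two models can score similarly on Hit@10 but differ on NDCG@10.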