Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation

Effective modeling of patient representation from electronic health records (EHRs) is increasingly becoming a vital research topic. Yet, modeling the non-stationarity in EHR data has received less attention. Most existing studies follow a strong assumption of stationarit...

Bibliographic Details
Main Author: Rawan AlSaad (14159019) (author)
Other Authors: Qutaibah Malluhi (3158757) (author), Alaa Abd-alrazaq (17058018) (author), Sabri Boughorbel (846228) (author)
Published: 2024
Subjects:
_version_ 1864513509859000320
author Rawan AlSaad (14159019)
author2 Qutaibah Malluhi (3158757)
Alaa Abd-alrazaq (17058018)
Sabri Boughorbel (846228)
author2_role author
author
author
author_facet Rawan AlSaad (14159019)
Qutaibah Malluhi (3158757)
Alaa Abd-alrazaq (17058018)
Sabri Boughorbel (846228)
author_role author
dc.creator.none.fl_str_mv Rawan AlSaad (14159019)
Qutaibah Malluhi (3158757)
Alaa Abd-alrazaq (17058018)
Sabri Boughorbel (846228)
dc.date.none.fl_str_mv 2024-02-13T09:00:00Z
dc.identifier.none.fl_str_mv 10.1016/j.artmed.2024.102802
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Temporal_self-attention_for_risk_prediction_from_electronic_health_records_using_non-stationary_kernel_approximation/26395630
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
Self-attention
Electronic health records
Time series prediction
Non-stationary kernel
Temporal model
dc.title.none.fl_str_mv Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Effective modeling of patient representation from electronic health records (EHRs) is increasingly becoming a vital research topic. Yet, modeling the non-stationarity in EHR data has received less attention. Most existing studies follow a strong assumption of stationarity in patient representation from EHRs. However, in practice, a patient’s visits are irregularly spaced over a relatively long period of time, and disease progression patterns exhibit non-stationarity. Furthermore, the time gaps between patient visits often encapsulate significant domain knowledge, potentially revealing undiscovered patterns that characterize specific medical conditions. To address these challenges, we introduce a new method which combines the self-attention mechanism with non-stationary kernel approximation to capture both contextual information and temporal relationships between patient visits in EHRs. To assess the effectiveness of our proposed approach, we use two real-world EHR datasets, comprising a total of 76,925 patients, for the task of predicting the next diagnosis code for a patient, given their EHR history. The first dataset is a general EHR cohort and consists of 11,451 patients with a total of 3,485 unique diagnosis codes. The second dataset is a disease-specific cohort that includes 65,474 pregnant patients and encompasses a total of 9,782 unique diagnosis codes. Our experimental evaluation involved nine prediction models, categorized into three distinct groups. Group 1 comprises the baselines: original self-attention with positional encoding model, RETAIN model, and LSTM model. Group 2 includes models employing self-attention with stationary kernel approximations, specifically incorporating three variations of Bochner’s feature maps. Lastly, Group 3 consists of models utilizing self-attention with non-stationary kernel approximations, including quadratic, cubic, and bi-quadratic polynomials. 
The experimental results demonstrate that non-stationary kernels significantly outperformed baseline methods on the NDCG@10 and Hit@10 metrics in both datasets. The performance boost was more substantial in dataset 1 for the NDCG@10 metric. On the other hand, stationary kernels showed significant but smaller gains over baselines and were nearly as effective as non-stationary kernels for Hit@10 in dataset 2. These findings robustly validate the efficacy of employing non-stationary kernels for temporal modeling of EHR data, and emphasize the importance of modeling non-stationary temporal information in healthcare prediction tasks.</p><h2>Other Information</h2><p dir="ltr">Published in: Artificial Intelligence in Medicine<br>License: <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank">http://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1016/j.artmed.2024.102802" target="_blank">https://dx.doi.org/10.1016/j.artmed.2024.102802</a></p>
eu_rights_str_mv openAccess
id Manara2_afe87f2b4587c834b8e82a4e3ef906b6
identifier_str_mv 10.1016/j.artmed.2024.102802
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/26395630
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
Rawan AlSaad (14159019)
Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
Self-attention
Electronic health records
Time series prediction
Non-stationary kernel
Temporal model
status_str publishedVersion
title Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
title_full Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
title_fullStr Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
title_full_unstemmed Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
title_short Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
title_sort Temporal self-attention for risk prediction from electronic health records using non-stationary kernel approximation
topic Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
Self-attention
Electronic health records
Time series prediction
Non-stationary kernel
Temporal model