ICD-10 codes for MACE definition.

Bibliographic Details
Main Author: Gilson Yuuji Shimizu (19837946) (author)
Other Authors: Michael Schrempf (19837949) (author), Elen Almeida Romão (4772397) (author), Stefanie Jauk (19837952) (author), Diether Kramer (19837955) (author), Peter P. Rainer (5961086) (author), José Abrão Cardeal da Costa (19837958) (author), João Mazzoncini de Azevedo-Marques (3737785) (author), Sandro Scarpelini (4320544) (author), Katia Mitiko Firmino Suzuki (19837961) (author), Hilton Vicente César (19837964) (author), Paulo Mazzoncini de Azevedo-Marques (9073344) (author)
Published: 2024
_version_ 1852026000443965440
author Gilson Yuuji Shimizu (19837946)
author2 Michael Schrempf (19837949)
Elen Almeida Romão (4772397)
Stefanie Jauk (19837952)
Diether Kramer (19837955)
Peter P. Rainer (5961086)
José Abrão Cardeal da Costa (19837958)
João Mazzoncini de Azevedo-Marques (3737785)
Sandro Scarpelini (4320544)
Katia Mitiko Firmino Suzuki (19837961)
Hilton Vicente César (19837964)
Paulo Mazzoncini de Azevedo-Marques (9073344)
dc.date.none.fl_str_mv 2024-10-11T17:24:22Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0311719.t001
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/ICD-10_codes_for_MACE_definition_/27212828
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Cell Biology
Cancer
Science Policy
Plant Biology
Biological Sciences not elsewhere classified
Mathematical Sciences not elsewhere classified
Information Systems not elsewhere classified
ribeirão preto medical school
evaluated regarding accuracy
applied towards insights
support vector machine
based risk prediction
machine learning algorithms
shapley values suggest
random forest showed
best predictive performance
best generalization ability
local interpretability analyses
interpretability
machine learning
shapley values
roc curve
random forest
predictive performance
mace cases
local interpretability
interpretability analyses
year risk
good generalization
retrospective cohort
nearest neighbors
naive bayes
model reliability
manuscript addresses
layer perceptron
final model
decision tree
consistent explanations
cardiovascular diseases
brazilian hospital
balanced sample
additional one
dc.title.none.fl_str_mv ICD-10 codes for MACE definition.
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Background: Studies of cardiovascular disease risk prediction with machine learning algorithms often do not assess how well the models generalize to other populations, and few analyze the interpretability of individual predictions. This manuscript addresses the development and the internal and external validation of predictive models for assessing the risk of major adverse cardiovascular events (MACE). Global and local interpretability analyses of the predictions were conducted to improve the reliability of the MACE model and to tailor preventive interventions.

Methods: The models were trained and validated on a retrospective cohort using data from Ribeirão Preto Medical School (RPMS), University of São Paulo, Brazil. Data from Beth Israel Deaconess Medical Center (BIDMC), USA, were used for external validation. A balanced sample of 6,000 MACE cases and 6,000 non-MACE cases from RPMS was created for training and internal validation, and a second balanced sample of 8,000 MACE cases and 8,000 non-MACE cases from BIDMC was used for external validation. Eight machine learning algorithms (Penalized Logistic Regression, Random Forest, XGBoost, Decision Tree, Support Vector Machine, k-Nearest Neighbors, Naive Bayes, and Multi-Layer Perceptron) were trained to predict the 5-year risk of major adverse cardiovascular events. Their predictive performance was evaluated by accuracy, the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC). LIME and Shapley values were applied to gain insight into model interpretability.

Findings: Random Forest showed the best predictive performance in both internal validation (AUC = 0.871 (0.859–0.882); accuracy = 0.794 (0.782–0.808)) and external validation (AUC = 0.786 (0.778–0.792); accuracy = 0.710 (0.704–0.717)). Compared with LIME, Shapley values gave explanations that were more consistent with the exploratory analysis and the feature importances.

Conclusions: Among the machine learning algorithms evaluated, Random Forest showed the best generalization ability, both internally and externally. Shapley values for local interpretability were more informative than those from LIME, which is in line with the exploratory analysis and the global interpretation of the final model. Machine learning algorithms that generalize well and are accompanied by interpretability analyses are recommended for assessing individual risks of cardiovascular diseases and for developing personalized preventive actions.
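The findings above report AUC and accuracy, each followed by an interval in parentheses. The record does not say how those intervals were obtained. As a minimal sketch, assuming they are 95% confidence intervals, the following shows one common way to produce such figures: a percentile bootstrap over the validation cases. The function names (`auc_score`, `accuracy_score`, `bootstrap_ci`), the 0.5 decision threshold, and the toy data are all illustrative assumptions, not taken from the study.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_score(labels, scores, threshold=0.5):
    """Share of cases whose thresholded score matches the label (threshold is an assumption)."""
    hits = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return hits / len(labels)


def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric`, resampling cases with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples with a single class, where AUC is undefined
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, round((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Toy data (hypothetical): 1 = MACE case, 0 = non-MACE case, scores = predicted risk.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = auc_score(labels, scores)
acc = accuracy_score(labels, scores)
auc_lo, auc_hi = bootstrap_ci(auc_score, labels, scores)
print(f"AUC = {auc:.3f} ({auc_lo:.3f}-{auc_hi:.3f}); accuracy = {acc:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for illustration. For samples the size of those described here (12,000 and 16,000 cases), a rank-based AUC or a library implementation would be the practical choice.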
eu_rights_str_mv openAccess
id Manara_bbad6429dfd640997b4dc6caa5bb5c5d
identifier_str_mv 10.1371/journal.pone.0311719.t001
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/27212828
publishDate 2024
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title ICD-10 codes for MACE definition.