Table1_Natural language processing for automated triage and prioritization of individual case safety reports for case-by-case assessment.docx

Bibliographic Details
Main Author: Thomas Lieber (14518946) (author)
Other Authors: Helen R. Gosselt (9238049) (author), Pelle C. Kools (14577803) (author), Okko C. Kruijssen (14577806) (author), Stijn N. C. Van Lierop (14577809) (author), Linda Härmark (10939774) (author), Florence P. A. M. Van Hunsel (14577812) (author)
Published: 2024
Description
Summary:<p>Objective: To improve a previously developed prediction model for triaging individual case safety reports (ICSRs) by adding features derived from free-text fields using natural language processing (NLP).</p>
<p>Methods: Structured features and NLP features were used to train a bagging classifier model. NLP features were extracted from free-text fields using a bag-of-words model. Stop words were removed, and words whose distribution differed significantly between case and non-case reports were used in the training data. In addition to the NLP features from free-text fields, the data included a list of signal words deemed important by expert report assessors. Lastly, variables with multiple categories were transformed into numerical variables using the weight of evidence method.</p>
<p>Results: The model, a bagging classifier of decision trees, had an AUC of 0.921 (95% CI = 0.918–0.925). Generic drug name, info text length, ATC code, BMI, and patient age were the most important features in classification.</p>
<p>Conclusion: This predictive model using natural language processing could assist assessors in prioritizing which future ICSRs to assess first, based on the probability that a report is a case requiring clinical review.</p>
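
The weight of evidence transformation mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one common convention, WoE = ln(P(category | case) / P(category | non-case)), and adds a smoothing constant so that categories seen in only one class do not produce a division by zero. The function name, the smoothing value, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def weight_of_evidence(categories, labels, smoothing=0.5):
    """Map each category to ln(P(category | case) / P(category | non-case)).

    Assumptions (not from the source): label 1 marks a case report, label 0 a
    non-case report, and `smoothing` is added to every count to avoid log(0).
    """
    case_counts = Counter()
    non_case_counts = Counter()
    for category, label in zip(categories, labels):
        if label == 1:
            case_counts[category] += 1
        else:
            non_case_counts[category] += 1

    levels = set(case_counts) | set(non_case_counts)
    # Smoothed class totals, so the smoothed proportions still sum to 1.
    total_cases = sum(case_counts.values()) + smoothing * len(levels)
    total_non_cases = sum(non_case_counts.values()) + smoothing * len(levels)

    woe = {}
    for level in levels:
        p_case = (case_counts[level] + smoothing) / total_cases
        p_non_case = (non_case_counts[level] + smoothing) / total_non_cases
        woe[level] = math.log(p_case / p_non_case)
    return woe


# Hypothetical toy data: a categorical feature and a binary case label.
drug_names = ["drug_a", "drug_a", "drug_b", "drug_b", "drug_c", "drug_c"]
is_case = [1, 1, 1, 0, 0, 0]

woe_map = weight_of_evidence(drug_names, is_case)
# Replace each category with its numeric WoE value.
encoded = [woe_map[name] for name in drug_names]
```

Under this convention, a positive value marks a category that is relatively more frequent among case reports and a negative value marks one that is more frequent among non-case reports. To avoid leaking label information, the mapping would normally be fitted on training data only and then applied to new reports.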