HSR&D Citation Abstract

Accuracies of Training Labels and Machine Learning Models: Experiments on Delirium and Simulated Data.

Cheng Y, Shao Y, Rudolph J, Weir CR, Sahlmann B, Zeng-Treitler Q. Accuracies of Training Labels and Machine Learning Models: Experiments on Delirium and Simulated Data. Studies in health technology and informatics. 2022 Jun 6; 290:665-669.




Abstract:

Supervised predictive models require labeled data for training. Complete and accurate labeled data is not always available, so imperfectly labeled data may need to serve as an alternative. An important question is whether the accuracy of the labels sets a performance ceiling for the trained model. In this study, we trained several models to recognize the presence of delirium in clinical documents using data whose annotations are not completely accurate. In the external evaluation, the support vector machine model with a linear kernel performed best, achieving an area under the curve of 89.3% and an accuracy of 88%, surpassing the 80% accuracy of the training sample. We then generated a set of simulated data and carried out a series of experiments, which demonstrated that models trained on imperfect data can, but do not always, exceed the accuracy of their training data.
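
The simulated-data idea can be illustrated with a small sketch. It is not the authors' experiment: the one-feature Gaussian data, the 20% random label-flip rate, the nearest-centroid threshold classifier (used here in place of a support vector machine), and all function names are assumptions chosen to keep the example self-contained. It shows only the "can exceed" case, where annotation errors are random and unrelated to the feature.

```python
import random

def simulate(n, noise_rate, rng):
    """Draw n one-feature examples; return features, true labels, noisy labels."""
    xs, truth, noisy = [], [], []
    for _ in range(n):
        y = rng.randint(0, 1)                    # true class, roughly 50/50
        x = rng.gauss(1.5 if y else -1.5, 1.0)   # two overlapping clusters (assumed)
        flip = rng.random() < noise_rate         # random annotation error
        xs.append(x)
        truth.append(y)
        noisy.append(1 - y if flip else y)
    return xs, truth, noisy

def fit_threshold(xs, labels):
    """Nearest-centroid rule in 1D: cut halfway between the two class means."""
    pos = [x for x, y in zip(xs, labels) if y == 1]
    neg = [x for x, y in zip(xs, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(pred, ref):
    """Fraction of positions where pred and ref agree."""
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)

rng = random.Random(0)
train_x, train_true, train_noisy = simulate(5000, 0.20, rng)
test_x, test_true, _ = simulate(5000, 0.20, rng)

# The classifier sees only the noisy labels during training.
cut = fit_threshold(train_x, train_noisy)
# Assumes the class labeled 1 has the larger mean, which holds at this noise rate.
pred = [1 if x > cut else 0 for x in test_x]

label_acc = accuracy(train_noisy, train_true)    # quality of the training labels
model_acc = accuracy(pred, test_true)            # scored against the true labels
print(f"training-label accuracy: {label_acc:.3f}")
print(f"model accuracy on true test labels: {model_acc:.3f}")
```

Because the flips are random, they pull both class means toward each other by about the same amount and leave the midpoint near zero, so the learned cut still separates the true classes well even though about one training label in five is wrong. Noise that depends on the features is not covered by this sketch.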




