VA Health Systems Research

HSR&D Citation Abstract

Generalization of Machine Learning Approaches to Identify Notifiable Conditions from a Statewide Health Information Exchange.

Dexter GP, Grannis SJ, Dixon BE, Kasthurirathne SN. Generalization of Machine Learning Approaches to Identify Notifiable Conditions from a Statewide Health Information Exchange. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science. 2020 May 30; 2020:152-161.

Abstract:

Healthcare analytics is impeded by a lack of machine learning (ML) model generalizability: the ability of a model to predict accurately on varied data sources not included in the model's training dataset. We leveraged free-text laboratory data from a Health Information Exchange network to evaluate ML generalization, using Notifiable Condition Detection (NCD) for public health surveillance as a use case. We 1) built ML models for detecting syphilis, salmonella, and histoplasmosis; 2) evaluated the generalizability of these models across data from holdout lab systems; and 3) explored factors that contribute to weak model generalizability. Models for predicting each disease achieved considerable accuracy. However, they generalized poorly to data from the holdout lab systems. Our evaluation determined that weak generalization was influenced by the varying syntactic nature of the free-text datasets across lab systems. The results highlight the need for actionable methodology to generalize ML solutions for healthcare analytics.
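
The holdout evaluation described in the abstract can be illustrated with a minimal sketch of leave-one-lab-out testing: train on reports from all lab systems but one, then score on the excluded system. This is not the authors' implementation. The abstract does not name the model, so a small Laplace-smoothed Naive Bayes classifier is swapped in here, and the lab names, report texts, labels, and function names are all hypothetical and invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy records (lab_system, free_text_report, label); NOT study data.
# Each lab phrases the same result differently, mimicking syntactic variation.
RECORDS = [
    ("lab_a", "salmonella culture positive", 1),
    ("lab_a", "salmonella culture negative", 0),
    ("lab_b", "SALMONELLA SP ISOLATED", 1),
    ("lab_b", "NO SALMONELLA ISOLATED", 0),
    ("lab_c", "stool cx: pos for salmonella", 1),
    ("lab_c", "stool cx: neg for salmonella", 0),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, chosen for simplicity)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Fit per-class token counts from a list of (text, label) pairs."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the higher Laplace-smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        score = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_lab_out(records):
    """Hold out each lab system in turn; report accuracy on the held-out lab."""
    labs = sorted({lab for lab, _, _ in records})
    results = {}
    for held_out in labs:
        train = [(t, y) for lab, t, y in records if lab != held_out]
        test = [(t, y) for lab, t, y in records if lab == held_out]
        model = train_naive_bayes(train)
        correct = sum(predict(model, t) == y for t, y in test)
        results[held_out] = correct / len(test)
    return results


if __name__ == "__main__":
    for lab, acc in leave_one_lab_out(RECORDS).items():
        print(f"held-out {lab}: accuracy {acc:.2f}")
```

The key design point is that the split is made by lab system rather than by random row, so no report from the held-out system leaks into training. Tokens such as "pos" or "isolated" that appear in only one lab's phrasing are unseen when that lab is held out, which is one plausible way syntactic variation could degrade performance.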
