
VA Health Systems Research


HSR&D Citation Abstract


Predicting Incident Adenocarcinoma of the Esophagus or Gastric Cardia Using Machine Learning of Electronic Health Records.

Rubenstein JH, Fontaine S, MacDonald PW, Burns JA, Evans RR, Arasim ME, Chang JW, Firsht EM, Hawley ST, Saini SD, Wallner LP, Zhu J, Waljee AK. Predicting Incident Adenocarcinoma of the Esophagus or Gastric Cardia Using Machine Learning of Electronic Health Records. Gastroenterology. 2023 Dec 1; 165(6):1420-1429.e10.




Abstract:

BACKGROUND and AIMS: Tools are needed that can automatically predict incident esophageal adenocarcinoma (EAC) and gastric cardia adenocarcinoma (GCA) from electronic health records to guide screening decisions.

METHODS: The Veterans Health Administration (VHA) Corporate Data Warehouse was accessed to identify Veterans with 1 or more encounters between 2005 and 2018. Patients diagnosed with EAC (n = 8430) or GCA (n = 2965) were identified in the VHA Central Cancer Registry and compared with 10,256,887 controls. Predictors included demographic characteristics, prescriptions, laboratory results, and diagnoses recorded between 1 and 5 years before the index date. The Kettles Esophageal and Cardia Adenocarcinoma predictioN (K-ECAN) tool was developed and internally validated using simple random sampling imputation and extreme gradient boosting, a machine learning method. Training was performed on 50% of the data, preliminary validation on 25%, and final testing on the remaining 25%.

RESULTS: K-ECAN was well calibrated and had better discrimination (area under the receiver operating characteristic curve [AuROC], 0.77) than previously validated models, such as the Nord-Trøndelag Health Study model (AuROC, 0.68) and the Kunzmann model (AuROC, 0.64), or published guidelines. Using only data from between 3 and 5 years before the index date diminished its accuracy slightly (AuROC, 0.75). When men were undersampled to simulate a non-VHA population, the AuROCs of the Nord-Trøndelag Health Study and Kunzmann models improved, but K-ECAN was still the most accurate (AuROC, 0.85). Although gastroesophageal reflux disease was strongly associated with EAC, it contributed only a small proportion of the gain in information for prediction.

CONCLUSIONS: K-ECAN is a novel, internally validated tool for predicting incident EAC and GCA using electronic health records data. Further work is needed to validate K-ECAN outside VHA and to assess how best to implement it within electronic health records.
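The 50%/25%/25% partition and the AuROC metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' code and does not reproduce K-ECAN: it omits the imputation and gradient boosting steps entirely, and the function names `split_50_25_25` and `auroc` are hypothetical, chosen here for illustration. It assumes a simple random partition and binary labels (1 for a case, 0 for a control).

```python
import random


def split_50_25_25(records, seed=0):
    """Randomly partition records into training (50%), validation (25%),
    and test (remaining ~25%) sets. Hypothetical helper for illustration."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = len(shuffled) // 2
    n_valid = len(shuffled) // 4
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    the probability that a randomly chosen case receives a higher score
    than a randomly chosen control, with ties counted as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AuROC needs at least one case and one control")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores occupying 1-based ranks i+1 .. j.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average rank shared by the tied run
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

In this sketch, a model would be fitted on the training partition, tuned against the validation partition, and scored once on the held-out test partition with `auroc(test_labels, test_scores)`. A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.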




