
VA Health Systems Research


PPO 20-124 – HSR Study

Consistency of Uses of ICD Codes for Retrospective Data Analysis
Qing Zeng, PhD
Washington DC VA Medical Center, Washington, DC
Funding Period: February 2021 - October 2022


Background: The Clinical Modification (CM) of the 9th Revision of the International Classification of Diseases (ICD-9CM) was the standard code set for clinical, operational, and research activities using health record data in the U.S. for decades. On October 1, 2015, the Centers for Medicare & Medicaid Services (CMS) replaced the ICD-9CM codes with ICD-10CM codes, which differ fundamentally from ICD-9CM in structure and concepts. In many cases, there is no exact match between codes in the two sets. To smooth the transition, the Centers for Disease Control and Prevention (CDC) and CMS created General Equivalence Mappings (GEMs), or “crosswalks,” that translate one code set to the other. However, the GEMs do not translate one code to another simply, automatically, and with complete reliability.
Significance/Impact: Health services research relies on accurate and reliable use of ICD codes. Retrospective analyses using existing EHR data assume that ICD codes are a relatively consistent representation of the clinical data. The lack of automated and reliable translation between ICD-9CM and ICD-10CM has been shown to result in incorrect estimates of disease prevalence, which may lead to serious errors in cohort identification, statistical analyses, or machine learning models.
Innovation: Existing crosswalk tools such as the GEMs were developed based solely on the terms and hierarchies of ICD-9CM and ICD-10CM. We propose to study the actual longitudinal and contextual usage of ICD-9CM and ICD-10CM codes in EHR data. A large EHR repository such as the VA clinical data warehouse (CDW) offers a long time series (~20 years in CDW) and extremely rich clinical context (e.g., demographics, laboratory results, medications, and text notes) in which to examine the consistency of ICD usage.
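The idea of comparing codes by their usage context, rather than by their terms and hierarchy alone, can be illustrated with a small sketch. It is not the project's method: it uses plain co-occurrence counts and cosine similarity as a simple stand-in for a context-based representation, and every code and item name in it (`ICD9:A`, `ICD10:B`, `LAB:x`, and so on) is a made-up placeholder, not real CDW data.

```python
import math
from collections import Counter


def context_vector(encounters, code):
    """Count the other items recorded in encounters that contain `code`."""
    counts = Counter()
    for items in encounters:
        if code in items:
            counts.update(item for item in items if item != code)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy encounters: each set holds one diagnosis code plus its clinical context.
# All identifiers are hypothetical placeholders.
encounters = [
    {"ICD9:A", "LAB:x", "MED:p"},
    {"ICD9:A", "LAB:x"},
    {"ICD10:B", "LAB:x", "MED:p"},
    {"ICD10:C", "LAB:y", "MED:q"},
]

old_code = context_vector(encounters, "ICD9:A")
sim_b = cosine(old_code, context_vector(encounters, "ICD10:B"))
sim_c = cosine(old_code, context_vector(encounters, "ICD10:C"))
print(f"ICD9:A vs ICD10:B: {sim_b:.3f}")
print(f"ICD9:A vs ICD10:C: {sim_c:.3f}")
```

In this toy data, `ICD10:B` appears alongside the same lab and medication items as `ICD9:A` and scores high, while `ICD10:C` shares no context and scores zero. A real analysis would need far richer context and a learned representation.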
Specific Aims: 1) To assess the consistency of ICD-9CM and ICD-10CM usage in VA EHR data by detecting aberrant signals using time-series analysis methods; and 2) To improve the consistency of ICD-9CM and ICD-10CM usage in VA EHR data by using embedding methods to compare usage contexts.
Methodology: The Aim 1 analysis will use signal detection methods that have been validated in bio-surveillance. Aim 2 will use embedding methods to map each ICD-9CM and ICD-10CM code to a latent semantic space based on its usage context. Terminology and domain experts will review a stratified sample of the results.
Implementation/Next Steps: Findings of this pilot project will be shared with our operational partners in the VA central office. We envision further investigations that build on this pilot to develop a user-friendly ICD translation tool and more accurate ICD mappings for VA and other EHR datasets over time and across facilities, and to extend the effort beyond ICD to other terminologies.
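As a rough illustration of the Aim 1 idea of detecting aberrant signals in a code-usage time series, the sketch below flags time points whose count deviates sharply from a trailing baseline. The abstract does not say which signal detection methods were used, so the moving-baseline z-score rule here is an assumption chosen for simplicity; the function name `flag_aberrations`, its parameters, and the monthly counts are all hypothetical.

```python
from statistics import mean, stdev


def flag_aberrations(counts, baseline=6, threshold=3.0, min_sd=1.0):
    """Return indices whose count lies more than `threshold` standard
    deviations from the mean of the preceding `baseline` observations.

    `min_sd` is an assumed floor that keeps a flat baseline from making
    every small change look extreme.
    """
    flagged = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        mu = mean(window)
        sd = max(stdev(window), min_sd)
        z = (counts[t] - mu) / sd
        if abs(z) > threshold:
            flagged.append(t)
    return flagged


# Hypothetical monthly usage counts for one concept; the drop at index 7
# mimics a discontinuity such as a coding-system transition.
monthly_counts = [100, 102, 98, 101, 99, 100, 103, 40, 42, 41]
flagged = flag_aberrations(monthly_counts)
print(flagged)  # → [7]
```

Only the first month after the drop is flagged, because the trailing baseline then absorbs the new level. A rule like this marks the onset of a shift rather than its full duration.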

External Links for this Project

NIH Reporter

Grant Number: I21HX003278-01A1




DRA: Other Conditions, Health Systems
DRE: TRL - Applied/Translational, Diagnosis, Research Infrastructure
Keywords: Data Management, Electronic Health Record, Healthcare Algorithms
MeSH Terms: None at this time.
