
Health Services Research & Development


HSR&D Citation Abstract


Enhancing the identification of rheumatoid arthritis-associated interstitial lung disease through text mining of chest computerized tomography reports.

Luedders BA, Cope BJ, Hershberger D, DeVries M, Campbell WS, Campbell J, Roul P, Yang Y, Rojas J, Cannon GW, Sauer BC, Baker JF, Curtis JR, Mikuls TR, England BR. Enhancing the identification of rheumatoid arthritis-associated interstitial lung disease through text mining of chest computerized tomography reports. Seminars in Arthritis and Rheumatism. 2023 Jun 1; 60:152204.



OBJECTIVES: Algorithms have been developed to identify rheumatoid arthritis-interstitial lung disease (RA-ILD) in administrative data with positive predictive values (PPVs) between 70 and 80%. We hypothesized that including ILD-related terms identified within chest computed tomography (CT) reports through text mining would improve the PPV of these algorithms in this cross-sectional study.

METHODS: We identified a derivation cohort of possible RA-ILD cases (n = 114) using electronic health record data from a large academic medical center and performed medical record review to validate diagnoses (reference standard). ILD-related terms (e.g., ground glass, honeycomb) were identified in chest CT reports by natural language processing. Administrative algorithms including diagnostic and procedural codes as well as specialty were applied to the cohort both with and without the requirement for ILD-related terms from CT reports. We subsequently analyzed similar algorithms in an external validation cohort of 536 participants with RA.

RESULTS: The addition of ILD-related terms to RA-ILD administrative algorithms increased the PPV in both the derivation (improvement ranging from 3.6 to 11.7%) and validation cohorts (improvement 6.0 to 21.1%). This increase was greatest for less stringent algorithms. Administrative algorithms including ILD-related terms from CT reports exceeded a PPV of 90% (maximum 94.6% derivation cohort). Increases in PPV were accompanied by a decline in sensitivity (validation cohort -3.9 to -19.5%).

CONCLUSIONS: The addition of ILD-related terms identified by text mining from chest CT reports led to improvements in the PPV of RA-ILD algorithms. With high PPVs, use of these algorithms in large data sets could facilitate epidemiologic and comparative effectiveness research in RA-ILD.
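
The comparison described in the abstract (an administrative algorithm applied with and without a requirement for ILD-related terms in the chest CT report) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The term pattern uses only the two example terms named in the abstract, the `Patient` fields and function names are hypothetical, and the five records are invented toy data rather than study results. Negation handling (e.g., "no ground glass") is deliberately ignored, which a real text-mining pipeline would need to address.

```python
import re
from dataclasses import dataclass

# Assumption: only "ground glass" and "honeycomb" are named in the abstract;
# the study's full term list is not given there.
ILD_TERMS = re.compile(r"ground[\s-]?glass|honeycomb", re.IGNORECASE)


@dataclass
class Patient:
    meets_admin_algorithm: bool  # flagged by the code-based algorithm (hypothetical field)
    ct_report: str               # free-text chest CT report
    has_ra_ild: bool             # reference standard from medical record review


def has_ild_term(report: str) -> bool:
    """Return True if the report mentions any ILD-related term (no negation handling)."""
    return ILD_TERMS.search(report) is not None


def ppv_and_sensitivity(patients, require_terms):
    """Compute (PPV, sensitivity) of the algorithm, optionally requiring CT terms."""
    tp = fp = fn = 0
    for p in patients:
        flagged = p.meets_admin_algorithm and (
            not require_terms or has_ild_term(p.ct_report)
        )
        if flagged and p.has_ra_ild:
            tp += 1
        elif flagged:
            fp += 1
        elif p.has_ra_ild:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Invented toy cohort, for illustration only.
cohort = [
    Patient(True, "Bilateral ground-glass opacities.", True),
    Patient(True, "Honeycombing at the lung bases.", True),
    Patient(True, "Clear lungs. No acute findings.", False),
    Patient(True, "Subpleural reticulation.", True),
    Patient(False, "Ground glass opacity in the right lower lobe.", True),
]

without_terms = ppv_and_sensitivity(cohort, require_terms=False)
with_terms = ppv_and_sensitivity(cohort, require_terms=True)
print("without CT terms: PPV=%.2f sensitivity=%.2f" % without_terms)
print("with CT terms:    PPV=%.2f sensitivity=%.2f" % with_terms)
```

On this toy cohort, requiring a CT term removes the one false positive but also drops a true case whose report uses a term outside the pattern, so PPV rises while sensitivity falls. That is the same direction of trade-off the abstract reports, though the magnitudes here carry no meaning.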

