Health Services Research & Development

FORUM - Translating research into quality health care for Veterans

Research Highlight

It’s rare to attend a conference on quality, safety, or informatics without feeling the excitement of “big data,” a loose term referring to large volumes of interconnected (and often unverified) data that may be updated and processed rapidly. With wide-scale implementation of electronic health records (EHRs), interconnectivity between different systems of care, and new patient-based sensor technology, the prospect of using big data to discover new relationships is real. For quality and safety researchers, using big data for surveillance of missed opportunities is a dream come true.1 In this article, we discuss some of the challenges that need to be addressed in order to leverage big data to improve quality and safety at the point of care.

The promise of big data is the ability to identify significant events and nascent risks by combining data (e.g., diagnoses, test results, and treatments) gathered through multiple sources and methods, often across disparate organizations. This massive merging of data sources and types may reveal a longitudinal picture of a patient, illuminating important trends or gaps in care. However, sharing data across organizations and ensuring that the information is accurate remain challenges. In particular, matching data from the same patient across organizations is difficult and error-prone.

Further, data are often collected haphazardly, and methods to encode and/or map similar clinical concepts from one standard vocabulary to another have shortcomings. Even when researchers are able to bring the data together, it is often difficult to understand what really happened to the patient. By developing a Corporate Data Warehouse, VA is systematically addressing these issues, refining common data definitions, and adding essential data elements to develop a more comprehensive picture of our patients.2

Data are being generated and stored at an unprecedented rate as a by-product of myriad digital transactions, such as order entry, admission, discharge, transfer, and procedure recording. Additionally, patients themselves are generating data, sometimes in conjunction with new omnipresent monitoring technologies. Paired with this outpouring of data is an enthusiasm for “discovering” new relationships and using these to inform forecasts.

While expanding access to data has real promise, this tsunami of information will also create unintended consequences. At the Center for Innovation in Quality, Effectiveness and Safety, our work has shown that almost a third of providers currently miss abnormal test results in their EHRs due to information overload.3 These new information sources, if not carefully managed, are almost certain to add to clinicians’ information processing burden. Thus, those who develop and deploy these big data-based discoveries and solutions should make sure that the information delivery fits within the workflow of the recipient and is delivered in a non-intrusive fashion. Much of this information should be distributed to members of the health care team other than frontline clinicians, such as care managers, quality and safety personnel, or even new types of personnel dedicated to handling this information.

Another challenge will be to ensure that the use of big data actually improves quality and safety, the patient and clinician experience, and efficiency. Since the aims of big data analytics go well beyond the original purposes of the data, distinguishing signal from noise is essential. Dedicated analytics teams should include highly trained mathematicians, computer scientists, and informaticians supported by both frontline clinicians and quality and safety administrators who can help ensure that the information gleaned from the data is correct, actionable, and able to be delivered to the right person. Teams must take care to avoid bias, confounding factors, and spurious associations in their attempts to identify meaningful relationships and assign causation. Retrospective, observational study designs have significant inherent limitations, especially for determining causation or even identifying preventable events. Therefore, predictive models based on previously collected data should be tested prospectively whenever possible.

The final challenge is operationalizing a regulatory, financing, and policy framework to optimize the use of big data for quality and safety improvement. Managing the trade-off between individual privacy rights and the potential benefits from this research to society as a whole continues to be a challenge.

While use of big data in health care has potential to improve the quality, safety, and efficiency of patient care, much work remains to unlock this potential. This work must be supported with dedicated funding, new types of personnel and information governance structures, and a robust, high-capacity information technology infrastructure.

  1. Murphy, D.R. et al. “Electronic Health Record-based Triggers to Detect Potential Delays in Cancer Diagnosis,” BMJ Quality & Safety 2014; 23(1):8-16.
  2. Fihn, S.D. et al. “Insights from Advanced Analytics at the Veterans Health Administration,” Health Affairs (Millwood) 2014; 33:1203-11.
  3. Singh, H. et al. “Information Overload and Missed Test Results in Electronic Health Record-based Settings,” JAMA Internal Medicine 2013; 173:702-4.