By providing a new level of clarity into how deep learning systems reach their medical diagnoses, scientists hope this AI approach will fuel a new era of preventative medicine.
LA JOLLA, CA—A team led by scientists at Scripps Research has demonstrated a method for “opening the black box” of a powerful artificial intelligence technique called deep learning, which holds vast potential for identifying disease earlier than is currently possible. The new method could help unleash deep learning for medical applications such as predicting heart attacks and diagnosing cancers.
“The framework we’ve created enables an in-depth exploration of the deep learning decision-making process, making it more transparent and useful for clinicians,” says the study’s senior author Giorgio Quer, PhD, senior staff scientist at Scripps Research and director of artificial intelligence at the Scripps Research Translational Institute.
Deep learning networks are software-based neural networks, loosely modeled on the brain, that can process large amounts of data to recognize patterns, make predictions or otherwise reach decisions. They are considered “black boxes” because the ways in which they turn their inputs into outputs are often unknown; they don’t reveal interpretable relationships. That makes deep learning hard to use in real-world settings where, for example, a doctor might need to know why a deep learning system reached an important conclusion such as a disease diagnosis.
In the study, published in the IEEE Journal of Biomedical and Health Informatics, the Scripps Research scientists developed a framework for explaining how deep learning works, and demonstrated this through a deep learning system’s identification of a common heart condition called atrial fibrillation.
A case study in heart disease
Atrial fibrillation, often called AFib, is a condition in which the heart sometimes slips into a rapid and dysfunctional rhythm, causing lightheadedness and other symptoms. Four out of every 10 adults over age 55 will develop AFib in their lifetime. Most critically, AFib promotes the formation of blood clots in the heart, which can break off and cause strokes; people with this condition have five times the normal risk of stroke.
Yet for many patients AFib flares up only occasionally, making it hard to detect in routine medical checkups that capture only a brief snapshot of a patient’s heartbeat. In recent years, Quer and colleagues have shown that wearable medical devices can gather heart-rhythm data over extended periods, and that deep learning can then process the resulting large datasets to detect AFib, as reported in a 2019 study.
This proposed method would offer improved diagnostic accuracy over traditional models and could save lives by spotting disease earlier. But the opacity of deep learning has remained a hurdle to clinical adoption of this approach.
Evoking transparency
In the new study, Quer and colleagues created a way to analyze how deep learning makes its decisions in a medical diagnostic context. Then, they applied this framework to the deep learning system they developed for AFib. Their method included subtracting or otherwise varying the input data to see how those changes altered the deep learning network’s decision. The result was a comprehensive picture of how deep learning works in this instance, showing how its recognition of AFib depends on key clinical variables.
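The idea of removing or varying parts of the input and watching how the decision changes can be illustrated with a minimal occlusion sketch. This is not the study's framework or code: the function names, the window size, the mean-value fill and the toy classifier below are all assumptions made for illustration, and a real analysis would use a trained network and recorded ECG data.

```python
import numpy as np

def occlusion_importance(predict, signal, window=50, baseline=None):
    """Score drop when each window of `signal` is replaced by a baseline value.

    `predict` is any callable mapping a 1-D array to a score in [0, 1].
    """
    signal = np.asarray(signal, dtype=float)
    # Assumption: fill occluded samples with the signal mean unless told otherwise.
    fill = signal.mean() if baseline is None else baseline
    reference = predict(signal)
    importance = np.zeros(len(signal))
    for start in range(0, len(signal), window):
        perturbed = signal.copy()
        perturbed[start:start + window] = fill
        # A large drop means the model relied on this window.
        importance[start:start + window] = reference - predict(perturbed)
    return importance

def toy_predict(signal):
    """Hypothetical stand-in for a trained classifier (not a real AFib model)."""
    # Logistic function of the peak absolute amplitude.
    return 1.0 / (1.0 + np.exp(-(np.max(np.abs(signal)) - 1.0)))

# Synthetic data: low-level noise with one high-amplitude burst at samples 200-249.
rng = np.random.default_rng(0)
signal = 0.1 * rng.standard_normal(500)
signal[200:250] += 3.0

importance = occlusion_importance(toy_predict, signal, window=50)
top_window = int(np.argmax(importance)) // 50  # index of the most influential window
```

In this toy setup only the window containing the burst changes the score when it is occluded, so it stands out in `importance`. With a real model, the same kind of map could be compared against the features a cardiologist would inspect.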
“Interestingly, we found that the ECG features that most heavily contributed to the final deep learning decision are features that cardiologists also routinely look for in diagnosing AFib,” says study co-author Steven Steinhubl, MD, a cardiologist and director of Digital Medicine at the Scripps Research Translational Institute.
Quer, Steinhubl and their colleagues hope that the availability of this explanatory framework will encourage cardiologists and hospitals to start using deep learning and wearable technology for better AFib screening and monitoring.
The approach could eventually be applied to other medical conditions that can be identified through data collected with wearable devices. Currently, the Scripps Research team is working on a machine learning method to detect COVID-19 from wearable sensor data, aiming to improve upon the accuracy they reported in their recent paper in Nature Medicine. Here, it will be necessary to explain the machine learning results and highlight the most important features used to identify people with COVID-19 infection.
“With this new explanatory approach, we hope to better understand the deep learning network and make it more useful to clinicians,” Quer says.
The study was a collaboration between Scripps Research and the University of California, San Diego. “A Comprehensive Explanation Framework for Biomedical Time Series Classification” was co-authored by Praharsh Ivaturi, Matteo Gadaleta, Amitabh Pandey, Michael Pazzani, Steven Steinhubl and Giorgio Quer.
Funding was provided by the Defense Advanced Research Projects Agency, the National Science Foundation and the National Center for Advancing Translational Sciences.