Debarun Dhar | MS Student, Cornell Tech
With the recent successes and democratization of modern machine learning and data analysis techniques, a natural question is how we might use these tools to improve decision making and outcomes. On March 14th, Angela Zhou, a Cornell Tech PhD student and DLI Doctoral Fellow, gave a talk at the Digital Life Seminar in which she provided an overview of data-driven decision making and many of the common pitfalls of applying “black-box” models to real-world observational data. This blog post briefly summarizes some of the main themes of the talk, with examples drawn from the challenging domain of healthcare.
While operations research methods and machine learning have been in use for quite some time, the abundance of rich data now available opens up new possibilities for what can be achieved with this information. This has garnered the interest of various institutional actors, whether in government or in hospitals, in using their data to inform and improve their processes, decisions and outcomes. However, Zhou argues that within the “ecology” of data-driven decision making there are a number of challenges that need to be addressed when working with observational data.
The first core set of problems arises from the nature of observational data itself. In medical research, randomized controlled trials (RCTs) are the gold standard for understanding the general effects of a particular drug or treatment on a given patient population. By design, these trials focus on outcomes for large groups, without accounting for each individual patient’s specific context. On the other hand, there has been a recent push for personalized medicine, which grows out of the large amounts of observational data available in electronic health records. A fundamental problem is that, unlike in RCTs, the data in observational studies is extremely messy and there is no control over treatment assignment, which can lead to highly error-prone predictions and spurious correlations. Confounding, defined as the presence of alternative explanations for a given statistical observation, also plays a major role. Zhou illustrated this with a set of parallel clinical trials and observational studies conducted by the Women’s Health Initiative (WHI) to understand the effects of hormone replacement therapy on coronary heart disease prevention. The clinical trials had to be cut short because of an increase in the rate of heart attacks, cancer and stroke, whereas the observational studies suggested that the therapy had a protective effect. This disparity demonstrates the difficulty of extracting insights about causal effects from observational data.
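To make the gap between trial and observational estimates concrete, here is a minimal toy simulation. It is not based on WHI data: the hidden "healthy" trait, the risk levels and the treatment probabilities are all invented for illustration. The treatment is constructed to raise risk by 5 percentage points, but in the observational setting healthier patients are more likely to take it.

```python
import random

def risk_difference(n, randomized, seed=0):
    """Return (event rate among treated) - (event rate among untreated)."""
    rng = random.Random(seed)
    events = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        # Hypothetical hidden confounder: healthier patients have lower baseline risk.
        healthy = rng.random() < 0.5
        if randomized:
            # Trial: a coin flip decides treatment, independent of health.
            treated = rng.random() < 0.5
        else:
            # Observational: healthier patients are more likely to be treated.
            treated = rng.random() < (0.8 if healthy else 0.2)
        # By construction, treatment adds 0.05 to the risk of an adverse event.
        risk = (0.10 if healthy else 0.30) + (0.05 if treated else 0.0)
        events[treated] += rng.random() < risk
        counts[treated] += 1
    return events[True] / counts[True] - events[False] / counts[False]

observational_diff = risk_difference(100_000, randomized=False)
trial_diff = risk_difference(100_000, randomized=True)
print(f"observational: {observational_diff:+.3f}")
print(f"randomized:    {trial_diff:+.3f}")
```

Under these assumed numbers, the naive observational comparison comes out negative, so the treatment looks protective, while the randomized comparison recovers a harmful effect close to the 0.05 built into the simulation.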
Another issue associated with the use of observational data stems from the biases
arising from the specific context in which the data was collected. A case study illustrating
this is described in the paper “Intelligible Models for HealthCare: Predicting Pneumonia Risk
and Hospital 30-day Readmission” by Caruana et al., where the researchers aimed to predict
the probability of death for patients with pneumonia, with the goal of improving the
triage of patients. When analysing the well-performing models, they found the surprising
result that, according to the models, patients who had both pneumonia and asthma had a lower
risk of death. This is misleading as a statement about medical risk: it was purely an
artifact of the fact that these patients were admitted directly into the ICU and received
aggressive care. This information was not accounted for in the dataset and so led to a form
of “label leakage” which generated this anomalous result.
In addition to discussing the problems with observational data, Zhou stresses that a
number of issues can arise during the modeling process itself. Many parametric models are
built on strong assumptions which may not be valid in settings with
observational data about people, such as cases where we want to predict a medical or
social outcome. Therefore, when these models are used as “black boxes” without any
adjustments, the underlying assumptions are frequently and systematically violated.
Another strong assumption made by many methods which learn to personalize from
observational data is that of “unconfoundedness”, that is, that the effect of unmeasured
confounders is not significant. For example, in the case of electronic health records, one
might assume that as the data gets richer it will become possible to observe every factor
which affects a given treatment. However, in practice this is untrue, as there will always be
some residual confounding due to human decision making.
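The role of the unconfoundedness assumption can be sketched with a small simulation. This is an illustrative example, not a method presented in the talk: the `healthy` variable, the risk levels and the treatment probabilities are invented. When the confounder is recorded, comparing treated and untreated patients within each stratum and averaging the results (simple stratification) recovers the built-in effect. When it is not recorded, only the naive comparison is available.

```python
import random

def generate(n, seed=1):
    """Simulate (healthy, treated, event) rows with confounded treatment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy = rng.random() < 0.5                        # hypothetical confounder
        treated = rng.random() < (0.8 if healthy else 0.2)  # depends on health
        # By construction, treatment adds 0.05 to the event risk.
        risk = (0.10 if healthy else 0.30) + (0.05 if treated else 0.0)
        rows.append((healthy, treated, rng.random() < risk))
    return rows

def event_rate(rows, treated, healthy=None):
    """Event rate for one treatment arm, optionally within one health stratum."""
    selected = [e for h, t, e in rows
                if t == treated and (healthy is None or h == healthy)]
    return sum(selected) / len(selected)

rows = generate(200_000)

# Naive estimate: ignores the confounder, as if it had never been measured.
naive_diff = event_rate(rows, True) - event_rate(rows, False)

# Stratified estimate: within-stratum differences weighted by stratum share.
adjusted_diff = 0.0
for stratum in (True, False):
    share = sum(1 for h, _, _ in rows if h == stratum) / len(rows)
    adjusted_diff += share * (event_rate(rows, True, stratum)
                              - event_rate(rows, False, stratum))

print(f"naive:    {naive_diff:+.3f}")
print(f"adjusted: {adjusted_diff:+.3f}")
```

The adjustment works here only because `healthy` is stored in every row. If that column were missing, as with the residual confounding described above, the naive estimate with its reversed sign would be all that the data could offer.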
By going over each of the challenges described above, Zhou provided a
comprehensive summary of many of the common pitfalls which practitioners may encounter
when applying data-driven methods to decision making. Nevertheless, Zhou maintains that
while these models may rest on wrong assumptions, they can still be useful, and that
researchers should pay particular attention to understanding the prescriptive validity of their
presented evidence. There is considerable potential value in the use of large-scale
observational data, and more research is needed to unlock it.