Decision Analysis, Dynamic Programming, and the Future of Healthcare
- Joe
- Dec 31, 2019
- 7 min read
In the past, I wrote about how machine learning, on its own, applies to a very common problem structure: input data is fed into a model, which returns a prediction. In that post, I argued that this problem structure, while ubiquitous, is still relatively narrow. In simpler terms, our current methods alone aren’t enough to solve every problem. Instead, maybe the simplicity of these methods' structure lends them to becoming plug-and-play components of larger modeling procedures, like cogs in a decision-making machine. In this post, I’ll talk about healthcare as an example of this. More specifically, I’ll try to develop a framework for AI-driven diagnoses that goes beyond the simple data-model-prediction process to more closely model the way healthcare decisions are actually made.
Let’s start by outlining a “typical” healthcare experience. I, the patient, go to the doctor. I tell the doctor my symptoms. The doctor asks me some more questions and, based on those questions, orders me a set of tests. This process more or less repeats until a diagnosis is made, the doctor refers me to a specialist, or the doctor runs out of ideas and gives up.
Data science hopes to simplify this process. The general idea, as far as I understand it, is that a patient (or the healthcare provider, on the patient's behalf) supplies some information to a machine learning model, which then returns a diagnosis.
This is a simple solution trying to solve a complex problem. As such, it has some strict limitations. For one, these models are pretty particular. Current solutions are good at one kind of diagnosis, like determining if a tissue sample is cancerous or quantifying the risk of a particular surgery. The narrowness of these models is a result of a particular assumption: full information.
Single-step models really have two different answers to missing information. The first is to just train on other data, ignoring the data points where you don’t have full information. This doesn’t fly when the patient you're seeing is missing information.
Depending on how the information is missing, you can alternatively try to “impute” it from other data, to approximate what the true value should be. This process can be pretty dicey, and will depend on the quality of your other data.
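To make the imputation option concrete, here's a minimal sketch using scikit-learn's SimpleImputer. The patient features and values are made up for illustration; a real pipeline would think hard about why the data is missing before filling it in.

```python
# A minimal sketch of mean imputation; features and values are invented.
import numpy as np
from sklearn.impute import SimpleImputer

# Rows are patients; columns are (say) age, resting heart rate, and BMI.
# np.nan marks the measurements that were never collected.
X = np.array([
    [54.0, 72.0, np.nan],
    [61.0, np.nan, 31.2],
    [47.0, 80.0, 24.9],
])

imputer = SimpleImputer(strategy="mean")  # fill each gap with its column mean
X_filled = imputer.fit_transform(X)
print(X_filled)
```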
Neither of these options tells you whether you should work to obtain the data. This is a spot where real medical practice diverges wildly from the data science tools meant to augment it. When real doctors aren’t sure about a diagnosis, they “buy” more information by ordering a test. Beyond that, a doctor can't just order any test. There needs to be some potential benefit to it, or the doctor is wasting resources (money and time) and taking on liability. A simple, one-step model isn’t enough to capture this.
Here's the first question. How can we value information?
Let's use an illuminating example. Suppose it's a cloudy day, and you must decide whether to buy an umbrella on the way to work. If you buy an umbrella, you pay the price of the umbrella regardless of rain. If you don't buy an umbrella, but you get rained on, you pay the "cost" of getting wet. A decision tree for this is below:

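If it helps to see the arithmetic, here's the same decision as a toy calculation. Every price and probability below is invented for illustration.

```python
# The umbrella decision as expected costs; all numbers are made up.
P_RAIN = 0.3          # your current belief that it rains this afternoon
UMBRELLA_COST = 10.0  # price of the umbrella
WET_COST = 25.0       # the "cost" of getting soaked

cost_buy = UMBRELLA_COST       # you pay for the umbrella, rain or shine
cost_skip = P_RAIN * WET_COST  # you only pay if it actually rains

print(f"buy: {cost_buy:.2f}, skip: {cost_skip:.2f}")
# With these numbers, skipping costs 7.50 in expectation, so you skip.
```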
In Decision Analysis cases, there is often a fictional character called the clairvoyant. The clairvoyant can tell you, for a price, the result of a particular uncertainty. She could tell you, for certain, whether or not it would rain in the afternoon. By deciding how much that “perfect” information is worth, you can set an upper limit to how much you’d spend on any imperfect information. In the literature, this value is often called the Expected Value of Perfect Information (EVPI); the computation for the umbrella example is sketched below.

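With perfect information, you get to decide after the uncertainty resolves, taking the best alternative on each branch. The clairvoyant is worth the difference between that and your best uninformed decision:

```python
# EVPI for the umbrella toy example, using the same made-up numbers.
P_RAIN = 0.3
UMBRELLA_COST = 10.0
WET_COST = 25.0

# Without extra information, you commit to one alternative up front.
cost_without_info = min(UMBRELLA_COST, P_RAIN * WET_COST)

# The clairvoyant reveals the outcome first, so you choose per branch:
# buy when it will rain, skip when it won't.
cost_with_perfect_info = (P_RAIN * min(UMBRELLA_COST, WET_COST)
                          + (1 - P_RAIN) * min(UMBRELLA_COST, 0.0))

evpi = cost_without_info - cost_with_perfect_info
print(f"EVPI = {evpi:.2f}")  # 7.50 - 3.00 = 4.50: never pay her more than this
```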
This principle stretches beyond the sort of abstract, intentionally-extreme example of the clairvoyant’s perfect information. Let’s say the information you’re purchasing is an imperfect weather machine, which with some probability correctly predicts whether there will be rain. By determining the value of your decision after the information (deciding under an “improved” sense of the probability of rain) and comparing it to the status quo (deciding under your current assessment), you can calculate a value to that information. In more abstract terms, the value of information quantifies the value of moving from your current state to a new state, where you have different information for your decision.
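Here's that calculation sketched for the weather machine, again with invented numbers. The machine's report shifts your belief about rain via Bayes' rule, and its value is the expected improvement in your decision:

```python
# Value of an imperfect weather machine; accuracy and numbers are made up.
P_RAIN = 0.3
UMBRELLA_COST = 10.0
WET_COST = 25.0
ACCURACY = 0.8  # probability the machine's report matches the true weather

def best_cost(p_rain):
    """Expected cost of the better alternative at a given belief in rain."""
    return min(UMBRELLA_COST, p_rain * WET_COST)

# Probability the machine says "rain" (true positives + false positives).
p_says_rain = ACCURACY * P_RAIN + (1 - ACCURACY) * (1 - P_RAIN)

# Bayes' rule: your updated belief in rain after each possible report.
p_rain_if_says_rain = ACCURACY * P_RAIN / p_says_rain
p_rain_if_says_dry = (1 - ACCURACY) * P_RAIN / (1 - p_says_rain)

cost_without = best_cost(P_RAIN)
cost_with = (p_says_rain * best_cost(p_rain_if_says_rain)
             + (1 - p_says_rain) * best_cost(p_rain_if_says_dry))

print(f"value of the machine = {cost_without - cost_with:.2f}")
# About 2.20 here: positive, but less than the clairvoyant's 4.50.
```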
This principle can be applied to deciding whether or not to order a test for a patient. The value shift, and therefore the value of the test, is based on the current state, before the test, and the possible resulting states, after. For each state of information, you can have a model that makes predictions from whatever data is available in that state. There is a body of literature within Decision Analysis related to placing a value on personal risk, as well as quantitatively valuing health outcomes. Using that, along with an estimation of your machine learning model's uncertainty, the decision-modeler can develop a utility function that captures the value of a state of information. By assigning a numerical value to these outcomes, it’s possible to also consider the monetary costs of a procedure related to things like opportunity cost, insurance costs, and the loss of quality of life associated with any side effects. Put all of this together, and we have a method for valuing information in a medical setting.
Here’s a concrete example. You, the doctor, have a patient who is experiencing chest pain and is short of breath. At the moment, you have limited information: the patient’s weight, some details about their health history, and maybe some hereditary history. The machine learning model you’re currently using suggests two possibilities: either it’s a minor issue that can be resolved with a prescription, or the patient is having a heart attack and needs to go to the emergency room. On one hand, you know that chest pain for patients like yours (similar weight, lifestyle, etc) is usually nothing, and emergency room visits can be very expensive for the patient. On the other hand, you really don’t want your patient to die. To make the issue more complicated, the machine learning model’s certainty is disturbingly low, so you have no decision clarity.
You do, however, have another version of this machine learning model that can consider the information from an EKG, and you want to decide whether it's worth ordering one.
This case is structurally identical to the rain example above. Instead of a weather machine, you have an EKG and the machine learning model that considers it. By evaluating the value of your current state of information (before the test) and the distribution of your potential future states of information (after the test), you can determine a value for the EKG.
The tree below shows this decision structure, where V(*) is the value function for a state of information, & represents our previous knowledge, and C(*) is a cost function for medical tests and procedures. In reality, the tree spans further out to include decisions on interventions. For brevity, that part of the tree is omitted.

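Here's a sketch of that valuation in code. The readouts, their probabilities, and the numbers inside V(*) and C(*) are all stand-ins; in a real system, V would come from the utility function described earlier and the readout probabilities from the model.

```python
# A sketch of the EKG decision, in the shape of the tree above.
# All readouts, probabilities, and values here are hypothetical.
P_READOUTS = {"abnormal": 0.25, "normal": 0.75}  # predicted EKG outcomes

def V(state):
    """Value of deciding under a state of information. In practice this
    combines the model's prediction and uncertainty with a utility model
    of health outcomes; here it's a stub."""
    return {"no_ekg": -4.0, "abnormal": -2.5, "normal": -0.5}[state]

def C(test):
    """Cost of a test, expressed on the same scale as V."""
    return {"ekg": 1.0}[test]

value_now = V("no_ekg")
value_after = sum(p * V(r) for r, p in P_READOUTS.items())
net_value = value_after - value_now - C("ekg")
print(f"order the EKG? {net_value > 0} (net value {net_value:.2f})")
```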
This EKG example has a couple of simplifying assumptions that we can relax to make the problem more realistic. The first relaxation is to extend the decision beyond a binary one. We aren’t just choosing between an EKG or nothing. We might have several tests to choose from. This doesn’t change the structure of our analysis too much; we can just add more branches to the tree above, valuing each of the resulting outcomes (states of information), and picking the best one.
The second relaxation is really an extension of the first. A doctor may choose to order another test after the first, depending on the state of information. This defines a potential sequence of decisions. More importantly, these decisions are not independent. Perhaps an EKG on its own does not give us enough information for a diagnosis, but narrows the realm of possibility and makes other tests more effective. In other words, tests can be incredibly valuable without directly resulting in a diagnosis.
In data science circles, the principle we're wrestling against is called “greediness.” Greedy methods optimize each step of a multi-step process in isolation, without necessarily optimizing the whole process. In our case, that would mean always choosing the test that results in the best immediate state of information. However, with a sequence of decisions, a locally “suboptimal” choice may lead to better opportunities later.
The greedy approach in this example is quite simple. The value of each test, at every step in a sequence, is simply the value of the state of information that results from it. With a global approach, however, each value includes both its state of information and the potential states of information that can be accessed from it. Calculating the value of those future potential states of information would require the same kind of analysis. What we have, then, is a Markov Decision Process. Each state is defined by the information you’ve acquired. The value of the state is the sum of the value of your current information (that considers the severity and certainty of your model) and the value of states that can be accessed from the current state. A more in-depth introduction to MDPs, along with some early applications to healthcare, can be found in this book chapter.
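Here's a minimal sketch of that recursion. States are the sets of test results you hold; a state's value is the best of diagnosing now or ordering another test, net of its cost. I've collapsed the chance nodes (test outcomes are deterministic here) to keep it short, and all the tests, costs, and payoffs are invented.

```python
# A toy Bellman recursion over states of information; everything is made up.
from functools import lru_cache

TESTS = ("ekg", "troponin")
COST = {"ekg": 1.0, "troponin": 0.5}

def diagnose_value(state):
    """Stub for the value of stopping and diagnosing with this information."""
    return {frozenset(): -4.0,
            frozenset({"ekg"}): -3.5,
            frozenset({"troponin"}): -3.8,
            frozenset({"ekg", "troponin"}): -0.5}[state]

@lru_cache(maxsize=None)
def value(state):
    best = diagnose_value(state)      # option 1: stop and diagnose now
    for test in TESTS:
        if test not in state:         # option 2: buy more information
            best = max(best, value(state | {test}) - COST[test])
    return best

print(value(frozenset()))  # -2.0
```

Notice that in this toy setup neither test alone pays for itself (each barely moves the diagnosis), so a greedy one-step valuation would order nothing and settle for a value of -4.0. The recursion sees that the pair together nearly settles the diagnosis and reaches -2.0.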
With this framework, you could add some structural constraints. This could begin with superficial things, like the amount of time a doctor/lab can devote to a single patient, but could also capture restrictions like the number of radiation-intensive tests a patient should have. In essence, this decision framework would allow medical professionals to avoid making current decisions that severely hamper their future ability to treat a patient.
The schematic below roughly shows a transition in the sequence. The arrows represent dependencies, subject to some (not depicted) structural constraints. Defining the process this way requires two things at each iteration: first, a machine learning model trained on the current state of information; second, a value function that considers the model's uncertainty and the available healthcare interventions. In each iteration, the model considers the potential states based on the intervention (drawing from Markov Decision Processes) and evaluates the value of obtaining more information (applying Decision Analysis' value of information framework).

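A runnable skeleton of that loop might look like the following. Every function in it is a trivial stand-in for the component it names: fit_model for retraining on the current state of information, and value_of_information for the Decision Analysis valuation worked through above.

```python
# A skeleton of one pass through the framework; every function below is
# a hypothetical placeholder for a real component.
def fit_model(info):
    # Stand-in for retraining the ML model on the current information.
    return lambda i: {"diagnosis": "minor", "confidence": 0.5 + 0.1 * len(i)}

def value_of_information(model, info, test):
    # Stand-in for valuing a test net of its cost, as in the EKG example.
    return {"ekg": 1.0, "troponin": -0.6}[test]

def run_test(test):
    return {test: "result"}  # stand-in for the lab actually running it

def treat(info, available):
    while True:
        model = fit_model(info)
        values = {t: value_of_information(model, info, t) for t in available}
        best = max(values, key=values.get, default=None)
        if best is None or values[best] <= 0:
            return model(info)  # no test is worth its cost: diagnose now
        info = {**info, **run_test(best)}
        available = available - {best}

print(treat({"weight": 80}, {"ekg", "troponin"}))
```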
Where current models rely on very strict assumptions, like having complete (or near-complete) information and only making single-step diagnoses, this framework goes further. Instead of imputing missing data, the modeler can determine which information to seek out and how valuable it is. It does so in a globally optimal way, not a greedy one. And it allows the possibility of imposing structural constraints on tests and treatments, more effectively modeling the way real medical professionals make decisions.
Each of the three fields I've drawn from has made efforts to model healthcare decisions. By deliberately combining them, the presented framework goes further than any of these individual fields. Altogether, it's a hint of the immense potential that comes from incorporating different methods into deliberately engineered decision structures. In the increasingly saturated field of healthcare analytics, the most successful companies must go beyond the simply-structured machine learning model.
A final note. I intentionally left out a lot of the mathematical details. They'd probably be boring, and I haven't yet fleshed them out. If you'd like to hear some of my thoughts, feel free to reach out.