Lesson 26 take-aways

Class sessions
prediction
probability distribution
Bayesian updating
Author

Daniel Kaplan

Published

March 23, 2023

  1. In Lesson 25 we pointed out that the proper form for a prediction is a list of the potential outcomes, each matched to a probability of that outcome occurring. The probabilities across all the outcomes (the “probability distribution”) should sum to 1. (If the outcomes are quantitative and continuous, a probability density is used and “adding” is replaced with integration. But this is not important to us here.) A small example in R appears after this list.

  2. We are not going to spend much time on the topic of probability distributions. It is a technical matter, and we don’t have enough time to do it justice. (We’ll come back to it, however, in Lessons 34 and 35, in a simple setting.)

  3. There are some things you ought to learn, even if you don’t develop a mastery of probability.

    1. There is a small set of mathematically defined probability distributions that are often used as models to organize prediction.
    2. There is a probability logic called “Bayesian updating” that provides the means to update a probability distribution as new data come in. The relevant terms here are:
      • prior distribution: our predictions before we see the new data.
      • likelihood: a mathematical model giving the plausibility of the new data observation under each candidate possibility.
      • posterior distribution: the updated prediction upon seeing the new data.
      • The relationship among these three terms is mathematically simple: the posterior is proportional to the likelihood times the prior. A demonstration, using self-driving cars as an example, is in this blog post; a small numerical sketch in R also appears after this list.
  4. Just for general background, we looked at two distributions from the small set mentioned in (3.i):

    1. The normal or bell-shaped distribution (available in R as dnorm()). This is the distribution that underlies the shorthand of using prediction intervals rather than full probability distributions. Experts know exactly how to translate a prediction interval into the corresponding bell-shaped distribution; a sketch of the translation appears after this list.
    2. The binomial distribution (available in R as dbinom()). The example we used for this had to do with hospital supplies. Suppose there are drugs or other medical materials that are needed only rarely, say, with a 1% chance of use in any given hospital or clinic before the material expires. If every hospital and clinic keeps the material in stock, just in case, then roughly 99% of it will be wasted: an expensive proposition. However, a central warehouse that can quickly send the material where and when it’s needed dramatically reduces the amount that must be kept in stock. For instance, if there are 500 hospitals and clinics, each with a 1% probability of needing a given drug, then a small central stock virtually guarantees the drug’s availability. The binomial distribution predicts the total amount of the drug that will be called for across all the hospitals and clinics. For the 500-hospital, 1% use case, the warehouse need only keep 15 doses in stock; the calculation is sketched after this list.
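
To make take-away 1 concrete, here is a minimal sketch in R of a prediction in the proper form. The outcome names and probabilities are made up for illustration; the point is only that each outcome gets a probability and the probabilities sum to 1.

```r
# A prediction: each potential outcome matched to a probability.
# The outcomes and probabilities here are invented for illustration.
prediction <- c(rain = 0.25, snow = 0.05, clear = 0.70)
sum(prediction)   # a proper probability distribution sums to 1
```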
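
Take-away 3.ii can be demonstrated in a few lines of R. This is not the self-driving-car demonstration from the blog post, just a hypothetical stand-in: we put a probability on each candidate value of an unknown proportion p, observe some data, and update. All the numbers here are assumptions for illustration.

```r
# Bayesian updating on a grid: posterior is proportional to likelihood times prior.
# Hypothetical setting: p is the unknown chance a clinic needs a drug on a
# given day, and we observe use on 2 days out of 30.
p <- seq(0.001, 0.999, by = 0.001)            # candidate values for p
prior <- dnorm(p, mean = 0.05, sd = 0.02)     # prior: belief centered near 5%
prior <- prior / sum(prior)                   # normalize so the prior sums to 1
likelihood <- dbinom(2, size = 30, prob = p)  # plausibility of the data under each p
posterior <- prior * likelihood               # posterior proportional to likelihood times prior
posterior <- posterior / sum(posterior)       # normalize: the updated prediction
p[which.max(posterior)]                       # most plausible p after updating
```

The normalization step is what turns “proportional to” into an actual probability distribution.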
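
Here is the translation mentioned in take-away 4.i, under the usual assumptions that the 95% prediction interval is symmetric and the underlying distribution is normal. The interval endpoints (10 and 30) are made-up numbers.

```r
# Translate a 95% prediction interval into the corresponding normal distribution.
lower <- 10                                     # invented interval endpoints
upper <- 30
center <- (lower + upper) / 2                   # the mean of the normal
spread <- (upper - lower) / (2 * qnorm(0.975))  # sd: a 95% interval spans about
                                                # 1.96 standard deviations each side
qnorm(c(0.025, 0.975), mean = center, sd = spread)  # recovers 10 and 30
dnorm(seq(lower, upper, by = 5), mean = center, sd = spread)  # densities via dnorm()
```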
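
And the warehouse calculation from take-away 4.ii. The “virtually guarantee” level is not pinned down in the lesson; 99.99% is our assumption, and it reproduces the 15-dose figure.

```r
# Total demand across 500 hospitals/clinics, each with a 1% chance of
# needing the drug, follows a binomial distribution with size = 500, prob = 0.01.
qbinom(0.9999, size = 500, prob = 0.01)  # doses needed for 99.99% coverage: 15
pbinom(15, size = 500, prob = 0.01)      # the chance that 15 doses suffice
# The full prediction, as a probability distribution over possible demand:
plot(0:20, dbinom(0:20, size = 500, prob = 0.01), type = "h",
     xlab = "Doses demanded", ylab = "Probability")
```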