9 Exercises: Likelihood

Exercise 9.2 Should the calculations be done with probabilities or with magnitudes (logarithms)? The product of probabilities corresponds to the sum of their magnitudes.
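A minimal sketch in R of that correspondence, assuming that "magnitude" means the base-10 logarithm. The three probabilities are made up for illustration and do not come from the text.

```r
# Three hypothetical probabilities, chosen only for illustration
p <- c(0.5, 0.2, 0.1)

product_of_probs  <- prod(p)        # multiply the probabilities directly
sum_of_magnitudes <- sum(log10(p))  # add their base-10 logarithms instead

product_of_probs
sum_of_magnitudes

# Undoing the logarithm recovers the product (up to floating-point error)
10^sum_of_magnitudes
```

Either route carries the same information. The sum of magnitudes is usually easier to handle when many small probabilities are involved.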

Exercise 9.3 Do the calculations to produce the years-saved versus cost function from Figure 13.7. Alternatively, compute just the average years saved, that is, show how to compute an expectation value.
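As a rough sketch of the expectation-value calculation only: the outcomes and probabilities below are invented placeholders, not the values behind Figure 13.7. An expectation value is the probability-weighted average of the possible outcomes.

```r
# Hypothetical outcomes (years saved) and their probabilities.
# These numbers are placeholders and are NOT taken from Figure 13.7.
years_saved <- c(0, 2, 10)
probability <- c(0.7, 0.2, 0.1)

# The probabilities must account for every possibility
stopifnot(isTRUE(all.equal(sum(probability), 1)))

# Expectation value: weight each outcome by its probability, then add
expected_years_saved <- sum(probability * years_saved)
expected_years_saved
```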

Exercise 9.4 A little drill with three or four data points: calculate the likelihood by hand for a given distribution. Maybe have them do both normal and exponential for a few points to show that both distributions are compatible with the data.
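One way the drill might look in R. The three data values and the parameter choices (a normal with mean 1.5 and sd 1, an exponential with mean 1.5) are invented for illustration. The likelihood of independent observations is the product of the density evaluated at each observation.

```r
# Hypothetical data points, for illustration only
obs <- c(1.2, 0.4, 2.9)

# Likelihood under a normal hypothesis (assumed mean 1.5, sd 1)
lik_normal <- prod(dnorm(obs, mean = 1.5, sd = 1))

# Likelihood under an exponential hypothesis (assumed mean 1.5, so rate 1/1.5)
lik_exponential <- prod(dexp(obs, rate = 1 / 1.5))

# The same exponential likelihood "by hand", from the density rate * exp(-rate * x)
rate <- 1 / 1.5
lik_exponential_by_hand <- prod(rate * exp(-rate * obs))

c(normal = lik_normal,
  exponential = lik_exponential,
  by_hand = lik_exponential_by_hand)
```

Both likelihoods are greater than zero, so with this little data neither hypothesis is ruled out.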

Contrast the likelihood at the maximum-likelihood parameters with the likelihood at parameters some distance away. Make a contour plot of the likelihood as a function of mean and sd. Add more data points, then see how the likelihood contracts.
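A sketch of such a contour plot using only base R. The data values and the grid ranges are assumptions made for illustration. The log-likelihood is plotted rather than the likelihood itself, which leaves the location of the maximum unchanged.

```r
# Hypothetical data, for illustration only
obs <- c(4.2, 5.1, 6.3)

# Log-likelihood of a normal hypothesis with the given mean and sd
log_lik <- function(mu, sigma, data) {
  sum(dnorm(data, mean = mu, sd = sigma, log = TRUE))
}

# Grid of candidate parameter values (ranges chosen by eye for this data)
mu_grid <- seq(3, 8, length.out = 101)
sd_grid <- seq(0.3, 3, length.out = 101)

# Evaluate the log-likelihood at every (mean, sd) pair on the grid
ll <- outer(mu_grid, sd_grid,
            Vectorize(function(m, s) log_lik(m, s, obs)))

# Grid point with the largest log-likelihood
best    <- which(ll == max(ll), arr.ind = TRUE)
best_mu <- mu_grid[best[1, "row"]]
best_sd <- sd_grid[best[1, "col"]]
c(mean = best_mu, sd = best_sd)

contour(mu_grid, sd_grid, ll, nlevels = 20,
        xlab = "Hypothesized mean", ylab = "Hypothesized sd")
```

Appending more values to `obs` and re-running should show the contours tightening around the maximum.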

We usually work with the magnitude (logarithm) of the likelihood. Often we can calculate the magnitude even when the likelihood itself is so small that the computer rounds it to zero.
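A small demonstration of the numerical problem, using simulated data (the sample size and seed are arbitrary choices). With a couple of thousand observations, the product of densities is far too small to be represented as a floating-point number, while the sum of logarithms is an ordinary number.

```r
set.seed(1)            # arbitrary seed, only for reproducibility
x <- rnorm(2000)       # simulated data, for illustration only

# Direct product of densities: underflows to exactly zero
likelihood <- prod(dnorm(x))
likelihood

# Sum of log densities: a finite, usable number
log_likelihood <- sum(dnorm(x, log = TRUE))
log_likelihood

# The same quantity as a base-10 magnitude
log_likelihood / log(10)
```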

Exercise 9.5 Absolute and relative probability: absolute probabilities are relative probabilities scaled to meet the criterion that the total probability across all possibilities adds up to 1. We’ll talk about how this calculation is done in Chapter 13.
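For a finite set of possibilities the scaling is a single division, sketched here with made-up relative probabilities. The continuous case needs the tools of Chapter 13 and is not shown.

```r
# Hypothetical relative probabilities for three possibilities
relative <- c(A = 3, B = 1, C = 1)

# Divide by the total so that the probabilities add up to 1
absolute <- relative / sum(relative)

absolute
sum(absolute)
```

The ratios between possibilities are unchanged by the scaling: A is still three times as probable as B.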

Exercise 9.6 Consider the risk of an automobile accident that causes serious injury. The mileage driven until the next such accident is unknown. But we can frame a hypothesis: the relative probability of the mileage until the next accident is an exponential function (Chapter 7) with a rate of 1 in 50,000 miles.

There is an infinite number of other hypotheses that might be applied to the automobile-accident setting. For example, an exponential distribution with a rate of 1 in 72,983.5 miles. Or, perhaps a uniform distribution between a minimum of 138 miles and 21,709 miles. This might be starting to sound silly, but Bayesian reasoning saves the day by adding an additional concept: that every hypothesis can be assigned a relative “goodness.” There are two components that go into finding the “goodness” of a hypothesis. One of these is called “prior belief” and will be introduced in Chapter 10. The other is called “likelihood.”

To illustrate, let’s work with the specific hypothesis that “the relative probability of the mileage until the next accident is exponentially distributed with a mean of 50,000 miles.” We are keeping track of a car. Suppose the car has an accident at 38,231 miles. To find the likelihood, simply evaluate the probability distribution at the observed value. Here’s the relevant computing command for the relative probability:

dexp(38231, rate = 1 / 50000)

[1] 9.310216e-06

We can calculate the likelihood for any and all of the other hypotheses we are considering. For example, we earlier mentioned a different hypothesis: that the exponential distribution has a rate of 1 in 72,983.5 miles. Here’s the calculation of the likelihood:

dexp(38231, rate = 1 / 72983.5)

[1] 8.114813e-06

Now a third hypothesis for the accident mileage: a uniform distribution between a minimum of 138 miles and 21,709 miles.

dunif(38231, min = 138, max = 21709)

[1] 0

The observation of the accident at 38,231 miles produces different likelihoods for the three different hypotheses.
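The three calculations can be collected into one vector to make the comparison easier. This is only a convenience sketch. The variable names are invented here, and dividing by the largest likelihood is one of several reasonable ways to put the values on a common scale.

```r
miles_observed <- 38231

# Likelihood of the observation under each of the three hypotheses
likelihoods <- c(
  exp_mean_50000   = dexp(miles_observed, rate = 1 / 50000),
  exp_mean_72983.5 = dexp(miles_observed, rate = 1 / 72983.5),
  uniform          = dunif(miles_observed, min = 138, max = 21709)
)

# Express each likelihood relative to the largest one
relative_likelihood <- likelihoods / max(likelihoods)
relative_likelihood
```

The uniform hypothesis gets a likelihood of exactly zero because the observed mileage lies outside its range. The observation rules that hypothesis out entirely.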

It’s useful to consider a likelihood function that tells us the likelihood induced by an observation at 38,231 miles for each of a large set of hypotheses. Here’s the likelihood function for an exponential distribution: it takes the form of likelihood versus the hypothesized rate.

slice_plot(
  dexp(38231, rate = 1 / miles) ~ miles,
  domain(miles = 3000:1000000),
  npts = 500
) |>
  #gf_refine(scale_x_log10()) |>
  gf_labs(x = "Rate: 1 per n miles",
          y = "Likelihood of 38,231 miles observation")
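The peak of this likelihood function can also be located numerically. The sketch below uses base R's `optimize()` over the same range of hypothesized miles as the plot. The helper name `likelihood_of_miles` is invented for this example.

```r
miles_observed <- 38231

# Likelihood of the observation as a function of the hypothesized
# mean miles between accidents (rate = 1 / miles)
likelihood_of_miles <- function(miles) {
  dexp(miles_observed, rate = 1 / miles)
}

# Search the plotted range for the hypothesis with the highest likelihood
best <- optimize(likelihood_of_miles,
                 interval = c(3000, 1000000),
                 maximum = TRUE)

best$maximum    # hypothesized miles at the peak
best$objective  # likelihood at the peak
```

For a single observation from an exponential distribution, the peak falls where the hypothesized mean equals the observed value, so the search should land close to 38,231 miles.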

Exercise 9.7 The exponential distribution has one parameter, the normal has two, and the 17-23-31 has three. Several formulas have been proposed that take the number of parameters into account when comparing likelihoods. Two well-known ones are:

The Akaike Information Criterion (AIC). The total score is \(2 k - 2 \ln(L)\) where \(k\) is the number of parameters and \(L\) is the calculated likelihood.
The Bayesian Information Criterion (BIC). The score incorporates not only the number of parameters (\(k\)), but also the number of data points (\(n\), which is 3 in our example). The formula is \(k \ln(n) - 2 \ln(L)\).

Hypothesis	\(k\)	\(L\)	AIC	BIC
Exponential	1	0.000006	26	30
Normal	2	0.00004	28.3	22.5
17-23-31	3		12.6	8.7

A lower score for AIC or BIC is better.
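The two formulas translate directly into R. In this sketch the function names `aic_score` and `bic_score` are invented, and the likelihood value passed to them is a made-up placeholder rather than an entry from the table.

```r
# AIC: 2k - 2 ln(L)
aic_score <- function(k, L) {
  2 * k - 2 * log(L)
}

# BIC: k ln(n) - 2 ln(L)
bic_score <- function(k, n, L) {
  k * log(n) - 2 * log(L)
}

# Hypothetical example: 2 parameters, 3 data points, likelihood 0.001
example_aic <- aic_score(k = 2, L = 0.001)
example_bic <- bic_score(k = 2, n = 3, L = 0.001)
c(AIC = example_aic, BIC = example_bic)
```

Note that `log()` in R is the natural logarithm, matching the \(\ln\) in the formulas.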

Let’s imagine two scenarios where we collect new data.

Scenario 1: The next two intervals turn out to be 5 and 47 years.
Scenario 2 (in the spirit of science fiction): The next two measurements are 31 and 31 years.

Calculate the likelihood, AIC, and BIC for each of the three hypotheses using just the two new data points. Which hypothesis is favored according to \(L\) (higher is better), AIC, and BIC?

Exercise 9.8

Next status step: complete the draft

Assigned to DTK

Draft note: recast the familiar high-school probability calculations in the likelihood framework. Coin flips and die tosses are shorthand for hypotheses.

“Probability” is a standard high-school mathematics topic, and it’s likely that you have spent some time calculating the probability of all heads from three coin flips or the probability of a 7 from rolling a pair of dice. Why coins? Why dice? One reason is that there is not a lot to know about coins and dice and we can be confident in the idea that heads or tails have the same relative probability, and, similarly, that the outcomes one through six of an individual die have the same relative probability. Other reasons: we are familiar from an early age with games that involve multiple throws of dice; we feel justified in the belief that just about any coin or any pair of dice thrown by just about any person provide a fair randomization device.

Whatever the educational virtues of learning the high-school probability calculations for dice and coins, the same reasons they are in the curriculum are also reasons why they are poor examples for dealing with uncertainty in significant events such as floods, fires, earthquakes, heat waves, financial collapses, illnesses, automobile mishaps, or—to reach for an extreme—the risk of an accidental detonation of a nuclear bomb. It would be foolish to think that the probability of a flood is the same at any location at any time or that we are all the same when it comes to becoming ill or the victim of an automobile mishap. And, unlike dice and coins, there is a lot to know about significant events and how they depend on circumstances. Finally, and thankfully, significant events are rare. We are always working with limited data and the understanding that risk can change over time.