7 Drill Questions: Uncertain Quantities

If requested by your instructor, please identify here the people from whom you received assistance on this assignment.

Drill 7.1 Consider the relative probability distribution graphed below.

  1. Which of these outcomes is the least likely?

    A       B       C       D      

  2. Which of these outcomes is the most likely?

    A       B       C       D      

  3. How much more likely is outcome C compared to outcome B? (Pick the closest estimate.)

About 30 percent more likely

About 60 percent more likely

About 100 percent more likely

About 150 percent more likely

  4. How much more likely is outcome B compared to outcome D? (Pick the closest estimate.)

About 100 percent more likely

About 200 percent more likely

About 300 percent more likely

About 500 percent more likely
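
The "percent more likely" comparisons in the last two items above come from the ratio of two relative probabilities. A minimal sketch in Python, using made-up heights rather than values read from the graph; the function name `percent_more_likely` is hypothetical and introduced only for illustration:

```python
def percent_more_likely(p_a, p_b):
    """Percent by which outcome a is more likely than outcome b.

    p_a and p_b are relative probabilities (for example, heights read off a graph).
    Only their ratio matters, so the units of the vertical axis are irrelevant.
    """
    if p_b <= 0:
        raise ValueError("the reference outcome must have a positive relative probability")
    return (p_a / p_b - 1.0) * 100.0


# HYPOTHETICAL heights, not taken from the drill's graph.
print(percent_more_likely(3.0, 2.0))    # 50.0
# Rescaling both heights by the same factor leaves the answer unchanged.
print(percent_more_likely(30.0, 20.0))  # 50.0
```

Because only the ratio enters the calculation, the same answer results whether or not the heights have been scaled to add up to 1.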

Drill 7.2 Each of the following graphics, except one, can be interpreted as a relative probability function. Which is the exception?

(a)
(b)
(c)
(d)
Figure 1: Functions that might be relative probability distributions
a       b       c       d      

Drill 7.3 True or false: An (absolute) probability is also a relative probability.

True     or       False      

Drill 7.4 Large Language Models generate responses in steps. In each step, one new token is added to the part of the response that has already been formed. To illustrate, here are the tokens that one AI considers to be likely to follow “When I was walking …”.

token    weight  lyrics
in          2.8     2.1
around      1.4    -0.1
to          3.1     0.6
through     0.7     1.0
down       -1.2     1.8
home        1.9     2.0
back       -0.3    -1.1
I           1.7     0.8
by         -1.1     1.6
along      -2.6     1.4

Each word is assigned a weight. There are typically around 50,000 words to choose from. The above shows just a handful of the more likely ones.

The prompt “What are the most likely AI tokens that might follow ‘When I was walking’” generated the weights in the middle column. But for the “lyrics” column, the prompt was preceded by “Taking into account the lyrics of popular music, ….”

  1. The numbers in the “weight” and “lyrics” columns are not relative probabilities. Which of the following explanations is most salient for this conclusion?

Some of the numbers are greater than 1.

Some of the numbers are negative.

The numbers don’t add up to 1.

  2. True or false: Relative probabilities do not need to add up to 1.

    True     or       False      

  3. True or false: Relative probabilities do not need to be non-negative.

    True     or       False      

  4. True or false: A relative probability can never be zero.

    True     or       False      

In fact, the weights generated by the AI are not relative probabilities. They are ordinary numbers which might be large or small, positive or negative. In the context of indicating a relative probability, the AI weights are called “logits.” Converting a logit to a relative probability is simple: use the storybook function double(). The following chunk reads the weights and words into an R data frame, then does the conversion to relative and absolute probabilities.
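
A minimal sketch of that conversion, written in Python rather than R. It rests on two assumptions that this section does not confirm: that the storybook function double(x) means 2 raised to the power x, and that the absolute probabilities are normalized over only the ten tokens listed in the table (a real model would normalize over its whole vocabulary). The helper name `to_probabilities` is hypothetical.

```python
# Logits from the table above: token -> (weight, lyrics)
logits = {
    "in":      (2.8,  2.1),
    "around":  (1.4, -0.1),
    "to":      (3.1,  0.6),
    "through": (0.7,  1.0),
    "down":    (-1.2, 1.8),
    "home":    (1.9,  2.0),
    "back":    (-0.3, -1.1),
    "I":       (1.7,  0.8),
    "by":      (-1.1, 1.6),
    "along":   (-2.6, 1.4),
}


def double(x):
    """ASSUMPTION: stand-in for the storybook double(), taken here to be 2**x."""
    return 2.0 ** x


def to_probabilities(values):
    """Convert logits to (relative, absolute) probabilities.

    Relative probabilities are positive but need not add to 1.
    Absolute probabilities are the relative ones divided by their total.
    """
    relative = [double(v) for v in values]
    total = sum(relative)
    absolute = [r / total for r in relative]
    return relative, absolute


tokens = list(logits)
rel_weight, abs_weight = to_probabilities([logits[t][0] for t in tokens])
rel_lyrics, abs_lyrics = to_probabilities([logits[t][1] for t in tokens])

print(f"{'token':8s} {'rel_w':>8s} {'abs_w':>6s} {'rel_l':>8s} {'abs_l':>6s}")
for t, rw, aw, rl, al in zip(tokens, rel_weight, abs_weight, rel_lyrics, abs_lyrics):
    print(f"{t:8s} {rw:8.3f} {aw:6.3f} {rl:8.3f} {al:6.3f}")
```

Every logit, including the negative ones, maps to a positive relative probability, and each absolute column adds up to 1 by construction.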

  5. Which three words are much more likely to occur following “When I was walking” in the context of lyrics compared to general text?

along, home, through

along, down, by

around, in, home

by, down, through

Drill 7.5 Which of these pictures corresponds best to the idea that the quantity Q is \(5 \pm 2\)?

(a)
(b)
(c)
(d)
Figure 2: Four relative probability distributions
a       b       c       d      

Drill 7.6 Figure 3 shows the seasonal pattern in daily high and low temperatures for Austin, Texas, USA. Scanning the graph from left to right, you can see a yearly osc() pattern. Naturally, the high and low temperatures on any given day vary from year to year, so for each day there is an uncertainty distribution.

Figure 3: As described by the source … “The daily range of reported temperatures (gray bars) and 24-hour highs (red ticks) and lows (blue ticks), placed over the daily average high (faint red line) and low (faint blue line) temperature, with 25th to 75th and 10th to 90th percentile bands.” Source: weatherspark.com

Here are graphs of the low-temperature distributions for three days. Days were chosen at the lines dividing two consecutive months, e.g. March/April.

(a)
(b)
(c)
Figure 4: Uncertainty distributions for the daily low temperature on three days during the year.

Match each graph to one of these days:

Graphic (a):
Apr/May       Mar/Apr       Feb/Mar       Sept/Oct       Aug/Sept      

Graphic (b):
Mar/Apr       Aug/Sept       June/July       Sept/Oct       Nov/Dec      

Graphic (c):
June/July       Mar/Apr       New Year’s Day       Apr/May       Feb/Mar      

Drill 7.7 Match the following real-world scenarios to the distribution that best models them (Uniform, Normal, or Exponential).

A. “I will arrive sometime between 1:00 PM and 5:00 PM, but any time in that window is equally likely.”

uniform       normal       exponential      

B. “The bus usually arrives at 8:00 AM, but it can be a few minutes early or late. It is very unlikely to be more than 15 minutes off.”

uniform       normal       exponential      

C. The amount of time that passes between two random, independent events (like 100-year storms or, on a different time scale, receiving text messages).

uniform       normal       exponential      

D. A quantity where values outside a specific range are considered impossible (probability = 0).

uniform       normal       exponential      

Drill 7.8 Which storybook function introduced in previous chapters is mathematically equivalent to the normal (a.k.a. Gaussian) distribution?

double()       doublings()       hill()       hillside()      

Drill 7.9 Here’s a weather forecast in the midst of a snowy period.

Due to round-off, the “likelihoods” do not add exactly to 1.00. But close enough, so let’s treat them as probabilities.

  1. What’s the probability of \(\leq 1\) inch of snowfall?
18%       30%       48%       Not enough info.      

  2. What’s the probability that \(1 \leq\ \text{snowfall} \ \leq 6\) inches?
23%       32%       47%       Not enough info.      

  3. What’s the probability that \(1 \leq\ \text{snowfall} \ \leq 3\) inches?
16%       32%       48%       Not enough info.
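
Questions like these amount to adding up the probabilities of the forecast bins that fall inside the requested interval. A minimal sketch in Python, using a HYPOTHETICAL forecast whose bins and numbers are invented for illustration and are not the ones in the drill; the helper names `normalize` and `prob_between` are likewise hypothetical:

```python
# HYPOTHETICAL forecast: ((low, high) in inches, likelihood). Not the drill's data.
forecast = [((0, 1), 0.21), ((1, 2), 0.33), ((2, 4), 0.27), ((4, 8), 0.18)]


def normalize(bins):
    """Rescale likelihoods so they add up to exactly 1."""
    total = sum(p for _, p in bins)
    return [(rng, p / total) for rng, p in bins]


def prob_between(bins, low, high):
    """Probability that snowfall lies in [low, high].

    Returns None when the interval cuts through a bin, because the forecast
    says nothing about how probability is spread inside a bin.
    """
    total = 0.0
    for (lo, hi), p in bins:
        if lo >= low and hi <= high:
            total += p        # bin lies entirely inside the interval
        elif hi <= low or lo >= high:
            continue          # bin lies entirely outside the interval
        else:
            return None       # interval boundary falls inside this bin
    return total


print(sum(p for _, p in forecast))            # close to, but not exactly, 1 (round-off)
print(sum(p for _, p in normalize(forecast)))
print(prob_between(forecast, 1, 4))           # adds the (1, 2) and (2, 4) bins
print(prob_between(forecast, 1, 3))           # None: 3 falls inside the (2, 4) bin
```

The `None` case is the design choice worth noting: when an interval boundary does not line up with a bin edge, the binned forecast alone cannot give the probability.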