```
# Fit the model, then report a confidence interval for each coefficient
Hill_racing |>
  model_train(time ~ distance + climb) |>
  conf_interval()
```

term | .lwr | .coef | .upr |
---|---|---|---|
(Intercept) | -533.00 | -470.00 | -407.00 |
distance | 246.00 | 254.00 | 261.00 |
climb | 2.49 | 2.61 | 2.73 |

As always, there is a model coefficient for each term mentioned in the model specification, `time ~ distance + climb`. Here, those terms give an intercept, a coefficient on distance, and a coefficient on climb. Each coefficient comes with two other numbers, called `.lwr` and `.upr` in the report, standing for “lower” and “upper.” The confidence interval runs from the lower number to the upper number.

Focus for the moment on the `distance` coefficient: 253.8 s/km. The confidence interval runs from 246 to 261 s/km. In previous Lessons about model values (the output of the model function when given values for the explanatory variables) we have emphasized the coefficient itself.

Statistical thinkers, knowing that there is sampling variation in any coefficient calculated from a data sample, like to use the word “**estimate**” to refer to the calculated value. Admittedly, the computer carries out the calculation of the coefficient without mistake and reports it with many digits. But those digits do not incorporate the uncertainty due to sampling variation. That’s the role of the confidence interval.

The meaning of a confidence interval such as the 246-to-261 s/km interval shown above is, “Any other estimate of the coefficient (made with other data) is consistent with ours so long as it falls within the confidence interval.”

An alternative but entirely equivalent format for the confidence interval uses \(\pm\) (plus-or-minus) notation. The interval [246, 261] s/km can be written in \(\pm\) format as 254 \(\pm\) 8 s/km.

Another convention for reporting uncertainty, legendarily emphasized by chemistry teachers, involves the number of digits with which to write a number: the “**significant digits**.” For instance, the `distance` coefficient reported by the computer is 253.808295 s/km. Were you to put this number in a lab report, you would risk a red annotation from your teacher: “Too many digits!”

According to the significant-digits convention, a proper way to write the `distance` coefficient would be 250 s/km, although some teachers might prefer 254 s/km.

The situation is difficult because the significant-digit convention is attempting to serve three different goals at once. The first goal is to signal the precision of the number. The second goal is to avoid overwhelming human readers with irrelevant digits. The third goal is to allow human readers to redo calculations. These three goals sometimes compete. An example is the [246, 261] s/km confidence interval on the `distance` coefficient reported earlier. For this coefficient, the width of the confidence interval is about 15 s/km. This suggests that there is no value to the human reader in reporting any digits after the decimal point. But a literal translation of [246, 261] into \(\pm\) format would be 253.5 \(\pm\) 7.5. Now there is a digit being reported after the decimal point, a digit we previously said isn’t worth reporting!
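The arithmetic behind that literal translation is short enough to write out. Taking the center as the midpoint of the interval and the spread as half its width (the usual reading of the \(\pm\) format):

\[
\text{center} = \frac{246 + 261}{2} = 253.5, \qquad
\text{spread} = \frac{261 - 246}{2} = 7.5
\]

The extra digit after the decimal point appears only because the two whole-number endpoints happen to sum to an odd number; it carries no additional information about the coefficient.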

As a general-purpose procedure, I suggest the following principles for model coefficients:

- **Always** report an interval in either the [lower, upper] format or the center \(\pm\) spread format. It doesn’t much matter which one.
- As a guide to the number of digits to print, look to the interval width, calculated as upper \(-\) lower or as 2 \(\times\) spread. Print the number using the interval width as a guide: only the first two digits (neglecting leading zeros) are worth anything.
- When interpreting intervals, don’t put much stock in the last digit. For example, is 245 s/km inside the interval [246, 261] s/km? Not mathematically. But remembering that the last digit in 246 is not to be taken as absolute, 245 is for all practical purposes inside the interval.
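The digit guideline can be applied mechanically. Below is a minimal sketch in base R. The helper `round_interval()` is hypothetical (it is not part of the package used for the report above). It also takes one particular reading of the guideline: keep the endpoints only down to the decimal place of the second digit of the interval width.

```r
# Hypothetical helper: round the endpoints of an interval so that they
# are printed only down to the place of the second digit of the width.
round_interval <- function(lwr, upr) {
  width <- upr - lwr
  stopifnot(width > 0)
  # Place of the width's leading digit: 126 -> 2, 15 -> 1, 0.24 -> -1
  lead <- floor(log10(width))
  # Keep one more place than the leading digit of the width
  digits <- 1 - lead
  c(lwr = round(lwr, digits), upr = round(upr, digits))
}

# Endpoints taken from the report above
round_interval(246, 261)    # distance: width 15, stays 246 to 261
round_interval(2.49, 2.73)  # climb: width 0.24, stays 2.49 to 2.73
round_interval(-533, -407)  # intercept: width 126, becomes -530 to -410
```

Under this reading, the `distance` and `climb` intervals are already printed with an appropriate number of digits, while the intercept interval would lose its last digit.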

As I write (2024-01-11), a news notice appeared on my computer screen from the New York *Times*.

The “Inflation Ticks Higher” in the headline is referring to a change from 3.3% reported in November to 3.4% reported in December. Such reports ought to come with a precision interval. To judge from the small wiggles in the 20-year data, this would be about \(\pm 0.2\)%. A numerical change from 3.3% to 3.4% is, taking the precision into account, no change at all!