Instructor Teaching Notes for Lesson 33

Math300Z

Author

Daniel Kaplan

Published

March 8, 2023

Risk

  • What is risk?

Example: Smoking

In Lesson 29 we looked at “life tables” for the US, which give the mortality rate as a function of age, separately for each sex. In Lessons 30 and 31 we considered the increased mortality due to smoking.

CDC information on smoking and lung cancer

Cigarette smoking is the number one risk factor for lung cancer. In the United States, cigarette smoking is linked to about 80% to 90% of lung cancer deaths. Using other tobacco products such as cigars or pipes also increases the risk for lung cancer. Tobacco smoke is a toxic mix of more than 7,000 chemicals. Many are poisons. At least 70 are known to cause cancer in people or animals.

People who smoke cigarettes are 15 to 30 times more likely to get lung cancer or die from lung cancer than people who do not smoke. Even smoking a few cigarettes a day or smoking occasionally increases the risk of lung cancer. The more years a person smokes and the more cigarettes smoked each day, the more risk goes up.

People who quit smoking have a lower risk of lung cancer than if they had continued to smoke, but their risk is higher than the risk for people who never smoked. Quitting smoking at any age can lower the risk of lung cancer.

Take five minutes to discuss this with your group. What do you like or not like about this statement for the purposes of guiding an individual’s action?

Three points from the CDC statement

  1. A statistic like “15 to 30 times more likely” is called a risk ratio. I’m going to assume that [15,30] is a confidence interval.

  2. “Cigarette smoking is linked to about 80% to 90% of lung cancer deaths.” This is called a “population attributable fraction.” This is useful for assigning blame.

  3. “Tobacco smoke is a toxic mix of more than 7,000 chemicals. Many are poisons.” This is neither here nor there. The risk is what matters for decision-making.
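
A population attributable fraction depends on how common the exposure is as well as on the risk ratio. One standard way to relate the two is Levin's formula, \(\text{PAF} = p(RR - 1) / (1 + p(RR - 1))\), where \(p\) is the fraction of the population exposed. The sketch below is illustrative only: the prevalence `p_smoke` is a made-up value, not a figure from the CDC statement, and `attributable_fraction()` is a helper name introduced here.

```r
# Levin's formula for the population attributable fraction (PAF).
# p  : fraction of the population exposed
# rr : risk ratio for the exposed relative to the unexposed
attributable_fraction <- function(p, rr) {
  p * (rr - 1) / (1 + p * (rr - 1))
}

p_smoke  <- 0.15       # ASSUMPTION: hypothetical prevalence, not from the CDC statement
rr_range <- c(15, 30)  # the two ends of the CDC's "15 to 30 times more likely"

paf <- attributable_fraction(p_smoke, rr_range)
round(paf, 2)
```

Changing `p_smoke` changes the PAF even when the risk ratio stays fixed, which is why the PAF describes a population rather than an individual's risk.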

CIs and decision making
  • Basic use for a confidence interval: if you would make a different decision at the two ends of the CI, the decision is not a no-brainer.
    • Statisticians have the luxury of saying, “You need more data to support the decision.”
      • How much more data? The amount needed so that you would make the same decision at either end of the CI.
      • This can be calculated using the \(1/\sqrt{n}\) rule for the width of a CI.
    • Decision-makers have to make decisions with the information currently at hand.
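
The \(1/\sqrt{n}\) rule in the bullets above can be turned into a rough sample-size calculation. This is a minimal sketch, assuming the CI width shrinks exactly in proportion to \(1/\sqrt{n}\); the function name `n_needed()` and the example numbers are hypothetical, not part of the lesson materials.

```r
# If CI width scales as 1/sqrt(n), then
#   width_target / width_now = sqrt(n_now / n_target)
# so n_target = n_now * (width_now / width_target)^2.
n_needed <- function(n_now, width_now, width_target) {
  ceiling(n_now * (width_now / width_target)^2)
}

# ASSUMPTION: a hypothetical study with 400 rows and a CI of width 0.10.
# How many rows would be needed to narrow the CI to a width of 0.05?
n_needed(n_now = 400, width_now = 0.10, width_target = 0.05)
```

Under this rule, halving the width of the interval takes four times as much data, which is why "collect more data" is often not a practical answer for the decision-maker.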

QUESTION: Your 65-year-old uncle Fred is a pack-a-day smoker. You want him to stop. (Good for you!) But Fred says that smoking is his favorite activity. Should a risk ratio of 15 for lung cancer be a compelling argument for stopping?

  • Is “risk ratio” the right measure for making a decision?
  • Is lung cancer the outcome of interest or something else?

Example calculation of risk

Using the Whickham data, calculate the 20-year mortality rate for smokers and for non-smokers. (Warning: these data contain a historical artifact which makes the result unreliable.)

Convert the categorical outcome to a zero-one variable:

Whickham <- Whickham |>
  mutate(dead = zero_one(outcome, one="Dead"))

Model the zero-one response by smoker:

lm(dead ~ smoker, data=Whickham) |> conf_interval()
# A tibble: 2 × 4
  term          .lwr   .coef    .upr
  <chr>        <dbl>   <dbl>   <dbl>
1 (Intercept)  0.282  0.314   0.347 
2 smokerYes   -0.124 -0.0754 -0.0265

  • The baseline 20-year mortality risk (non-smokers, given by the intercept) is about 31%.
  • The 20-year mortality risk for smokers is about 31% - 7% = 24%.

The risk ratio for smokers is 24%/31% = 0.77.

But for an individual person, the absolute risk change, -7 percentage points, is more informative. So why use risk ratios at all?
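
The arithmetic above can be reproduced directly from the `.coef` column of the `conf_interval()` output. The sketch below hard-codes the printed coefficients rather than refitting the model, so it runs without the course packages; the variable names are introduced here for illustration.

```r
# Point estimates copied from the model output above
intercept  <- 0.314    # 20-year mortality risk for non-smokers (baseline)
smoker_yes <- -0.0754  # change in risk associated with smoking

risk_nonsmoker <- intercept
risk_smoker    <- intercept + smoker_yes

abs_change <- risk_smoker - risk_nonsmoker  # absolute risk change
risk_ratio <- risk_smoker / risk_nonsmoker  # risk ratio

round(c(risk_nonsmoker = risk_nonsmoker,
        risk_smoker    = risk_smoker,
        abs_change     = abs_change,
        risk_ratio     = risk_ratio), 3)
```

Working from the unrounded coefficients gives a ratio of about 0.76, slightly different from the 0.77 obtained by dividing the rounded percentages.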

Risk ratios in all-cause mortality

Five-year mortality, death rate per 1000 person-years

Age          # deaths   # at risk   Rate    RR (unadjusted)     RR (adjusted)
65-69        123        1835        13.6    1                   1
70-74        155        1616        19.8    1.46 (1.15-1.85)    1.15 (0.90-1.47)
75-79        172        1061        34.8    2.59 (2.04-3.26)    1.45 (1.13-1.87)
80-84        120        496         54.5    4.08 (3.17-5.25)    1.72 (1.29-2.29)
\(\geq\) 85  76         193         96.6    7.10 (5.52-9.79)    2.56 (1.82-3.61)
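
The unadjusted RR column can be approximately recovered by dividing each age group's rate by the rate for the 65-69 reference group. This is a rough check, assuming the unadjusted RR is a simple ratio of the tabulated rates; the published values may come from a fitted model, so small discrepancies in some rows are to be expected.

```r
# Death rates per 1000 person-years, copied from the table above
rate <- c("65-69" = 13.6, "70-74" = 19.8, "75-79" = 34.8,
          "80-84" = 54.5, ">=85" = 96.6)

# Ratio of each group's rate to the reference (youngest) group
rate_ratio <- rate / rate[["65-69"]]
round(rate_ratio, 2)
```

The adjusted RRs cannot be reproduced this way, since they come from a model that accounts for the other risk factors.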

Risk Factors

Source Level RR (unadjusted) RR (adjusted)
Sex male (1.91-2.63) (1.84-2.98)
Annual Income >50K (0.41-0.73) (0.54-0.98)
Weight < 142M 115F lb 1.00
< 156M 131F (0.48-0.78) (0.67-1.12)
< 172M 145F (0.39-0.64) (0.48-1.12)
< 190M 168F (0.34-0.55) (0.59-0.77)
> 190M 168F (0.37-0.61) (0.43-0.75)
Activity < 67.5 kcal 1.00
< 473 (0.49-0.80) (0.60-1.00)
< 980 (0.43-0.70) (0.63-1.05)
< 1890 (0.23-0.38) (0.55-0.93)
Smoking Never smoked 1.00
<25 pack-years (0.85-1.32) (0.88-1.38)
<50 pack-years (1.04-1.60) (0.90-1.43)
> 50 (1.65-2.58) (1.25-2.00)
Systolic BP < 129 1.00
< 147 (0.68-1.13)
< 153 (0.75-1.27)
< 169 (0.87-1.51)
> 169 (1.17-2.08)
Congestive heart failure yes (1.84-2.52) (1.29-2.16)
Self-assessed health poor (5.20-10.88) (1.27-2.87)

Activity