Chapter 23 Adjustment

JUST NOTES

  1. Comparing like things
  2. Assembling conditional probabilities into a whole, e.g. survey weighting

More things…

  1. Making meaningful comparisons: stratify, then compare within each stratum.
  2. Add up the comparisons across strata, using an agreed-upon, standard weight for each stratum.
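
The two steps above can be written compactly. This is a sketch in notation assumed here, not taken from the notes: s indexes the strata, \bar{y}_{1s} and \bar{y}_{0s} are the averages of the two groups being compared within stratum s, and w_s are the standard weights.

```latex
\Delta_s = \bar{y}_{1s} - \bar{y}_{0s},
\qquad
\Delta_{\mathrm{adj}} = \sum_s w_s \, \Delta_s,
\qquad
\sum_s w_s = 1 .
```

Because both groups are averaged with the same weights, a difference in how the groups are spread across strata cannot by itself produce a nonzero adjusted difference: if every \Delta_s is zero, so is \Delta_{\mathrm{adj}}.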

23.2 Minimum wage example

See Figure 5.3.

23.3 Comparing to an index

Case mortality from Hill-1937a-WA-III.pdf (perhaps for exercise).

23.4 Mismeasure of man

As Gould (1996, pp. 87-99) describes, 19th-century racist assertions of superiority were tied to measurements of brain size, with the so-called superior races having, on average, the largest brains. There is no justification for associating superior intellect with brain size. People of short stature tend to have smaller brains than people of large stature, and females tend to have smaller brains than males. In assembling the averages by race, no attempt was made to adjust for the stature of the individuals measured. The claimed differences between the races were substantially the product of the samples containing different mixtures of small- and large-statured people.
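
The mixture effect can be seen in a small numerical sketch. All numbers below are invented for illustration and are not measurements from Gould (1996) or any other source; the names `mean_by_stature`, `mix_sample_1`, and `mix_sample_2` are hypothetical. Both samples are given the same mean within each stature class, so any gap between their overall averages comes only from the different mixtures.

```python
# Hypothetical numbers chosen only to illustrate the mixture effect.

# Assumed mean brain volume (cm^3) by stature, identical in both samples.
mean_by_stature = {"short": 1250.0, "tall": 1400.0}

# Assumed share of each stature class in two hypothetical samples.
mix_sample_1 = {"short": 0.8, "tall": 0.2}
mix_sample_2 = {"short": 0.3, "tall": 0.7}


def crude_mean(mix, means):
    """Overall average: stratum means weighted by the sample's own mixture."""
    return sum(mix[k] * means[k] for k in means)


crude_1 = crude_mean(mix_sample_1, mean_by_stature)
crude_2 = crude_mean(mix_sample_2, mean_by_stature)

print(f"crude mean, sample 1: {crude_1:.1f}")
print(f"crude mean, sample 2: {crude_2:.1f}")
print(f"crude difference:     {crude_2 - crude_1:.1f}")
# Within each stature class the two samples have the same mean, so the
# like-for-like difference is zero in every stratum.
```

With these made-up inputs the unadjusted averages differ even though a short person from one sample and a short person from the other are, on average, identical, and likewise for tall people.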

23.5 Age adjustment

A specific example in a context that’s easy to understand: mortality rates among countries.

Mortality rates: adjusting for age

Cancer rates: adjusting for age

See https://ourworldindata.org/causes-of-death for many death-rate stats.
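
One common way to adjust rates for age is direct standardization: apply each country's age-specific rates to a single standard age distribution. The sketch below assumes that approach; the country names, rates, age groups, and population shares are all invented, and `weighted_rate` is a hypothetical helper.

```python
# Hypothetical data: deaths per 1,000 people per year, by age group.
rates = {
    "Country X": {"0-39": 1.0, "40-64": 5.0, "65+": 40.0},
    "Country Y": {"0-39": 1.5, "40-64": 6.0, "65+": 45.0},
}
# Hypothetical share of each country's own population in each age group.
age_shares = {
    "Country X": {"0-39": 0.40, "40-64": 0.35, "65+": 0.25},
    "Country Y": {"0-39": 0.70, "40-64": 0.25, "65+": 0.05},
}
# Agreed-upon standard age distribution, used for every country.
standard = {"0-39": 0.55, "40-64": 0.30, "65+": 0.15}


def weighted_rate(rate_by_age, weights):
    """Average the age-specific rates using the given age weights."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(rate_by_age[age] * weights[age] for age in rate_by_age)


# Crude rate: each country weighted by its own age structure.
crude = {c: weighted_rate(rates[c], age_shares[c]) for c in rates}
# Age-adjusted rate: every country weighted by the same standard.
adjusted = {c: weighted_rate(rates[c], standard) for c in rates}

for c in rates:
    print(f"{c}: crude {crude[c]:.2f}, age-adjusted {adjusted[c]:.2f}")
```

In this invented example Country Y has the higher rate in every age group yet the lower crude rate, because its population is much younger. Putting both countries on the standard age distribution restores the ordering seen within age groups.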

23.6 Adjustment for covariates

23.7 Seasonal adjustment

Separating out known sources of variation from unknown ones.
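
A minimal sketch of the idea, assuming a purely additive month effect and no trend model (production seasonal-adjustment procedures are more elaborate). The monthly series is invented, and the names `series`, `seasonal`, and `adjusted` are hypothetical.

```python
from statistics import mean

# Hypothetical monthly counts for two years (24 values, January first).
series = [
    120, 115, 100, 90, 80, 75, 70, 72, 85, 95, 110, 125,
    126, 119, 106, 94, 86, 79, 76, 76, 91, 99, 116, 129,
]
PERIOD = 12

overall = mean(series)

# Known source of variation: the average level of each calendar month,
# expressed as a deviation from the overall mean.
seasonal = [mean(series[m::PERIOD]) - overall for m in range(PERIOD)]

# Seasonally adjusted series: remove the month effect, keep everything else.
adjusted = [y - seasonal[i % PERIOD] for i, y in enumerate(series)]

print("spread before adjustment:", max(series) - min(series))
print("spread after adjustment: ", max(adjusted) - min(adjusted))
```

What remains in `adjusted` is the variation not explained by the calendar month, here the year-to-year change.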

23.8 Example

Age-adjusted incidence and mortality rates by year of diagnosis: see Tables 4.5 and 4.6 in https://seer.cancer.gov/archive/csr/1975_2014/results_merged/topic_annualrates.pdf

23.9 More details

From conditioning to correlation: p(x, y) = p(y | x) p(x) = p(x | y) p(y)

Use tilde instead of |. So p(x, y) = p(y ~ x) p(x)
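
The factorization above can be checked numerically on a small joint distribution. The table below is invented for illustration, and `marginal` and `conditional` are hypothetical helper names; exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary variables, keyed by (x, y).
joint = {
    (0, 0): Fraction(3, 10), (0, 1): Fraction(1, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}


def marginal(axis, value):
    """p(x = value) when axis is 0, p(y = value) when axis is 1."""
    return sum(p for key, p in joint.items() if key[axis] == value)


def conditional(target_axis, target_value, given_axis, given_value):
    """p(target = target_value | given = given_value)."""
    numerator = sum(
        p for key, p in joint.items()
        if key[target_axis] == target_value and key[given_axis] == given_value
    )
    return numerator / marginal(given_axis, given_value)


for (x, y), p_xy in joint.items():
    via_x = conditional(1, y, 0, x) * marginal(0, x)  # p(y | x) p(x)
    via_y = conditional(0, x, 1, y) * marginal(1, y)  # p(x | y) p(y)
    assert p_xy == via_x == via_y

print("both factorizations reproduce the joint table")
```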

## 
## Call:
## glm(formula = Y ~ A, family = binomial, data = A_data)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9081   0.5944   0.5944   0.7535   0.7535  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  1.11395    0.09379  11.877  < 2e-16 ***
## AA           0.52982    0.16654   3.181  0.00147 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1038.4  on 999  degrees of freedom
## Residual deviance: 1027.9  on 998  degrees of freedom
## AIC: 1031.9
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = Y ~ A + C, family = binomial, data = A_data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.88294   0.09448   0.17776   0.75942   1.25211  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -0.1739     0.1221  -1.424    0.154    
## AA            1.2698     0.1890   6.720 1.82e-11 ***
## CC            4.3138     0.4258  10.130  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1038.42  on 999  degrees of freedom
## Residual deviance:  713.18  on 997  degrees of freedom
## AIC: 719.18
## 
## Number of Fisher Scoring iterations: 7
##   A meanI_lower meanI_upper
## 1 o   0.9373592    1.307905
## 2 A   1.3942770    1.944251
##   A C meanI_lower meanI_upper
## 1 o o  -0.4100024  0.07603221
## 2 A o   0.8123784  1.39481499
## 3 o C   3.4278546  5.63746818
## 4 A C         Inf         Inf
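
The coefficients printed above are on the log-odds scale. As a reading aid, the sketch below converts a few of them to probabilities and odds ratios. The estimates are copied from the printed output; the interpretation assumes that "o" is the reference level of A, as the interval tables suggest, and `inv_logit` is a hypothetical helper name.

```python
import math


def inv_logit(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Estimates copied from the printed output for Y ~ A.
intercept = 1.11395
coef_A = 0.52982

# Fitted probability of Y at the assumed reference level of A, and at level A.
p_baseline = inv_logit(intercept)
p_with_A = inv_logit(intercept + coef_A)

# Odds ratio for A without adjusting for C.
odds_ratio_unadjusted = math.exp(coef_A)

# Estimate copied from the printed output for Y ~ A + C.
coef_A_given_C = 1.2698

# Odds ratio for A after adjusting for C.
odds_ratio_adjusted = math.exp(coef_A_given_C)

print(f"P(Y) at reference level of A: {p_baseline:.3f}")
print(f"P(Y) at level A:              {p_with_A:.3f}")
print(f"odds ratio for A, unadjusted: {odds_ratio_unadjusted:.2f}")
print(f"odds ratio for A, given C:    {odds_ratio_adjusted:.2f}")
```

The odds ratio for A is noticeably larger once C is included in the model, which is the kind of change that adjustment for a covariate is meant to reveal.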

References

Gould, Stephen Jay. 1996. The Mismeasure of Man. Revised and expanded. W.W. Norton.