12  Adjustment

The phrase “all other things being equal” is a critical qualifier in describing relationships. To illustrate: a simple claim in economics is that a high price for a commodity reduces demand. For example, increasing the price of gasoline will reduce demand as people avoid unnecessary driving or purchase electric cars. Nevertheless, the claim can be considered obvious only with the qualifier “all other things being equal.” For instance, the fuel price might have increased because a holiday weekend and the attendant vacation travel have increased the demand for gasoline. Thus, higher gasoline prices may be associated with higher demand unless other variables, such as vacation travel, are held constant.

In economics, the Latin equivalent of “all other things being equal” is sometimes used: “ceteris paribus”. The economics claim would be, “higher prices are associated with lower demand, ceteris paribus.”

Although the phrase “all other things being equal” has a logical simplicity, it is impractical to hold “all” other things constant. So instead of the blanket “all other things,” it is helpful to consider just “some other things” to be held constant, being explicit about what those things are. Other phrases along the same lines are “adjusting for …,” “taking into account …,” and “controlling for ….”

We will use the word “adjustment” to name the statistical techniques by which “other things” are taken into account. Those other things, as they appear in data, are called “covariates.”

There are two phases for adjustment, one requiring careful thought and understanding of the specific system under study, the other—the topic of this Lesson—involving only routine, straightforward calculations.

At least, the calculations are straightforward once you know how to construct and interpret statistical models as described in the last few Lessons.

Phase 1: Choose the relevant covariates for adjustment. This almost always requires familiarity with the real-world context. We’ll develop a framework for making such choices based on causal connections in Lesson 23.

Phase 2: After building a model with the covariates from Phase 1 as explanatory variables, do simple calculations. Here, we will focus on the easiest possible calculation: merely looking at the model coefficients.

To keep the presentation of the Phase-2 technique as simple as possible, we will look at settings and models with covariates selected based on common sense, without the careful scrutiny that Phase 1 requires in real work. However, it cannot be overemphasized that the proper choice of covariates is crucial to drawing genuinely useful conclusions.

Indeed, ignoring covariates is a primary source of data-driven nonsense.

Mortality adjustment

Table 12.1: Life expectancy at birth for several countries and territories. (Source)
Country          Female   Male
--------------  -------  -----
Japan              87.6   84.5
Spain              86.2   80.3
Canada             84.7   80.6
United States      80.9   76.0
Gaza               77.2   73.7
Bolivia            74.0   71.0
Russia             78.3   66.9
North Korea        75.9   67.8
Haiti              68.7   63.3
Nigeria            63.3   59.5
Somalia            58.1   53.4

We will start with an example that will be dead familiar to most readers. In debates about health-care policies and environmental conditions we often refer to the statistic of “life expectancy.” Countries with higher life-expectancy are deemed to have better health care or less dangerous conditions such as air pollution, drug and alcohol abuse, smoking rates, vehicle and pedestrian safety, civil strife, etc. Table 12.1 shows male and female life expectancy for a handful of countries as reported.

The reader broadly familiar with country-by-country variation will not be surprised by the numbers in Table 12.1. (Except, perhaps, by the astonishing disparity between females and males in Russia.) There is a roughly 30-year difference in life expectancy between Japan and Somalia; those countries are very different in terms of income, civil strife, etc.

The numbers in Table 12.1 faithfully reflect the overall situation in the different countries. Yet, without adjustment, they are not well suited to inform about specific situations. For example, over many years in my epidemiology course at Macalester College, I asked students to look at such tables and make policy suggestions for how to improve things. Almost always their recommendations involved improving access to health care, especially for the elderly.

Table 12.2: Life expectancy at age 70 (average over 65-74 year olds). (Source: World Health Organization)
Country          Female   Male
--------------  -------  -----
Japan              21.3   17.9
United States      18.3   16.3
Russia             16.2   12.2
Bolivia            13.6   13.0
Haiti              12.9   12.1
Somalia            11.6    9.7

But life expectancy is not only, or even mostly, about old age. Two critical determinants are infant mortality and lethal activities by males in their late teenage and early adult years. If we want to look at conditions among the elderly, we need to consider elderly people, that is, to adjust for the non-elderly sources of mortality. This is routinely done in life-expectancy calculations, but hardly ever reported. Table 12.2 shows, according to the World Health Organization, how many years longer a 70-year-old can expect to live. The 30-year difference between Japan and Somalia seen in Table 12.1 is reduced, for 70-year-olds, to about a decade. The differences between males and females are similarly reduced.

Except in Russia.

A picture of adjustment

“Adjustment” is a statistical method for “taking other things into account.” Learning to take other things into account is a basic component in assembling a basket of skills often called “critical thinking.” Speculating what those “other things” should be is a matter of experience and judgment. That is, reasonable people’s opinions may differ.

Labeling a basket as “critical thinking” does not imply that the contents of the basket are consistent with one another, even if they rightfully belong in the same basket. One critical-thinking skill is noting how a person’s conclusion might be rooted in matters of employment, funding, or social attitudes. Too often, those unfamiliar with statistical adjustment see it as a mathematical ploy to hide such biases. A particularly nefarious form of identity politics attributes any disagreement to bias. The statistician undertaking a careful and honest adjustment in a matter of social controversy should be prepared for ad hominem attacks.

In this section, we will look at a particular setting where “adjustment” is important to drawing proper conclusions: health disparities between urban and rural areas and, particularly, differences in patient mortality between urban and rural hospitals. For most people, particularly urbanites, this is not a pressing matter of social justice. For exactly this reason, it is a good setting for learning about statistical adjustment, since few people have strong pre-conceptions about the issues involved. Insofar as people are flexible in forming opinions on urban/rural disparities, we can draw a picture of “adjustment” without offending anyone.

Notice the use of two different words, “differences” and “disparities.” According to Oxford Languages, a “disparity” is “a difference in level or treatment, especially one that is seen as unfair.” We will not attempt here to determine whether the differences in urban-vs-rural health are unfair. That would require investigating the elements that cause the differences. Starting in Lesson 23 we will take a serious look at techniques for forming responsible conclusions about causality from non-experimental—that is, “observational”—data. Remarkably, traditional statistics texts warn against drawing any conclusion about causality from observational data. This means that they should properly be silent about whether a difference is a disparity.

Differences in urban versus rural death rates are described in a September 2021 report from the US National Center for Health Statistics. [S. Curtin and M. Spencer, “Trends in death rates in urban and rural areas: United States, 1999-2019” link.] Figure 12.1 shows a graphic from that report summarizing 20 years of data on mortality.

Figure 12.1: Overall age-adjusted mortality rates separately for males and females in urban (green) and rural (blue) US counties. Source

Figure 12.1 shows that the age-adjusted death rate (in deaths per year per 100,000 people) is higher for males than for females and, within each sex, higher for those living in rural vs. urban counties. Note that the presented numbers for each year are not just a matter of the “raw facts,” counting up death certificates in each of the four groups—urban females, rural females, urban males, rural males—and dividing by the group populations. Instead, each group’s raw facts have been adjusted for age.

Book-keeping for age adjustment

The age adjustment is accomplished by book-keeping. Instead of an overall raw death rate for each group (in each calendar year), separate death rates are calculated for people dying at each age. We can suppose that there are about 100 such age groups—zero to one year old, one to two years old, etc.—for each of the four groups. These 100 age-specific death rates are multiplied by a fictional age-specific population called the “standard population.” Usually, the standard population is established by an authoritative source and is intended to be close to the overall age-specific population regardless of group. Multiplying the 100 death rates by the 100 populations produces 100 counts of age-specific deaths, one count for each age group. Then add up the age-specific death counts and divide by the total number in the standard population to get the age-adjusted death rate.

The US “standard population” is described here.
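To make the book-keeping concrete, here is a minimal sketch in R using invented numbers: just four age groups instead of roughly 100, with made-up age-specific death rates, group populations, and standard population. The two hypothetical groups, A and B, share identical age-specific rates, but A’s population skews young while B’s skews old.

library(dplyr)
library(tibble)

# Invented age-specific death rates (per 100,000 per year), the age structure
# of two hypothetical groups, and a made-up standard population.
Mortality <- tribble(
  ~age_group, ~rate_per_100k, ~pop_A, ~pop_B, ~std_pop,
  "0-19",               60,    30000,  15000,   26000,
  "20-44",             150,    40000,  25000,   37000,
  "45-64",             700,    20000,  30000,   24000,
  "65+",              4500,    10000,  30000,   13000
)

Mortality |>
  summarize(
    crude_A  = sum(rate_per_100k * pop_A) / sum(pop_A),     # depends on A's age mix
    crude_B  = sum(rate_per_100k * pop_B) / sum(pop_B),     # depends on B's age mix
    adjusted = sum(rate_per_100k * std_pop) / sum(std_pop)  # same standard for both groups
  )

The crude rates for A and B differ even though every age-specific rate is identical; the age-adjusted rate, computed against the shared standard population, is the same for both groups.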

Why go through the trouble of doing separate calculations for each age group when, in the end, the results get summed up to produce the overall result? It’s likely that urban and rural populations have a different age (and sex) structure. For instance, young people move from rural to urban areas at a relatively high rate, meaning that the fraction of the rural population that is young will be less than the fraction for urban areas. Since young people have a lower death rate than old people, the smaller relative population of young people in rural areas would lead to overall death rates being higher in rural areas even if at each age the death rates were the same.

Effectively, the age adjustment of death rates makes irrelevant any theory that attributes the urban-vs-rural mortality differences to the different urban-vs-rural age structures. The rates calculated directly from raw data might display such age-structure dependency, but the adjusted rates do not.

Age adjustment is important in health statistics for two different reasons:

  1. age is such an important factor in determining mortality;
  2. the pattern of increased mortality with age is regarded as “natural” or “inevitable.”

The person who proposes investigating the rural-urban differences in mortality as a consequence of different availability of health care would be regarded as sensibly contributing to possible decisions about how best to set health-care policy. But the person who proposes to reduce rural mortality rates by exporting young urbanites to rural areas is a fool. Such an export policy would (presumably) decrease (raw) mortality rates in the rural districts, but would have absolutely no health benefit to any individual.

Figure 12.2: Age-adjusted death rates from several sources of mortality. Source

With age structure ruled out as contributing to the urban-rural differences in age-adjusted mortality rates, we can focus attention on other factors that might be involved. For instance, might the urban-rural difference be attributable in part to excess cancer rates in rural counties induced by pesticide use? Figure 12.2 shows the age-adjusted death rates for several disease sources of mortality from the “Trends in death rates …” report. There’s no indication that rural-urban differences in age-adjusted cancer death rates differ from those for the other disease sources of death. Knowing this, we can turn our speculation to other theories, presumably ones that operate similarly across disease categories.

The World Health Organization standard population

There is much to be learned by comparing health statistics across countries. For example, among countries with the same level of income, the country with the best health statistics might offer useful pointers for public policy. Of course, meaningful health statistics should be adjusted for age. Adjustment is done by reference to a “standard population.” Figure 12.3 shows the World Health Organization’s standard population. Following the pattern observed in most of the world, younger people predominate. A similar pattern was seen in the US many decades ago, but the US population has changed dramatically and now includes roughly equal numbers of people over a wide span of ages. Even so, the WHO standard population is valuable for comparing US health statistics to those of other countries with very different age distributions.

(a) The WHO standard age distribution

(b) The US age distribution

Figure 12.3: Comparing the World Health Organization’s standard population to the US population in 1972 and 2021. Females are shown in blue, males in green.

Adjustment generally

Considering the differences between urban and rural mortality in many diseases (Figure 12.2), we might speculate that a possible cause is differences in health-care effectiveness. Imagine, in an attempt to gain insight, that we collect hospital-by-hospital patient admission and mortality data for all US hospitals, then compare the rural and urban hospitals.

Common sense suggests that if we found that rural hospitals had, on average, a higher rate than urban hospitals of bad patient outcomes, we would have substantiated our speculation. But the statistical thinker knows that other factors might be playing a role.

The rate here would be bad outcomes per 100 patient admissions.

We’ll illustrate what might happen with a data simulation, hospital_sim. Since this is a simulation, the data will not be informative about real-world hospitals. Even so, the simulation can point to things that might go wrong in the data analysis and what we can do about them. The simulation will be set up so that rural hospitals have better patient outcomes than urban hospitals, a situation that would conflict with our speculation about hospital differences accounting for the differing urban vs rural mortality rates.

R code for a simulation of hospital outcomes.

hospital_sim <- datasim_make(
  # Randomly assign each patient to a rural or urban hospital.
  location ~ bernoulli(1+rnorm(n), labels=c("rural", "urban")),
  # Patients at rural hospitals tend to have lower illness severity.
  severity ~ 1 + rnorm(n) - 2*(location=="rural"),
  # Bad outcomes become more likely with severity; at any given severity,
  # urban hospitals tip the odds toward good outcomes.
  outcome ~ bernoulli(-1 + severity - I(location=="urban"), labels=c("good", "bad"))
)

Let’s collect a data frame of moderate size from the simulation with the unit of observation being a patient.

Patients <- datasim_run(hospital_sim, n=1000)

A few patients from the simulated hospital data.

head(Patients)


location    severity  outcome 
---------  ---------  --------
rural        -0.2700  bad     
rural        -0.1400  bad     
rural         0.0047  good    
urban         2.0000  good    
rural        -0.7300  bad     
urban         1.6000  good    

A statistical model lets us compare outcome rates in the urban vs rural hospitals. (Remember: the data are simulated.) We’ll convert the outcome variable to zero-one form, with 1 standing for a bad outcome.

Patients |>
  mutate(result = zero_one(outcome, one="bad")) |>  # result: 1 for a bad outcome, 0 for good
  model_train(result ~ location) |>                 # unadjusted model: location only
  model_plot() |>
  label_zero_one()

Figure 12.4: Individual 0-1 patient outcomes for the 1000 patients shown with a model of patient outcome for urban vs. rural hospitals.

Remember that when modeling zero-one response variables, the model value is the proportion of ones (in this case, bad outcomes). The graph of the model shows that patient outcomes are worse in urban hospitals.
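As a quick cross-check (a sketch using ordinary data wrangling rather than modeling), the model values can be compared to the raw proportion of bad outcomes in each location:

# Raw proportion of bad outcomes by location; these should match the
# model values plotted in Figure 12.4.
Patients |>
  group_by(location) |>
  summarize(prop_bad = mean(outcome == "bad"))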

In our simulated world at least, the result indicates that hospitals don’t account for the differences in urban vs. rural mortality, since rural hospitals have better outcomes.

The locationurban coefficient, 1.77, is unadjusted. The simulated data (shown in the preview table above) don’t include the age of each patient, so we can’t adjust for age differences between the patients. Happily, the data record a severity index for each patient; this might be an even better adjustment variable than age.
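To see that coefficient for yourself, look at the unadjusted model with conf_interval(), as used elsewhere in these Lessons. (A sketch; the exact value will vary with the random draw from the simulation.)

# Coefficients of the unadjusted model result ~ location
Patients |>
  mutate(result = zero_one(outcome, one="bad")) |>
  model_train(result ~ location) |>
  conf_interval()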

First, let’s confirm that severity has something to say about outcomes. Figure 12.5 shows that the proportion of bad outcomes increases with the patient’s severity.

Patients |>
  mutate(result = zero_one(outcome, one="bad")) |>
  model_train(result ~ severity) |>
  model_plot() |>
  label_zero_one()

Figure 12.5: Individual 0-1 patient outcomes for the 1000 patients shown with a model of patient outcome versus severity.

If the patients’ illness severities average out the same in urban and rural hospitals, the outcome results are already effectively adjusted. Figure 12.6 checks the data to see whether there are differences in severity.

Patients |>
  model_train(severity ~ location) |>
  model_plot()

Figure 12.6: Severity versus urban/rural location in the simulated data. Urban hospitals have a higher average patient severity than rural hospitals.

To summarize the (simulated) situation up to this point in the analysis: urban hospitals have a higher patient mortality than rural hospitals. But urban hospitals also have a worse (higher) severity index than rural ones.

How can we take into account (“adjust for”) the worse severity seen at urban hospitals in order to compare rural vs. urban in a fair way? The answer is remarkably simple, once you know how to train models! We make a model of outcome that includes both location and severity as explanatory variables. See Figure 12.7.

Patients |>
  mutate(result = zero_one(outcome, one="bad")) |>
  model_train(result ~ severity + location) |>
  model_plot(data_alpha=0.1) |>
  label_zero_one()

Figure 12.7: Modeling patient outcomes by both location and severity. At any level of severity, the model value for outcome is worse for rural hospitals than for urban hospitals.

The adjusted result is seen by comparing the model output values. Notice that the model values (curves) show that the outcome is worse for rural than for urban hospitals at any given level of severity. [You may be able to see the same result directly from the simulated data: at severities near 0, the rural hospitals have a much bigger fraction of patients with bad outcomes.]

In adjusting the result, we choose a common, “standard” level of severity to display both the rural and urban model outcomes. It doesn’t really matter what level we choose for the “standard,” but it is sensible to choose a level that’s reflective of the data as a whole. So a good standard for comparison would be at severity zero.

Students with better study habits will actually mark the points as directed, then trace each point as it moves along its curve to the selected standard for severity. Instructors will want to demonstrate this to the class to help those students who aren’t sure what it means to “move a point along its curve.”

How could …?

Figure 12.7 shows that (the simulated) urban hospitals have better outcomes at all levels of patient severity. So how could it be that, not adjusting for severity, rural hospitals show better outcomes? To see why, refer to Figure 12.6 and note that the average severity for rural hospitals is about -1, while for urban hospitals it is about 1.

Now turn to Figure 12.7. Mark the point on the model curve of rural hospitals corresponding to the average severity of -1. Similarly, mark the point on the model curve of urban hospitals corresponding to an average severity of 1. The urban point falls higher on the vertical axis than the rural point, despite the urban curve being lower than the rural curve.

Adjustment effectively moves both points along their respective curves to a “standard” level of severity, say 0. The adjusted urban point is now lower than the adjusted rural point.
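The same movement can be carried out numerically. The sketch below assumes the package provides a model_eval() function for evaluating a trained model at specified values of the explanatory variables; exact outputs will vary with the random draw of the simulation.

# The model underlying Figure 12.7
mod <- Patients |>
  mutate(result = zero_one(outcome, one="bad")) |>
  model_train(result ~ severity + location)

# The unadjusted "points": each location at its own typical severity
mod |> model_eval(severity = -1, location = "rural")
mod |> model_eval(severity =  1, location = "urban")

# The adjusted comparison: both locations at the standard severity of 0
mod |> model_eval(severity = 0, location = c("rural", "urban"))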

In general, to adjust a response variable for “other factors,” follow this procedure:

  1. Build a statistical model with the given response variable using the explanatory variable of particular interest to you. (In the above, that variable was location.) Also include as explanatory variables all the “other factors.” (In the above, there was only one “other factor”: severity.)
  2. Select a standard value for each of the “other factors.” This is usually some value that is typical for each variable looking at the data frame as a whole. For instance, the standard might reasonably be set to a round number near the mean for each “other factor,” considered one at a time.
  3. Evaluate the model at the standard value for each of the “other factors,” but at two values for the explanatory variable of particular interest to you. The difference between the two model outputs is the adjusted difference for the explanatory variable of interest.
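In code, the recipe amounts to the following template. This is only a sketch: MyData, response, interest_var, and covariate are hypothetical placeholder names, and model_eval() is assumed to be the package’s function for evaluating a trained model at chosen inputs.

# Step 1: model the response with the variable of interest plus the covariate(s).
mod <- MyData |> model_train(response ~ interest_var + covariate)

# Steps 2 and 3: evaluate at a standard covariate value (placeholder: 0), varying
# only the variable of interest. The difference between the two model outputs is
# the adjusted difference.
mod |> model_eval(covariate = 0, interest_var = c("level_A", "level_B"))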

In Lesson 21 we will see how each model coefficient is itself automatically adjusted for all the other explanatory variables in the model.

Exercises

Exercise 11.2

Knives and forks example from p. 147 in Milo’s book.

IN DRAFT.

Examples of adjustment using the method described at the end of the last section.

Participation-adjusted school performance. Something is not working here. You’ll need to take spending into account

SAT |> model_train(sat ~ frac + expend) |> conf_interval()
term                 .lwr        .coef         .upr
------------  -----------  -----------  -----------
(Intercept)     6.8575490    6.9003135    6.9431259
frac           -0.0033956   -0.0029742   -0.0025530
expend          0.0044303    0.0127242    0.0209991
SAT |> select(state, sat, frac, expend) |>
  mutate(adj_sat = sat - 0.00297*(50-frac) + 0.0127*(6 - expend))
state              sat   frac   expend     adj_sat
---------------  -----  -----  -------  ----------
Alabama           1029      8    4.405   1028.8955
Alaska             934     47    8.963    933.9535
Arizona            944     27    4.778    943.9472
Arkansas          1005      6    4.459   1004.8889
California         902     45    4.992    901.9980
Colorado           980     29    5.443    979.9447
Connecticut        908     81    8.817    908.0563
Delaware           897     68    7.030    897.0404
Florida            889     48    5.718    888.9976
Georgia            854     65    5.193    854.0548
Hawaii             889     57    6.078    889.0198
Idaho              979     15    4.210    978.9188
Illinois          1048     13    6.136   1047.8884
Indiana            882     58    5.826    882.0260
Iowa              1099      5    5.483   1098.8729
Kansas            1060      9    5.817   1059.8806
Kentucky           999     11    5.217    998.8941
Louisiana         1021      9    4.761   1020.8940
Maine              896     68    6.428    896.0480
Maryland           909     64    7.245    909.0258
Massachusetts      907     80    7.287    907.0728
Michigan          1033     11    6.994   1032.8715
Minnesota         1085      9    6.000   1084.8782
Mississippi       1036      4    4.080   1035.8878
Missouri          1045      9    5.383   1044.8861
Montana           1009     21    5.692   1008.9178
Nebraska          1050      9    5.935   1049.8791
Nevada             917     30    5.160    916.9513
New Hampshire      935     70    5.859    935.0612
New Jersey         898     70    9.774    898.0115
New Mexico        1015     11    4.586   1014.9021
New York           892     74    9.623    892.0253
North Carolina     865     60    5.077    865.0414
North Dakota      1107      5    4.775   1106.8819
Ohio               975     23    6.162    974.9178
Oklahoma          1027      9    4.845   1026.8929
Oregon             947     51    6.436    946.9974
Pennsylvania       880     70    7.109    880.0453
Rhode Island       888     70    7.469    888.0407
South Carolina     844     58    4.797    844.0390
South Dakota      1068      5    4.775   1067.8819
Tennessee         1040     12    4.388   1039.9076
Texas              893     47    5.222    893.0010
Utah              1076      4    3.656   1075.8931
Vermont            901     68    6.750    901.0439
Virginia           896     65    5.327    896.0531
Washington         937     48    5.906    936.9953
West Virginia      932     17    6.107    931.9006
Wisconsin         1073      9    6.930   1072.8664
Wyoming           1001     10    6.160   1000.8792

Examples of adjustment using the method described at the end of the last section.

IN DRAFT.

Age adjustment in Whickham.

Class activity