Chapter 24 DRAFT: Blame

JUST NOTES

  1. What we mean by “blame”
    1. Political and legal sense:
      1. Politics is about the allocation of scarce resources: who gets what. Such decisions are typically made by an established process which, in many settings, is at least somewhat tied to ethical principles, economic theory, etc.
      2. responsible and accountable
      3. legal standards of “proof”
        1. “Proof” just means testing claims (like proofing yeast in bread baking) as opposed to mathematical proof (logical steps from assumptions).
        2. Legal processes are sometimes interested in the establishment of blame in an individual case, as opposed to the beneficial formation of policy.
      4. Examples from law, e.g. each of the people who participate in a crime is treated as equally blameworthy.
    2. Statistical policy formation sense:
      1. We are thinking of taking an action (that will be repeated in the long run) to avoid some problem or make a positive outcome more likely. The primary concern is not the consequences for any single individual but the expected outcomes over the long run.
      2. Analogy: Counting cards at blackjack. Having refined our understanding of the conditional probabilities of various hands, we take an action, e.g. increasing the size of the bet in favorable situations. This doesn’t mean that we are assured of winning any particular hand, but that the long-run frequencies are such that we anticipate coming out ahead over many hands.
      3. The blame assigned to each factor, added up across all factors, can be bigger than 100%. This is because we count the blame for each factor ceteris paribus, holding all the other factors constant. So if X is 60% to blame, another factor Z might still be 50% to blame.
        1. This allows us to identify the factor where policy will have the greatest overall impact per unit cost of the policy.
        2. Formation of an overall policy that has the greatest benefit per cost would be done by mathematical optimization: make a small change, then evaluate again in that context to guide the next small change, and so on.
  2. Importance of causation.
    1. A policy is an intervention that we intend to change outcomes. Establishing the link between intervention and outcome is what causality is all about.
  3. Attributable fraction
  4. Probability of necessity
  5. Probability of sufficiency
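
The point in item 1.2.3 above, that ceteris paribus blame can add up to more than 100%, can be sketched with a small calculation. This is only an illustration under stated assumptions: the multiplicative risk model, the baseline risk, and the two risk multipliers are all hypothetical numbers chosen so that the result matches the 60% and 50% figures in the notes.

```python
# Hypothetical multiplicative risk model. Every number here is an
# illustrative assumption, not data from the chapter.
BASELINE = 0.02   # risk when neither factor is present
RR_X = 2.5        # factor X multiplies the risk by 2.5
RR_Z = 2.0        # factor Z multiplies the risk by 2.0


def risk(x: bool, z: bool) -> float:
    """Risk of the bad outcome for a given combination of factors."""
    return BASELINE * (RR_X if x else 1.0) * (RR_Z if z else 1.0)


def blame(with_factor: float, without_factor: float) -> float:
    """Share of the risk that disappears when one factor is removed
    while the other factor is held constant (ceteris paribus)."""
    return (with_factor - without_factor) / with_factor


both = risk(True, True)
blame_x = blame(both, risk(False, True))  # remove X, keep Z
blame_z = blame(both, risk(True, False))  # remove Z, keep X

print(f"blame for X: {blame_x:.0%}")
print(f"blame for Z: {blame_z:.0%}")
print(f"total:       {blame_x + blame_z:.0%}")
```

Under these assumed numbers, X carries 60% of the blame and Z carries 50%, for a total of 110%. Each figure answers a different "what if only this factor were removed" question, so there is no reason for them to sum to 100%.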

Introduction: Blame is a matter of belief. What we can do here is look for objective criteria for creating a reasonable belief. Some of these techniques come from the artificial intelligence community, who are looking for ways to automate reasoning about belief.

Austin Bradford Hill criteria

Attributable fraction (aka “fraction of attributable risk”) - see https://www.cdc.gov/pcd/issues/2007/jan/06_0091.htm
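
As a rough sketch of what the attributable fraction measures, the two standard textbook formulas can be written as small functions. The function names, the exposure prevalence, and the risk ratio below are hypothetical and chosen only for illustration. The formulas assume the risk ratio is an unconfounded causal effect, which is a strong assumption in practice.

```python
def attributable_fraction_exposed(rr: float) -> float:
    """Fraction of the risk among the exposed that is attributable to
    the exposure, given the risk ratio rr (exposed vs. unexposed)."""
    return (rr - 1.0) / rr


def population_attributable_fraction(p_exposed: float, rr: float) -> float:
    """Levin's formula: fraction of all cases in the population that
    would be avoided if the exposure were removed.
    p_exposed is the proportion of the population that is exposed."""
    excess = p_exposed * (rr - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 30% of the population exposed, risk ratio of 2.5.
af_exposed = attributable_fraction_exposed(2.5)
paf = population_attributable_fraction(0.30, 2.5)

print(f"attributable fraction among the exposed: {af_exposed:.3f}")
print(f"population attributable fraction:        {paf:.3f}")
```

The population figure is smaller than the figure among the exposed because only part of the population is exposed. If everyone were exposed, the two would coincide.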

24.1 From Pearl

  • Probability of necessity
  • Probability of sufficiency
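
A minimal sketch of how these two quantities can be computed, under two strong assumptions that Pearl states for these closed forms: exogeneity (no confounding of the exposure and outcome) and monotonicity (the exposure never prevents the outcome). Without those assumptions the quantities are only bounded, not identified. The function names and the two input probabilities are hypothetical.

```python
def probability_of_necessity(p_y_given_x: float, p_y_given_not_x: float) -> float:
    """PN: among those exposed who had the outcome, the probability the
    outcome would not have occurred without the exposure.
    Valid only under exogeneity and monotonicity."""
    return (p_y_given_x - p_y_given_not_x) / p_y_given_x


def probability_of_sufficiency(p_y_given_x: float, p_y_given_not_x: float) -> float:
    """PS: among those unexposed who did not have the outcome, the
    probability the outcome would have occurred had they been exposed.
    Valid only under exogeneity and monotonicity."""
    return (p_y_given_x - p_y_given_not_x) / (1.0 - p_y_given_not_x)


# Hypothetical risks: 10% with the exposure, 4% without it.
pn = probability_of_necessity(0.10, 0.04)
ps = probability_of_sufficiency(0.10, 0.04)

print(f"probability of necessity:   {pn:.4f}")
print(f"probability of sufficiency: {ps:.4f}")
```

With these assumed inputs the exposure is fairly likely to have been necessary for an observed case, but far from sufficient to produce one. The two probabilities answer different questions and can differ widely.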

24.2 Lessened causation standard of Section 2000e-2(m)

From 42 U.S. Code § 2000e-2 - Unlawful employment practices

  1. Impermissible consideration of race, color, religion, sex, or national origin in employment practices

Except as otherwise provided in this subchapter, an unlawful employment practice is established when the complaining party demonstrates that race, color, religion, sex, or national origin was a motivating factor for any employment practice, even though other factors also motivated the practice.

24.3 Cornfield’s inequality

Saying that there might be a confounder is a common way of discounting causal claims. It’s legitimate enough. But Cornfield’s inequality tells us how strong that confounder needs to be in order to account for the observations. The historical example is smoking and lung cancer.
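
The idea can be sketched numerically. Suppose the exposure has no effect at all and a single binary confounder U drives the outcome. The risk ratio we would then observe between exposed and unexposed depends on how strongly U raises the risk and how unevenly U is distributed between the two groups. The function names and every number below, including the observed risk ratio of 9, are illustrative assumptions rather than figures from the historical studies.

```python
def rr_from_confounding_alone(rr_ud: float, p_u_exposed: float,
                              p_u_unexposed: float) -> float:
    """Risk ratio (exposed vs. unexposed) produced purely by a binary
    confounder U, assuming the exposure itself has no effect.
    rr_ud: risk ratio of the outcome for U present vs. U absent.
    p_u_exposed, p_u_unexposed: prevalence of U in each group."""
    return (1 + (rr_ud - 1) * p_u_exposed) / (1 + (rr_ud - 1) * p_u_unexposed)


def could_explain(observed_rr: float, rr_ud: float,
                  p_u_exposed: float, p_u_unexposed: float) -> bool:
    """True if the hypothesized confounder alone could generate a risk
    ratio at least as large as the one observed."""
    return rr_from_confounding_alone(rr_ud, p_u_exposed, p_u_unexposed) >= observed_rr


OBSERVED_RR = 9.0  # illustrative value, not taken from the chapter

# A moderately strong confounder: triples the risk, 90% vs. 10% prevalence.
weak = rr_from_confounding_alone(3.0, 0.90, 0.10)
# A very strong confounder: 20-fold risk, 90% vs. 5% prevalence.
strong = rr_from_confounding_alone(20.0, 0.90, 0.05)

print(f"moderate confounder yields RR = {weak:.2f}",
      could_explain(OBSERVED_RR, 3.0, 0.90, 0.10))
print(f"strong confounder yields RR   = {strong:.2f}",
      could_explain(OBSERVED_RR, 20.0, 0.90, 0.05))
```

The confounded risk ratio can never exceed either the confounder's own risk ratio or the ratio of its prevalences in the two groups. That is the content of the inequality: to explain away a large observed risk ratio, the confounder must be at least that strongly tied to both the exposure and the outcome.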

Other methods: see https://trang-q-nguyen.weebly.com/uploads/4/6/0/3/46030141/nguyenumdsensitivityanalysis.pdf

24.4 A crime

A valuable jewel is on display in a shop which, over the years, has been burglarized several times. One morning, the jewel is missing. Security video shows a man standing in front of the display case, holding the jewel, just before closing time on the night the jewel went missing. Further investigation reveals the same man in the same place at the same time on a few other days. Did the man steal the jewel?

The data reveal a correlation between the man’s suspicious presence and the jewel’s disappearance. On the other hand, the man’s presence is not necessary for the theft of a jewel; the shop has been burglarized before. And the man’s presence is not sufficient; the man was observed on days on which none of the shop’s stock went missing.

Three days after the heist, the man is seen again in the shop at the same display case just before closing time. Police are called. The man’s identity is discovered. Is there good reason to justify a court giving a search warrant for the man’s apartment? Is there good reason to justify holding the man under arrest while the investigation continues? Is there good reason to try the man for burglary? Is there good reason to find the man guilty of the crime and imprison him?

Each of these actions imposes a cost on the man. Which costs are justified by the blame we can assign to the man for the jewel’s disappearance?

There is certainly evidence that the man stole the jewel. The legal system has different standards for evidence that apply in different situations. In civil cases (e.g. the store sues the man), the standard is “a preponderance of evidence.” The word preponderance means “the quality or fact of being greater in number, quantity, or importance.” But without a numerical measure of the “amount of evidence”, the standard is a metaphor rather than an objective finding. In practice, this is about whether the person applying the standard, judge or jury, thinks there’s more reason to believe the man stole the jewel than the opposite. One can imagine additional evidence becoming available. We find out that the man did web searches on “how to steal jewels.” Perhaps the man took a lock-picking course or a magic course. On the other hand, perhaps the man is writing a crime novel about a jewel theft, and claims the web searches and courses were research for that book.

Another legal standard is evidence “beyond a reasonable doubt.” Again, this doesn’t have an objective meaning. (See https://supreme.justia.com/cases/federal/us/511/1/case.pdf.)

Still another legal standard is “but for” causation. See https://en.wikipedia.org/wiki/Proximate_cause.

  1. Dislike of subjectivity.
    1. Grammar point of view. “I analyzed the data.” “I” is the subject, “the data” is the object. We want the analysis to depend only on the object “the data” and not on the subject “I”. Certainly we want to avoid situations where there is only a subject and no object, as occurs with intransitive verbs as in “I believe.” But what about “The data tell me”? Now the object is “me” and the subject is “the data”.
    2. Subjective in philosophy. See definition at https://en.wikipedia.org/wiki/Subjectivity
      > Subjectivity is a central philosophical concept, related to consciousness, agency, personhood, reality, and truth, which has been variously defined by sources. Three common definitions include that subjectivity is the quality or condition of:
      >
      > * Something being a subject, narrowly meaning an individual who possesses conscious experiences, such as perspectives, feelings, beliefs, and desires. [Let’s add goals to that list; a goal is a kind of a desire.]
      > * Something being a subject, broadly meaning an entity that has agency, meaning that it acts upon or wields power over some other entity (an object). [This is often the object of data science: to act in some way.]
      > * Some information, idea, situation, or physical thing considered true only from the perspective of a subject or subjects. [Traditionally, an opinion. But in data science the goal is important. Something needs to be true to accomplish our task.]
  2. Causation not an issue, so adjustment sets aren’t so important.
  3. Clever experimental design (random assignment/orthogonality) avoids the difficulty of collinearity, i.e. one explanatory variable standing in for another. So we don’t need to deal so much with confounders/covariates. Experimental design is one of the major areas of statistics. Doing experiments is part of the culture of statistics.

24.5 Exercises

“Exercises/fir-give-window.Rmd”