Stats for Data Science
An MAA mini-course at JMM 2020, Daniel Kaplan

Guiding model building

Given a base model and a proposed elaboration of that model, does the elaboration reveal new aspects of the relationship between the response and explanatory variables?

  • Inputs from the model:
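    • \(v_r\) (variance of the response variable) and \(n\) (sample size)
    • \(v_m^{base}\) and \(v_m^{elab}\) (variance of the model values from the base and elaborated models)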
    • degrees of flexibility \(^\circ\!{\cal F}^{base}\) and \(^\circ\!{\cal F}^{elab}\)
  • Output:

    \[\Delta \mbox{F} = \frac{n - (^\circ\!{\cal F}^{elab} + 1)}{^\circ\!{\cal F}^{elab} - ^\circ\!{\cal F}^{base}} \cdot \frac{v_m^{elab} - v_m^{base}}{v_r - v_m^{elab}}\]
  • Interpretation: Is \(\Delta \mbox{F} \gtrapprox 4\)? If so, a relationship is discernible.¹

    • Notes:
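      • A model with \(^\circ{\cal F} = 0\) is called the Null Model. It corresponds to the claim that there is no relationship between the explanatory variables and the response variable. When the base model is the Null Model, \(v_m^{base} = 0\) and \(^\circ\!{\cal F}^{base} = 0\), so \(\Delta \mbox{F}\) reduces to the ordinary \(\mbox{F}\) of the elaborated model: \(\mbox{F} = \Delta \mbox{F}\).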
      • In general, \(\Delta \mbox{F} \neq \mbox{F}^{elab} - \mbox{F}^{base}\): \(\Delta \mbox{F}\) is not the difference between the two models' F values.
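
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not code from the mini-course. The function name delta_f, its argument names, and the numbers are all hypothetical. It assumes that v_r is the variance of the response variable and v_m the variance of a model's values, and it treats the approximate threshold of 4 as a hard cutoff only for the sake of the printout.

```python
def delta_f(n, v_r, v_m_base, v_m_elab, flex_base, flex_elab):
    """Delta-F for a base model versus an elaboration of it.

    n          sample size
    v_r        variance of the response variable (assumed meaning)
    v_m_base   variance of the base model's values (assumed meaning)
    v_m_elab   variance of the elaborated model's values (assumed meaning)
    flex_base  degrees of flexibility of the base model
    flex_elab  degrees of flexibility of the elaborated model
    """
    if flex_elab <= flex_base:
        raise ValueError("the elaboration must add degrees of flexibility")
    if n <= flex_elab + 1:
        raise ValueError("n must exceed flex_elab + 1")
    if v_m_elab >= v_r:
        raise ValueError("model variance must be smaller than response variance")
    flexibility_term = (n - (flex_elab + 1)) / (flex_elab - flex_base)
    variance_term = (v_m_elab - v_m_base) / (v_r - v_m_elab)
    return flexibility_term * variance_term


# Hypothetical inputs, invented for illustration only.
n, v_r = 100, 10.0
v_m_base, flex_base = 3.0, 1
v_m_elab, flex_elab = 4.0, 2

dF = delta_f(n, v_r, v_m_base, v_m_elab, flex_base, flex_elab)

# Ordinary F of each model: compare it with the Null Model,
# whose model values have zero variance and zero flexibility.
F_base = delta_f(n, v_r, 0.0, v_m_base, 0, flex_base)
F_elab = delta_f(n, v_r, 0.0, v_m_elab, 0, flex_elab)

print(f"Delta F         = {dF:.2f}")
print(f"F_elab - F_base = {F_elab - F_base:.2f}")
# The rule of thumb is approximate; >= 4 is a simplification.
print("discernible" if dF >= 4 else "not discernible")
```

With these made-up numbers, \(\Delta\)F is about 16, while \(\mbox{F}^{elab} - \mbox{F}^{base}\) is negative. This illustrates both notes: passing the Null Model as the base recovers an ordinary F, and \(\Delta\)F is not the difference of the two F values.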

Example

An exercise comparing two models



  1. Recall that I’m using “discernible” as a replacement for “significant,” as proposed by Jeff Witmer.
