# Important word pairs

Many of the vocabulary terms used in statistical thinking come in pairs. We list several such pairs below, in roughly the order they first appear in the Lessons. The list can serve as a reference while reading, but it is also helpful to return to it to sharpen your understanding of the distinctions.

**Explanatory** vs **response** variables. Models (in these Lessons) always involve a *single* response variable. In contrast, models can have zero or more explanatory variables.

**Variable** vs **covariate**. “Covariate” is another word for an explanatory variable. The word “covariate” signals that the variable is not itself of direct interest to the modeler but is included to put another explanatory variable in its proper context.

**Categorical** vs **quantitative** variables. Always be aware of whether a model’s response variable is categorical or quantitative. When categorical, expect to use `zero_one()` to convert it to quantitative before modeling. In contrast, explanatory variables can be either categorical or quantitative.
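As a rough illustration of the zero-one idea, the sketch below uses only base R rather than `zero_one()` itself, whose arguments are not described in this section. The `health` data and the choice of "sick" as the level coded 1 are hypothetical.

```r
# Hypothetical data: a categorical variable with two levels.
health <- c("sick", "healthy", "healthy", "sick", "healthy")

# Base-R stand-in for a zero-one conversion (an assumption, not zero_one()):
# 1 marks "sick", 0 marks "healthy".
sick01 <- as.numeric(health == "sick")

sick01        # a quantitative variable, ready to serve as a response
mean(sick01)  # the mean of a zero-one variable is the proportion of ones
```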

**Regression model** vs **classifier**. A regression model always has a *quantitative* response variable. A classifier has a *categorical* response variable. In these Lessons, as in much professional use of data, our categorical response variables will have *two levels* (e.g., healthy or sick, up or down, yes or no). In this situation, regression techniques suffice to build classifiers.
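One way to see how regression can stand in for a classifier with a two-level response is sketched below. The data frame `d`, its variable names, and the 0.5 cutoff used to turn model output into a label are all assumptions made for illustration.

```r
# Hypothetical data: a two-level response and a two-level explanatory variable.
d <- data.frame(
  outcome = c("sick", "sick", "healthy", "sick",
              "healthy", "healthy", "healthy", "sick"),
  exposed = c("yes", "yes", "yes", "yes", "no", "no", "no", "no")
)

# Convert the categorical response to zero-one (1 = "sick").
d$sick01 <- as.numeric(d$outcome == "sick")

# An ordinary regression model fitted to the zero-one response.
mod <- lm(sick01 ~ exposed, data = d)

# The model output is the proportion of "sick" within each group.
prop <- predict(mod, newdata = data.frame(exposed = c("no", "yes")))

# Assumed decision rule: label as "sick" when the output exceeds 0.5.
label <- ifelse(prop > 0.5, "sick", "healthy")

prop
label
```

The regression output is a number between the two levels; the classification step is a separate choice layered on top of it.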

**Model** vs **model function**. By “model,” we will almost always mean “regression model.” A regression model, typically constructed by the `lm()` function, contains various information useful for summarizing the model. The “model function” provides the mechanism for one important task: calculating the model output that corresponds to given values of the explanatory variables.
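The distinction can be sketched in base R as follows. The data are hypothetical, and `model_fun` is a helper name introduced here for illustration; it simply wraps `predict()`.

```r
# Hypothetical data for a simple regression.
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1))

# The "model": an object holding coefficients, residuals, and more.
mod <- lm(y ~ x, data = d)
class(mod)
names(mod)

# The "model function": takes values of the explanatory variable
# and returns the corresponding model output.
model_fun <- function(x) {
  unname(predict(mod, newdata = data.frame(x = x)))
}

model_fun(c(2.5, 6))
```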

**Model coefficient** vs **effect size**. Model coefficients are numerical parameters. Training determines the appropriate values for the coefficients. In contrast, an effect size describes the relationship between the response variable and a selected explanatory variable.
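A small sketch may help separate the two ideas. It assumes one common way of quantifying an effect size, namely the change in model output per unit change in the selected explanatory variable, and uses hypothetical data with a curved relationship so that the effect size is not simply one of the coefficients. The helper `effect_size` and the step `h` are names introduced here for illustration.

```r
# Hypothetical data with a curved relationship between x and y.
d <- data.frame(x = 0:5,
                y = c(1.1, 3.4, 7.1, 11.4, 17.2, 23.4))

# Training determines the coefficients (intercept, x, and x^2 terms).
mod <- lm(y ~ x + I(x^2), data = d)
coef(mod)

# Assumed definition: effect size of x at x0 is the change in model
# output divided by a small change h in x.
effect_size <- function(x0, h = 0.1) {
  out <- predict(mod, newdata = data.frame(x = c(x0, x0 + h)))
  unname((out[2] - out[1]) / h)
}

effect_size(1)  # effect size of x near x = 1
effect_size(4)  # a different value near x = 4, from the same coefficients
```

In a straight-line model such as `y ~ x`, this calculation would return the `x` coefficient at every `x0`, which is why the two terms are easy to confuse.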

**Point estimate** vs **interval estimate**. A point estimate is a single number. For instance, a model coefficient is a point estimate, as is the output from a model function. In contrast, interval estimates involve *two* numbers; one specifies the lower end of the interval and the other number specifies the upper end.
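For a concrete contrast, the sketch below pulls both kinds of estimate for the same coefficient from a model fitted to hypothetical data. The 95% level is an assumption chosen for the example.

```r
# Hypothetical data for a simple regression.
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2.3, 3.8, 6.1, 8.2, 9.7, 12.4))
mod <- lm(y ~ x, data = d)

# Point estimate: a single number for the x coefficient.
point <- coef(mod)["x"]

# Interval estimate: two numbers, the lower and upper ends.
interval <- confint(mod, "x", level = 0.95)

point
interval
```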

**Prediction interval** vs **confidence interval**. A prediction interval describes the anticipated range of the actual result for which we have made a prediction, e.g., “tomorrow’s wind will be between 5 and 10 mph.” A **confidence interval** is often used to express the uncertainty in a coefficient or effect size.
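The following sketch shows where each kind of interval comes from in base R. The wind data are made up, and the variable names `today` and `tomorrow` are hypothetical. The confidence interval on the model value at the same input is added only for comparison; it is not mentioned in the text above.

```r
# Hypothetical data: wind speed (mph) on one day and on the next.
d <- data.frame(today    = c(4, 6, 7, 9, 11, 12, 14),
                tomorrow = c(5, 5, 8, 8, 12, 10, 15))
mod <- lm(tomorrow ~ today, data = d)

new <- data.frame(today = 8)

# Prediction interval: anticipated range for the actual result.
pred_int <- predict(mod, newdata = new, interval = "prediction", level = 0.95)

# Confidence interval: uncertainty in a coefficient.
coef_int <- confint(mod, "today", level = 0.95)

# For comparison: confidence interval on the model value at the same input.
# It is narrower than the prediction interval.
value_int <- predict(mod, newdata = new, interval = "confidence", level = 0.95)

pred_int
coef_int
value_int
```

The prediction interval refers to an individual outcome, while the confidence intervals refer to quantities estimated by the model.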