`Penguins |> model_train(mass ~ sex + flipper) |> conf_interval()`

term | .lwr | .coef | .upr |
---|---|---|---|
(Intercept) | -5970.0 | -5410 | -4850.0 |
sexmale | 268.0 | 348 | 427.0 |
flipper | 44.1 | 47 | 49.8 |

Our choice of explanatory variables sets the type of signal we are looking for. In the 1940 news report from France, the signal of interest is human speech; our ears and brains automatically separate the signal from the noise. But suppose we were interested in another kind of signal, say a generator humming in the background or the dots and dashes of a spy’s Morse Code signal. We would need a different sort of filtering to pull out the generator signal, and the speech and dots and dashes (and anything else) would be noise. Identifying the dots and dashes calls for still another kind of filtering.

The same is true for the penguins. If we look for a different type of signal, say body mass as a function of the bill shape, we get utterly different coefficients:

```
Penguins |>
  model_train(mass ~ bill_length + bill_depth) |>
  conf_interval()
```

term | .lwr | .coef | .upr |
---|---|---|---|
(Intercept) | 2550.0 | 3410.0 | 4270.0 |
bill_length | 62.9 | 74.8 | 86.8 |
bill_depth | -179.0 | -146.0 | -112.0 |

Given the type of signal we seek to find, and the model coefficients for that type of signal, we are in a position to make a claim about what is the signal and what is the noise in an individual penguin’s body mass. Simply evaluate the model for that penguin’s values of the explanatory variables to get the signal. What’s left over, the **residuals**, is the noise.

To illustrate, let’s look for the `sex` & `flipper` signal in the penguins:

```
With_signal <-
  Penguins |>
  mutate(signal = model_values(mass ~ sex + flipper),
         residuals = mass - signal)
```
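For a single penguin, the split into signal and residual can be worked out by hand. The sketch below plugs the rounded `.coef` values from the `mass ~ sex + flipper` table into the model formula. The penguin itself (`is_male`, `flipper`, `mass`) is hypothetical, made up for illustration and not a row from `Penguins`, and because the coefficients are rounded the result will only approximate what `model_values()` reports.

```r
# Point estimates (.coef column) from the mass ~ sex + flipper table
intercept    <- -5410
coef_sexmale <- 348
coef_flipper <- 47

# A hypothetical penguin: invented values, not taken from the data
is_male <- 1      # 1 for male, 0 for female
flipper <- 200    # flipper length
mass    <- 4500   # measured body mass

# Signal: the model evaluated at this penguin's explanatory variables
signal <- intercept + coef_sexmale * is_male + coef_flipper * flipper

# Noise: whatever part of the measurement the signal does not account for
residual <- mass - signal

signal    # 4338
residual  # 162
```

A positive residual means this imagined penguin is heavier than the model says is typical for a male with that flipper length.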

It’s time to point out something special about the residuals; there is no pattern component in the residuals. We can see that by modeling the residuals with the explanatory variables used to define the pattern:

```
With_signal |>
  model_train(residuals ~ sex + flipper) |>
  conf_interval()
```

term | .lwr | .coef | .upr |
---|---|---|---|
(Intercept) | -562.00 | 0 | 562.00 |
sexmale | -79.40 | 0 | 79.40 |
flipper | -2.84 | 0 | 2.84 |

The coefficients are zero! This means that the residuals do not show any sign of the pattern—everything about the pattern is contained in the signal!
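The zero coefficients are not a quirk of the penguin data. The sketch below repeats the experiment in base R, using `lm()` in place of `model_train()` and simulated measurements in place of `Penguins`. The sample size, seed, and generating coefficients are all arbitrary assumptions chosen only to resemble the example.

```r
set.seed(101)   # arbitrary seed so the simulation is repeatable
n <- 200

# Simulated stand-ins for the explanatory variables (not real penguins)
flipper <- rnorm(n, mean = 200, sd = 14)
sex     <- factor(sample(c("female", "male"), n, replace = TRUE))

# Simulated response: a sex + flipper pattern plus random noise
mass <- -5400 + 350 * (sex == "male") + 47 * flipper + rnorm(n, sd = 350)

# Extract the signal with a least-squares fit, then form the residuals
fit    <- lm(mass ~ sex + flipper)
signal <- fitted(fit)
noise  <- mass - signal

# Model the residuals with the same explanatory variables
refit <- lm(noise ~ sex + flipper)
max(abs(coef(refit)))   # zero, up to floating-point rounding
```

The refitted coefficients come out as tiny numbers rather than exact zeros because of rounding in the arithmetic, not because any pattern remains.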

A right triangle provides an excellent way to look at the relationship among the signal, residuals, and the response variable. We just saw that the residuals have nothing in common with the signal. This is much like the two legs of a right triangle; they point in utterly different directions!

For any triangle, any two sides add up to meet the third side. This is much like the response variable being the sum of the signal and the residuals. A right triangle has an additional property: the sum of the squared lengths of the two legs gives the squared length of the hypotenuse. For the penguin example, we can confirm this Pythagorean property when we use the **variance** to measure the “amount of” each component.

```
With_signal |>
  summarize(var(mass),
            var(signal) + var(residuals))
```

var(mass) | var(signal) + var(residuals) |
---|---|
648370 | 648370 |
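Why do the variances add so neatly? In general, the variance of a sum equals the sum of the variances plus twice the covariance of the two parts. For a least-squares model with an intercept, the covariance between signal and residuals is zero, which is the statistical counterpart of the right angle. The sketch below checks both facts in base R on `mtcars`, a data set that ships with R. The choice of `mpg ~ wt + hp` is an arbitrary assumption, used only to show that the property is not special to penguins.

```r
# Fit a least-squares model to a data set bundled with base R
fit    <- lm(mpg ~ wt + hp, data = mtcars)
signal <- fitted(fit)
noise  <- mtcars$mpg - signal

# The "right angle": signal and residuals are uncorrelated
cov(signal, noise)   # zero, up to floating-point rounding

# The Pythagorean property for variances
all.equal(var(mtcars$mpg), var(signal) + var(noise))   # TRUE
```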

Engineers often speak of the signal-to-noise ratio (SNR). In sound, this refers to the loudness of the signal compared to the loudness of the noise. For sound, the signal-to-noise ratio is often measured in decibels (dB). An SNR of 5 dB means that the signal carries about three times the power of the noise.

You can listen to examples of noisy music and speech at this web site, part of which looks like this:

Press the links in the “Noisy” column. The noisiest examples have an SNR of 5 dB. Press the play/pause button to hear the noisy recording, then compare it to the de-noised transmission—the signal—by pressing play/pause in the “Clean” column.

It’s easy to calculate the signal-to-noise ratio in a model pattern; divide the amount of signal by the amount of noise:

```
With_signal |>
  summarize(var(signal) / var(residuals))
```

var(signal)/var(residuals) |
---|
4.2 |

The signal is about four times larger than the noise. Converted to the engineering units of decibels, this is 6.2 dB. You can get a sense for what this means by listening to the 5 dB recordings and judging how clearly you can hear the signal.
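The conversion between a variance ratio and decibels is a base-10 logarithm scaled by ten. The sketch below shows it in both directions. The helper names `to_db()` and `from_db()` are hypothetical, introduced here for illustration, and are not functions from any package used in this lesson.

```r
# Convert a power-like ratio (such as a ratio of variances) to decibels
to_db <- function(ratio) 10 * log10(ratio)

# Convert decibels back to a plain ratio
from_db <- function(db) 10^(db / 10)

to_db(4.2)   # about 6.2 dB, the penguin model's SNR
from_db(5)   # about 3.2, the ratio behind the 5 dB recordings
```

On this scale the `sex` & `flipper` signal in penguin body mass is only slightly clearer than the noisiest recordings.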