8 Parameters
The variety of shapes of the nine pattern-book functions means that, often, one or another will be suitable for the modeling situation in hand. But, even if the shape of the function used is appropriate, the pattern still needs to be “adjusted” so that the units of output and input are well matched to the phenomenon being modeled. Let’s consider data from the outbreak of COVID-19 as an example. Figure 8.1 shows, day-by-day, the number of officially confirmed COVID-19 cases as the in the US in March 2020.
During the outbreak, case numbers increased with time. As time went on, the rate of case-number increase itself grew faster and faster. This is the same pattern provided by the exponential function.
Alongside the case-number data Figure 8.1 shows the function \(\text{cases}(t) \equiv e^t\) plotted as a \(\color{magenta}{\text{magenta}}\) curve.
There is an obvious mismatch between the data and the function \(e^t\). Does this mean the COVID pattern is not exponential?
This chapter will introduce how modelers stretch and shift the individual patter-book functions so that they can be used in models of real-world situations such as the outbreak of COVID-19.
8.1 Matching numbers to quantities
The coordinate axes in Figure 8.1 represent quantities. On the horizontal axis is time, measured in days. The vertical axis is denominated in “10000 cases,” meaning that the numbers on the vertical scale should be multiplied by 10000 to get the number of cases.
The exponential function takes as input a pure number and produces an output that is also a pure number. This is true for all the pattern-book functions. Since the graph axes don’t show pure numbers, it is no surprise then that the pattern-book exponential function doesn’t align with the COVID case data.
If we want the input to the model function \(\text{cases}(t)\) to be denominated in days, we will have to convert \(t\) to a pure pure number (e.g. 10, not “10 days”) before the quantity is handed off as the argument to \(\exp()\). We do this by introducing a parameter.
The standard parameterization for the exponential function is \(e^{kt}\). The parameter \(k\) will be a quantity with units of “per-day.” Suppose we set \(k=0.2\) per day. Then \(k\, t{\LARGE\left.\right|}_{t=10 days} = 2\). This “2” is a pure number because the units on the 0.2 (“per day”) and on the 10 (days) cancel out: \[0.2\, \text{day}^{-1} \cdot 10\, \text{days} = 2\ .\] The use of a parameter like \(k\) does more than handle the formality of converting input quantities into pure numbers. Having a choice for \(k\) allows us to stretch or compress the function to align with the data. Figure 8.2 plots the modeling version of the exponential function to the COVID-case data:
8.2 Parallel scales
At the heart of how we use the pattern-book functions to model the relationship between quantities is the idea of conversion between one scale and another. Consider these everyday objects: a thermometer and a ruler.
Each object presents a read-out of what’s being measured—temperature or length—on two different scales. At the same time, the objects provide a way to convert one scale to another.
A function gives the output for any given input. We represent the input value as a position on a number line—which we call an “axis”—and the output as a position on another output line, almost always drawn perpendicular to one another. But the two number lines can just as well be parallel to one another. To evaluate the function, find the input value on the input scale and read off the corresponding output.
We can translate the correspondance between one scale and the other into the form of a straight-line function. For instance, if we know the temperature in Fahrenheit (\(^\circ\)F) and want to convert it to Celsius (\(^\circ C\)) we have the following function: \[C(F) \equiv {\small\frac{5}{9}}(F-32)\ .\] Similarly, converting inches to centimeters can be accomplished with \[\text{cm(inches)} \equiv 2.54 \, (\text{inches}-0)\ .\] Both of these scale conversion functions have the form of the straight-line function, which can be written as \[f(x) \equiv a x + b\ \ \ \text{or, equivalently as}\ \ \ \ f(x) \equiv a(x-x_0)\ ,\] where \(a\), \(b\), and \(x_0\) are parameters.
Section 8.3 will use the \(ax + b\) form of scale conversion to scale the input to pattern-book functions. But we could equally well have used \(a(x-x_0)\).
Section 8.4 introduces a second scale conversion function, for the output from pattern-book functions. That scaling will also be in the form of a straight-line function: \(A x + B\). The use of the lower-case parameter names (\(a\), \(b\)) versus the upper-case parameter names (\(A\), \(B\)) will help us distinguish the two different uses for scale conversion, namely input scaling versus output scaling.
8.3 Input scaling
Figure 8.4 is based on the data frame RI-tide
, a minute-by-minute record of the tide level in Providence, Rhode Island (USA) for the period April 1 to 5, 2010. The level
variable is measured in meters; the hour
variable gives the time of the measurement in hours after midnight at the start of April 1.
The pattern-book \(\sin()\) and the function \(\color{magenta}{\text{level}}\color{blue}{(hour)}\) have similar shapes, so it seems reasonable to model the tide data as a sinusoid. However, the scale of the axes is different on the two graphs.
To model the tide with a sinusoid, we need to modify the sinusoid to change the scale of the input and output. First, let’s look at how to accomplish the input scaling. Specifically, we want the pure-number input \(t\) to the sinusoid be a function of the quantity \(hour\). Our framework for this re-scaling is the straight-line function. We will replace the pattern-book input \(t\) with a function \[t(\color{blue}{hour}) \equiv a\, \color{blue}{hour} + b\ .\]
The challenge is to find values for the parameters \(a\) and \(b\) that will transform the \(\color{blue}{\mathbf{\text{blue}}}\) horizontal axis into the black horizontal axis, like this:
By comparing the two axes, we can estimate that \(\color{blue}{10} \rightarrow 4\) and \(\color{blue}{100} \rightarrow 49\). With these two coordinate points, we can find the straight-line function that turns \(\color{blue}{\mathbf{\text{blue}}}\) into black by plotting the coordinate pairs \((\color{blue}{0},1)\) and \((\color{blue}{100}, 51)\) and finding the straight-line function that connects the points.
You can calculate for yourself that the function that relates \(\color{blue}{\mathbf{\text{blue}}}\) to black is \[t(\color{blue}{time}) = \underbrace{\frac{1}{2}}_a \color{blue}{time} \underbrace{-1\LARGE\strut}_b\]
Replacing the pure number \(t\) as the input to pattern-book \(\sin(t)\) with the transformed \(\frac{1}{2} \color{blue}{time}\) we get a new function: \[g(\color{blue}{time}) \equiv \sin\left(\strut {\small\frac{1}{2}}\color{blue}{time} - 1\right)\ .\] Figure 11.6 plots \(g()\) along with the actual tide data.
8.4 Output scaling
Just as the natural input needs to be scaled before it reaches the pattern-book function, so the output from the pattern-book function needs to be scaled before it presents a result suited for interpreting in the real world.
The overall result of input and output scaling is to tailor the pattern-book function so that it is ready to be used in the real world.
Let’s return to Figure 11.6 which shows that the function \(g(\color{blue}{time})\), which scales the input to the pattern-book sinusoid, has a much better alignment to the tide data. Still, the vertical axes of the two graphs in the figure are not the same.
This is the job for output scaling, which takes the output of \(g(\color{blue}{time})\) (bottom graph) and scales it to match the \(\color{magenta}{level}\) axis on the top graph. That is, we seek to align the black vertical scale with the \(\color{magenta}{\mathbf{\text{magenta}}}\) vertical scale. To do this, we note that the range of the \(g(\color{blue}{time})\) is -1 to 1, whereas the range of the tide-level is about 0.5 to 1.5. The output scaling will take the straight-line form \[{\color{magenta}{\text{level}}}({\color{blue}{time}}) = A\, g({\color{blue}{time}}) + B\] or, in graphical terms
We can figure out parameters \(A\) and \(B\) by finding the straight-line function that connects the coordinate pairs \((-1, \color{magenta}{0.5})\) and \((1, \color{magenta}{1.5})\) as in Figure 8.8.
You can confirm for yourself that the function that does the job is \[{\color{magenta}{\text{level}}} = 0.5 g({\color{blue}{time}}) + 1\ .\]
Putting everything together, that is, scaling both the input to pattern-book \(\sin()\) and the output from pattern-book \(\sin()\), we get
\[{\color{magenta}{\text{level}}}({\color{blue}{time}}) = \underbrace{0.5}_A \sin\left(\underbrace{\small\frac{1}{2}}_a {\color{blue}{time}} \underbrace{-1}_b\right) + \underbrace{1}_B\]
8.5 A procedure for building models
We’ve been using pattern-book functions as the intermediaries between input scaling and output scaling, using this format.
\[f(x) \equiv A e^{ax + b} + B\ .\] We can use the other pattern-book functions—the gaussian, the sigmoid, the logarithm, the power-law functions—in the same way. That is, the basic framework for modeling is this:
\[\text{model}(x) \equiv A\, {g_{pattern\_book}}(ax + b) + B\ ,\] where \(g_{pattern\_book}()\) is one of the pattern-book functions. To construct a basic model, you task has two parts:
Pick the specific pattern-book function whose shape resembles that of the relationship you are trying to model. For instance, we picked \(e^x\) for modeling COVID cases versus time (at the start of the pandemic). We picked \(\sin(x)\) for modeling tide levels versus time.
Find numerical values for the parameters \(A\), \(B\), \(a\), and \(b\). Chapter 11 shows some ways to make this part of the task easier.
It is remarkable that models of a very wide range of real-world relationships between pairs of quantities can be constructed by picking one of a handful of functions, then scaling the input and the output. As we move on to other Blocks in MOSAIC Calculus, you will see how to generalize this to potentially complicated relationships among more than two quantities. That is a big part of the reason you’re studying calculus.
8.6 Other formats for scaling
Often, modelers choose to use input scaling in the form \(a (x - x_0)\) rather than \(a x + b\). The two are completely equivalent when \(x_0 = - b/a\). The choice between the two forms is largely a matter of convention. But almost always the output scaling is written in the format \(A y + B\).
For the COVID case-number data shown in Figure 8.2, we found that a reasonable match to the data can be had by input- and output-scaling the exponential: \[\text{cases}(t) \equiv \underbrace{573}_A e^{\underbrace{0.19}_a\ t}\ .\]
You might wonder why the parameters \(B\) and \(b\) aren’t included in the model. One reason is that cases and the exponential function already have the same range: zero and upwards. So there is no need to shift the output with a parameter B.
Another reason has to do with the algebraic properties of the exponential function. Specifically, \[e^{a x + b}= e^b e^{ax} = {\cal A} e^{ax}\] where \({\cal A} \equiv e^b\).
In the case of exponentials, writing the input scaling in the form \(e^{a(x-x_0)}\) can provide additional insight.
A bit of symbolic manipulation of the model can provide some additional insight. As you know, the properties of exponentials and logarithms are such that \[A e^{at} = e^{\lb(A)} e^{at} = e^{a t + \ln(A)} = e^{a\left(\strut t + \lb(A)/a\right)} = e^{a(t-t_0)}\ ,\] where \[t_0 = - \ln(A)/a = - \ln(593)/0.19 = -33.6\ .\] You can interpret \(t_0\) as the starting point of the pandemic. When \(t = t_0\), the model output is \(e^{k 0} = 1\): the first case. According to the parameters we matched to the data for March, the pandemic’s first case would have happened about 33 days before March 1, which is late January. We know from other sources of information, the outbreak began in late January. It is remarkable that even though the curve was constructed without any data from January or even February, the data from March, translated through the curve-fitting process, pointed to the start of the outbreak. This is a good indication that the exponential form for the model is fundamentally correct.
8.7 Parameterization conventions
There are conventions for the symbols used for input-scaling parameterization of the pattern-book functions. Knowing these conventions makes it easier to read and assimilate mathematical formulas. In several cases, there is more than one conventional option. For instance, the sinusoid has a variety of parameterization forms that get used depending on which feature of the function is easiest to measure. Table 8.1 list several that are frequently used in practice.
Function | Written form | Parameter 1 | Parameter 2 |
---|---|---|---|
Exponential | \(e^{kt}\) | \(k\) | Not used |
Exponential | \(e^{t/\tau}\) | \(\tau\) “time constant” | Not used |
Exponential | \(2^{t/\tau_2}\) | \(\tau_2\) “doubling time” | Not used |
Exponential | \(2^{-\tau_{1/2}}\) | \(-\tau_{1/2}\) “half life” | Not used |
Power-law | \([x - x_0]^p\) | \(x_0\) x-intercept | exponent |
Sinusoid | \(\sin\left(\frac{2 \pi}{P} (t-t_0)\right)\) | \(P\) “period” | \(t_0\) “time shift” |
Sinusoid | \(\sin(\omega t + \phi)\) | \(\omega\) “angular frequency” | \(\phi\) “phase shift” |
Sinusoid | \(\sin(2 \pi \omega t + \phi)\) | \(\omega\) “frequency” | \(\phi\) “phase shift” |
Gaussian | dnorm(x, mean, sd) | “mean” (center) | sd “standard deviation” |
Sigmoid | pnorm(x, mean, sd) | “mean” (center) | sd “standard deviation” |
Straight-line | \(mx + b\) | \(m\) “slope” | \(b\) “y-intercept” |
Straight-line | \(m (x-x_0)\) | \(m\) “slope” | \(x_0\) “center” |