t | x | y |
---|---|---|
1 | 496 | 191 |
2 | 1037 | 138 |
3 | 1251 | 191 |
... 16 rows in total ... | ||
15 | 928 | 432 |
16 | 737 | 240 |
49 Data-driven functions
As early as Chapter 3 of this book, we noted that a function can be described as a table where each row stands for one set of input values together with a corresponding output value. We did not, however, make much use of the table-as-function concept. Instead, we used data tables to motivate the choice of parameters, as in linear combinations of the basic modeling functions. We called this function fitting: constructing a function that stays near the data values. (We will say more about function fitting in Block 5 where we introduce new tools for treating functions as geometrical objects.)
This chapter introduces yet another important method for constructing functions that match with data. What’s different here is that each data point will be a mandate; the function is required to go through each and every data point exactly.
49.1 Generating smooth motion
As a motivating example, consider the programming of robotic arms of the sort displayed in Figure 49.1:
Since this isn’t a robots course, we will simplify. The arm has a resting position. When a car frame comes into place, the arm moves so that its welding electrodes are at a specific, known place in space near the car body. Then it moves in sequence to other places where a weld is required, perhaps passing through waypoints to avoid obstacles.
The problem of converting the discrete list of weld and waypoints into a continuous signal for the actuator is an instance of a mathematical process called interpolation. In real robot arms, there are multiple joints that need to be controlled simultaneously. For our illustration, we will use a simple setup (see Figure 49.2) where the robot hand rolls along a set of rails in the y-direction and another x-rail running crosswise to the y direction.
The task for our example robot will be to visit the points shown in Figure 49.3 in order, taking 15 seconds to traverse the whole path.
Figure 49.3 shows a continuous path in \((x,y)\) coordinates together with discrete labels indicating when each waypoint is to be reached. Note that the path is not a function \(y(x)\). Mathematical functions are required to be single valued, meaning that for each value of the input (in the function domain) there can be only one, unique output value. The path in Figure 49.3 often involves two or more different \(y\) values for a single \(x\) value. There is even a small domain of \(x\) near \(x=900\) where the path at any given \(x\) crosses six different \(y\)-values.
Even so, functions can be a useful way of describing the \((x,y)\) path. The key is the plural: functions. For the path in Figure 49.3 we need two quantities varying with time in a coordinated way. One approach, familiar to navigators, is to specify direction of movement and velocity at each instant of time. Perhaps not as familiar, but more fundamental mathematically, is to specify \(x\) as a function of time and, separately, \(y\) as a function of time. Using this formalism, the trajectory of the robot arm will consist of two functions, \(x(t)\) and \(y(t)\). To build those functions, we start with the waypoints stored in the data frame Robot_stations
.
The data frame `Robot_stations”
The \(x(t)\) and \(y(t)\) functions in this table aren’t complete enough to operate the robot. We need to provide the \(x,y\)-location data in the form of two continuous functions of \(t\) so that the robot, at any time \(t\), can look up where it is supposed to be, what its velocity should be, and how that velocity should be changing in time (acceleration).
One strategy is to construct the functions as piecewise linear functions of \(t\), like this:
It can be difficult at first glance to see the relationship between the \(x(t)\) and \(y(t)\) functions and the path shown in Figure 49.3. As an exercise, look specifically at the segment \(9 \leq t \leq 10\). In Figure 49.4, this is the segment connecting points 9 and 10. In the path view, you can see that on this segment \(x\) changes a lot while \(y\) changes only a little. Correspondingly, in the function view, \(\partial_t x(t)\) is large in magnitude compared to \(\partial_t y(t)\).
Each functions in Figure 49.4 is an interpolating function. You’re entitled to think of the \(x(t)\) function as connecting with lines the sequence of \(x\) versus \(t\) coordinates from the table and similarly for \(y(t)\). Each of the two functions is continuous. But, based on your work in Blocks 1 through 3, you have a richer set of concepts for interpreting those two functions.
For instance, let’s look at \(\partial_t y(t)\). Since \(y\) is a position along the cross rail, \(\partial_t y(t)\) is the velocity in that direction. Figure 49.5 shows the velocity versus time for both the \(x\) and \(y\) components of the movement.
The speed of the robot arm maxes out at about 600 mm-per-second. You can get a sense for this by moving your finger two feet in 1 second: a normal human speed of movement.
Since the original \(x(t)\) and \(y(t)\) functions are piecewise linear, it makes sense that the derivatives with respect to time are piecewise constant. But the robot hand is a physical thing; it has to have a velocity at every instant in time. It cannot instantaneously have an undefined velocity.
Think about what it is that causes the change from one velocity step to another. There is a motor that is spinning and changing its rate of spin, perhaps using a pulley and a belt to move the robot hand to the right position at any instant of time. Changing the velocity requires a force to create an acceleration. We can differentiate the velocity to see what the acceleration must be to create the simple piecewise linear function shown in Figure 49.4.
Mathematically, the second derivatives \(\partial_{tt} x(t)\) and \(\partial_{tt} y(t)\) do not exist, because \(\partial_{t} x(t)\) and \(\partial_{t} y(t)\) are discontinuous. There is no physical amount of force that will change the velocity in an instant.
As an accommodation to the physical existence of the robot hand, we’ve softened the transition between consecutive velocity segments to allow it to take 0.2 seconds, ramping up from zero force 0.1 second before the hand reaches the station, to maximum force at the station, then back down to zero 0.1 second after the hand reaches the station. Consequently the actual motion is smoother and the maximum acceleration is about half that of gravity. Figure 49.7 shows the resulting trajectory which can be likened to that of a baseball player rounding a base.
A consequence of smoothing the trajectory is that the robot hand comes near, but does not touch the station. It misses by about 2 mm. For many human tasks that might be good enough, but for precision manufacturing a miss by 2 mm is about 1000 times more than allowed.
If you like working with practical problems, you might find a simple solution to the problem. For instance, we could have aimed the robot hand 2 mm further to the right than the actual station. In falling short by 2mm, the hand would miss the new target but cross right over the originally intended station.
Solutions like this are sometimes called ad hoc, meaning that they are so specifically tailored to one situation that they do not generalize well to slightly different problems. The next section introduces a superior approach that is much more general.
49.2 Piecewise but smooth
The approach we will take to smoothly connect the points on the path is based on ideas of derivatives and on the construction of low-order polynomials. In Block 2, we emphasized low-order polynomials up to the square term, and we will pick that up again here for demonstration purposes. For this example, we will construct only the \(y(t)\) function. Constructing \(x(t)\) would be done using the same procedure.
Our task is to find a function \(y(t)\) to interpolate discrete points such as those shown in Figure 49.8. The discrete points are called knots1 in the language of interpolating functions. Each knot is a coordinate pair \((t_i, y(t_i))\) shown as an orange dot in Figure 49.8.
The piecewise linear interpolating function is easily constructed and is shown as a dotted curve. As we saw in the previous section, such a function has a discontinuous first derivative. We would like something smoother, with a continuous first derivative. A curve such as the one we seek is shown as the multi-colored function.
The framework we will adopt for the smooth interpolating function is piecewise quadratic segments between adjacent knots. There are four knots, requiring three segments. We will call the segment \(p_1(y)\) connecting the first knot to the second, with \(p_2(y)\) connecting the second to the third knot and \(p_3(y)\) connecting the third to the fourth knot. Each of those segments will be a second-order polynomial. To keep things organized, we will use coefficient names systematically:
\[p_1(t) \equiv a_1 + b_1 \left[t - t_1\right] + c_1 \left[t - t_1\right]^2\] \[p_2(t) \equiv a_2 + b_2 \left[t - t_2\right] + c_2 \left[t - t_2\right]^2\] \[p_3(t) \equiv a_3 + b_3 \left[t - t_3\right] + c_3 \left[t - t_3\right]^2\] The four knots are \[\left[\begin{array}{c}\left(t_1, x_1\right)\\ \left(t_2, y_2\right)\\ \left(t_3, y_3\right)\\ \left(t_4, y_4\right)\\ \end{array}\right]\] which you can think of as two columns of a data frame, one with the \(t\)-coordinates of the knots and the other with the \(y\)-coordinates. For the knots in Figure 49.8 the data table is
t | y |
---|---|
1 | 0.0 |
2 | 2.0 |
3 | 0.5 |
4 | 1.7 |
Constructing the interpolating function is a matter of making good choices for \(a_1,\) \(a_2,\) \(a_3,\) \(b_1,\) \(b_2,\) \(b_3,\) \(c_1,\) \(c_2,\) and \(c_3\).
We require these things of each of the interpolating polynomials:
- It passes exactly through the two knots marking the segment’s endpoints. That is \(p_1(t_1) = y_1\) and \(p_1(t_2) = y_2 = p_2(t_2)\) and \(p_2(t_3) = y_3 = p_3(t_3)\) and, finally, \(p_3(t_4) = y_4\). Note that at the interior knots where two polynomials join, the left-hand polynomial and the right-hand polynomial should exactly match the function value and each other.
- The derivative (with respect to \(t\)) should match where the segments join. That is, \(\partial_t p_1(t_1) = \partial_t p_2(t_2)\) and \(\partial_t p_2(t_3) = \partial_t p_3(t_3)\). Thus, the function we want to build will be \(C^1\), that is, have a continuous first derivative.
How to accomplish (1) and (2)?
Notice first that because we wrote each of the polynomials in the style of Taylor polynomials, we can read the values of \(a_1\), \(a_2\), and \(a_3\) directly from the data table:
\[p_1(t_1) = a_1 = y_1\] \[p_2(t_2) = a_2 = y_2\] \[p_3(t_3) = a_3 = y_3\]
We can find other coefficients from the requirement that the right side of each segment pass through the knot on that side. This gives:
\[p_1(t_2) = y_2 = a_1 + b_1 \left[t_2-t_1\right] + c_1\left[t_2-t_1\right]^2\] \[p_2(t_c) = y_3 = a_2 + b_2 \left[t_3-t_2\right] + c_2\left[t_3-t_2\right]^2\] \[p_3(t_c) = y_4 = a_3 + b_3 \left[t_4-t_3\right] + c_3\left[t_4-t_3\right]^2\] (Notice that \(t_2 - t_1\) and the like are simply numbers that can be computed from the known knot points.)
Another two conditions are that the derivatives of the polynomials from either side of each interior knot point must match at the knot point. Finding the derivatives of the segments is a simple exercise:
\[\partial_t p_1(t) = b_1 + 2 c_1 \left[t - t_1\right]\] \[\partial_t p_1(t) = b_2 + 2 c_2 \left[t - t_2\right]\] \[\partial_t p_1(t) = b_3 + 2 c_3 \left[t - t_3\right]\]
Matching these derivatives at the \(t_2\) and \(t_3\) knot points—the interior knots where two segments come together—gives two more equations: \[\partial_t p_1(t_2) = b_1 + 2 c_1 \left[t_2 - t_1\right] = b_2 = \partial_t p_2(t_2)\] \[\partial_t p_2(t_3) = b_2 + 2 c_2 \left[t_3 - t_2\right] = b_3 = \partial_t p_3(t_3) \] All together, we have five equations in six unknowns: \(b_1, b_2, b_3\) and \(c_1, c_2, c_3\).
Plugging in the specific values \(t_1\) through \(t_4\), and \(x_1\) through \(x_4\) from the data table translates the equations for the polynomial values and derivatives gives this system of equations:
\[b_1 + c_1 = x_2 - x_1 = \ \ \ \ \ \ 2\] \[b_2 + c_2 = x_3 - x_2 = -1.5\] \[b_3 + c_3 = x_4 - x_3= \ \ 1.2\] \[b_1 + 2 c_1 - b_2 = 0\] \[b_2 + 2 c_2 - b_3 = 0\]
This is not the place to go into the details of solving the five equations to find the six unknowns. (Block 5 introduces the mathematics of such things, which turns out to the same math used to find model parameters to “fit” data.) But there are some simple things to say about the task.
First, you may recall being told in high-school mathematics that to find six unknowns you need six equations. We have only five equations to work with. But it is far from true that there is no solution for six unknowns with five equations. There are in fact an infinite number of solutions. (Again, Block 5 will show the mathematics behind this statement.) Essentially, all we need to do is make up a sixth equation to identify a particular one of the infinite number of solutions. It is nice if this made-up equation reflects something interpretable about the curve.
We will choose to have the sixth equation specify what the derivative of the interpolating function should be at the far right end of the graph. That right-most derivative value will be \[\partial_t p_3(t_4) = b_3 + 2 c_3 \left[t_4 - t_3\right]\ .\] We can set this value to anything we like. For instance, in Figure 49.8 the right-most derivative is set to zero; you can see this from the curve being flat at the right-most knot point.
Keeping in mind the piecewise nature of the interpolating polynomial, it may seem surprising that changing the slope at \(t_4\) leads to a change in value of the function almost everywhere. Yet the stiffness of the parabolic segments means that conditions in one segment have an impact on adjacent segments. In turn, the segments adjacent to these also change, a change that percolates down to every segment in turn.
Putting together the \(x(t)\) and \(y(t)\) interpolating functions, each of which has that extremum between knot points, leads to an absurdly complicated path, as seen in Figure 49.10.
Quadratic splines are rarely used in practice. (A cubic spline provides helpful flexibility. See Section 49.3.) In Figure 49.9 you can see one of the reasons: the quadratic form is so stiff that the interpolating function tends to shift from concave up to concave down (or vice versa) at each knot point. This results in the interpolating function tending to have a local minimum or maximum between adjacent knots, even if the data themselves to not indicate such a structure.
49.3 C2 smooth functions
In the previous section, we arranged the functions \(x(t)\) and \(y(t)\) composed from the piecewise quadratic segments to be \(C^1\) smooth. (Recall that \(C^1\) smooth means that the derivatives \(\partial_t x(t)\) and \(\partial_t y(t)\) are continuous.) We established this continuity by make sure that each segment has a value of the derivative at its end-point know that matches the derivative of the adjacent segment.
To arrange \(C^2\) continuity requires that the segment include a new parameter. Most commonly, this is done by moving from quadratic segments to cubic segments. This can be done by an approach similar to that of the previous section but somewhat more elaborate. Such a \(C^2\) interpolating function is called a cubic spline. Cubic splines are very commonly encountered in applications requiring interpolation.
With the ability to match piecewise cubic polynomials to a set of knots, we can easily construct the smooth path to connect the knots in Figure 49.3. Figure 49.11 shows a \(C^2\) path connecting the knots. The path is constructed by plotting simultaneously the output of two functions, \(x(t)\) and \(y(t)\), with the input \(t\) on the domain \(1 \leq t \leq 16\).
Cubic spline functions can be created with the R/mosaic spliner()
function. The second argument is a data frame giving the knot locations. The first argument is a tilde expression specifying the variables to use from the data frame.
<- spliner(x ~ t, data = Robot_stations)
xfun <- spliner(y ~ t, data = Robot_stations) yfun
Algebraically, each BĂ©zier segement is a pair of cubic functions, \(x(t)\) for the x-coordinate and \(y(t)\) for the y-coordinate. The input \(t\) varies between 0 and 1 for each segment. The coordinate pair \(\left({\large\strut} x(0), y(0)\right)\) is one endpoint of the curve, while \(\left({\large\strut} x(1), y(1)\right)\). Each intermediate value of \(t\) corresponds to a point on the interior of the curve.
The \(x(t)\) and \(y(t)\) functions have the same form, the difference between the functions being only the values of the end values (\(x_1\) and \(x_4\) for the \(x(t)\) function, and similarly \(y_1\) and \(y_4\) for the \(y(t)\) function), as well as the control-point values (\(x_2\) and \(x_3\) for one function, \(y_2\) and \(y_3\) for the other.)
\[x(t) = (1-t)^3\, x_1 + 3(1-t)^2 t\, x_2 + 3(1-t) t^2\, x_3 + t^3\, x_4\] and \[y(t) = (1-t)^3\, y_1 + 3(1-t)^2 t\, y_2 + 3(1-t) t^2\, y_3 + t^3\, y_4\]
Called such possibly because the curves are tied together at each of the knots.↩︎