29  Linear combinations of vectors

In this chapter, we introduce linear combinations of vectors. As you recall, a linear combination is a sum of basic elements each of which has been scaled. For instance, in Block 1 we looked at linear combinations of functions such as \[g(t) = A + B e^{kt}\] which involves the basic functions \(\text{one}(t)\) and \(e^{kt}\) scaled respectively by \(A\) and \(B\).

Linear combinations of vectors involve scaling and addition, both of which are simple whether seen as numerical operations or as geometric ones. A useful concept will be the set of all vectors that can be constructed as linear combinations of given vectors. This set of all possibilities, called the subspace spanned by the given vectors, is key to understanding how to find the “best” scalars for a given purpose.

29.1 Scaling vectors

To scale a vector means to change its length without altering its direction. Scaling by a negative number flips the vector tip-for-tail. Figure 29.1 shows two vectors \(\vec{v}\) and \(\vec{w}\) together with several scaled versions of each.

The vectors we create by scalar multiplication can be placed anywhere we like.

Arithmetically, scaling a vector is accomplished by multiplying each component of the vector by the scalar, e.g.

\[\vec{u} = \left[\begin{array}{r}1.5\\-1\end{array}\right]\ \ \ \ 2\vec{u} = \left[\begin{array}{r}3\\-2\end{array}\right]\ \ \ \ -\frac{1}{2}\vec{u} = \left[\begin{array}{r}-0.75\\0.5\end{array}\right]\ \ \ \ \]
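As a quick check of the componentwise rule, the sketch below reproduces the three vectors above in R. It assumes only base R. The name `u` follows the text, while `two_u` and `neg_half_u` are names chosen here for illustration, and building the column with `rbind()` is one convenient choice rather than the only one.

```r
# The vector u from the text, stored as a one-column matrix
u <- rbind(1.5, -1)

# Scaling multiplies every component by the same scalar
two_u      <- 2 * u      # twice as long, same direction
neg_half_u <- -0.5 * u   # half as long, direction flipped

two_u
neg_half_u
```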

Every vector is associated with a subspace that is one-dimensional; you can only reach the points on a line by stepping in the direction of a vector.

29.2 Adding vectors

To add two vectors, choose either one of the vectors as a start, then move the tail of the second vector to the tip of the first, as in Figure 29.2.

Adding vectors in this way takes advantage of the rootlessness of a vector. So long as we keep the direction and length the same, we can move a vector to whatever place is convenient. For adding vectors, the convenient arrangement is to place the tail of the second vector at the tip of the first. The result—the blue pencil in the picture above—has the length and direction from the tail of the first pencil (yellow) to the tip of the second (green). But so long as we maintain this length and direction, we can put the result (blue) anywhere we want.

Arithmetically, vector addition is simply a matter of applying addition component-by-component, that is, componentwise. For instance, consider adding two vectors \(\vec{v}\) and \(\vec{w}\):


$$\underbrace{\left[\begin{array}{r}1.5\\-1\\2\\6\end{array}\right]}_{\vec{v}} {\Large +} \underbrace{\left[\begin{array}{r}2\\4\\-2\\-3.2\end{array}\right]}_{\vec{w}} = \underbrace{\left[\begin{array}{r}3.5\\3\\0\\2.8\end{array}\right]}_{\vec{v} + \vec{w}}\ .$$

Unlike our pencil exemplars of vectors, which must of physical necessity always be in the three-dimensional space we inhabit, mathematical vectors can be embedded in any-dimensional space. Addition is applicable to vectors embedded in the same space. Arithmetically, this means that the two vectors to be added must have the same number of components.

Arithmetic subtraction of one vector from another is a simple component-wise operation. For example:

$$\underbrace{\left[\begin{array}{r}1.5\\-1\\2\\6\end{array}\right]}_\vec{v} {\Large -} \underbrace{\left[\begin{array}{r}2\\4\\-2\\-3.2\end{array}\right]}_\vec{w} = \underbrace{\left[\begin{array}{r}-0.5\\-5\\4\\9.2\end{array}\right]}_{\vec{v} - \vec{w}}\ .$$
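The same componentwise arithmetic can be tried in R. This is a minimal sketch using the four-component vectors \(\vec{v}\) and \(\vec{w}\) from the text. The last line is an extra check added here: stepping along \(\vec{w}\) and then along \(\vec{v} - \vec{w}\) should land in the same place as \(\vec{v}\) alone.

```r
# The four-component vectors v and w used in the text
v <- rbind(1.5, -1, 2, 6)
w <- rbind(2, 4, -2, -3.2)

v + w   # componentwise sum
v - w   # componentwise difference

# Adding the difference back to w recovers v
all.equal(w + (v - w), v)
```

Both vectors must have the same number of components for `+` and `-` to carry this vector meaning.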

From a geometrical point of view, many people like to think of \(\vec{v} - \vec{w}\) in terms of placing the two vectors tail to tail as in Figure 29.3. Read out the result as the vector running from the tip of \(\vec{w}\) to the tip of \(\vec{v}\), since adding \(\vec{v} - \vec{w}\) to \(\vec{w}\) must give back \(\vec{v}\). In Figure 29.3, the yellow vector is \(\vec{v}\), the blue vector is \(\vec{w}\). The result of the subtraction is the green vector.

29.3 Linear combinations

In the previous chapter, we suggested that you think of a vector as a “step” or displacement in a given direction and of a given magnitude as in, “1 foot to the northeast.” This interpretation highlights the mathematical structure of vectors: just a direction and a length, nothing else.

The “step”-interpretation is also faithful to an important reason why vectors are useful. We use steps to get from one place to another. Similarly, a central use for the formalism of vectors is to guide our thinking and our algorithms for figuring out how best to get from one “place” to another. We’ve used quotation marks around “place” because we are not necessarily referring to a physical destination. We will get to what else we might mean by “place” later in this Block.

As a fanciful example of getting to a “place,” consider a treasure hunt. You are given these instructions to get there:

  1. On June 1, go to the flagpole before sunrise.
  2. At 6:32, walk 213 paces away from the sun.
  3. At 12:19, walk 126 paces toward the sun.

The sun position varies over the day, so the direction to the sun on June 1 at 6:32 will be different than at 12:19. Figure 29.4 shows the Sun vectors at 6:32 and 12:19 on June 1.

The treasure-hunt directions are in the form of a linear combination of vectors. For each of the two vectors described in the treasure instructions, the length of the vector is 1 pace. (Admittedly, not a scientific unit of length.) Scaling \(\color{magenta}{\text{the magenta vector}}\) by -213 and \(\color{blue}{\text{the blue vector}}\) by 126, then adding the two scaled vectors gives a vector that takes you from the flagpole to the treasure.

A stickler for details might point out that the “direction of the sun” has an upward component. Common sense will guide you to walk in the direction of the Sun as projected onto Earth’s surface. Section 30 deals with projections of vectors.
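To make the arithmetic of the treasure hunt concrete, here is a minimal sketch in R. The compass bearings of the Sun are invented for illustration, since the text does not give them. The names `sun_632`, `sun_1219`, and the helper `unit_from_bearing()` are likewise hypothetical. Each Sun vector is a one-pace step along the ground, written with (east, north) components.

```r
# Hypothetical helper: a 1-pace step along the ground toward a compass
# bearing given in degrees clockwise from north, as (east, north) components
unit_from_bearing <- function(deg) {
  rad <- deg * pi / 180
  rbind(east = sin(rad), north = cos(rad))
}

# Assumed bearings of the Sun (made-up values, not taken from the text)
sun_632  <- unit_from_bearing(65)    # morning Sun, roughly east-northeast
sun_1219 <- unit_from_bearing(180)   # midday Sun, due south

# 213 paces away from the first vector, then 126 paces toward the second
treasure <- -213 * sun_632 + 126 * sun_1219
treasure   # displacement from the flagpole, in paces
```

With different bearings the treasure lands somewhere else, but the structure of the calculation stays the same: two scalars applied to two vectors, then a sum.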

29.4 Functions as vectors

This is a calculus book, and calculus is about functions. So you can imagine that there is going to be some connection between functions and vectors.

We will start with the idea that a vector is a column of numbers. Recall from Block 1 (Section 4.002) the idea of representing a function as a table of inputs and the corresponding outputs.

Here is such a table with some of our pattern-book functions.

| t | one(t) | identity(t) | exp(t) | sin(t) | pnorm(t) |
|---:|---:|---:|---:|---:|---:|
| 0.0 | 1 | 0.0 | 1.000000 | 0.0000000 | 0.5000000 |
| 0.1 | 1 | 0.1 | 1.105171 | 0.0998334 | 0.5398278 |
| 0.2 | 1 | 0.2 | 1.221403 | 0.1986693 | 0.5792597 |
| 0.3 | 1 | 0.3 | 1.349859 | 0.2955202 | 0.6179114 |
| 0.4 | 1 | 0.4 | 1.491825 | 0.3894183 | 0.6554217 |
| ... | ... | ... | ... | ... | ... |
| 4.6 | 1 | 4.6 | 99.48432 | -0.9936910 | 0.9999979 |
| 4.7 | 1 | 4.7 | 109.94717 | -0.9999233 | 0.9999987 |
| 4.8 | 1 | 4.8 | 121.51042 | -0.9961646 | 0.9999992 |
| 4.9 | 1 | 4.9 | 134.28978 | -0.9824526 | 0.9999995 |
| 5.0 | 1 | 5.0 | 148.41316 | -0.9589243 | 0.9999997 |

In this representation, each of the pattern-book functions is a column of numbers, that is, a vector.

Functions that we construct by linear combination are, in this vector format, just a linear combination of the vectors. For instance, the function \(g(t) \equiv 3 - 2 t\) is \(3\cdot \text{one}(t) - 2 \cdot \text{identity}(t)\):

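| t | one(t) | identity(t) | g(t) |
|---:|---:|---:|---:|
| 0.0 | 1 | 0.0 | 3.0 |
| 0.1 | 1 | 0.1 | 2.8 |
| 0.2 | 1 | 0.2 | 2.6 |
| 0.3 | 1 | 0.3 | 2.4 |
| 0.4 | 1 | 0.4 | 2.2 |
| ... | ... | ... | ... |
| 4.6 | 1 | 4.6 | -6.2 |
| 4.7 | 1 | 4.7 | -6.4 |
| 4.8 | 1 | 4.8 | -6.6 |
| 4.9 | 1 | 4.9 | -6.8 |
| 5.0 | 1 | 5.0 | -7.0 |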

The table above is a collection of four vectors: \(\vec{\mathtt t}\), \(\vec{\mathtt{ one(t)}}\), \(\vec{\mathtt{identity(t)}}\), and \(\vec{\mathtt{g(t)}}\). Each of those vectors has 51 components. In math-speak, we can say that the vectors are “embedded in a 51-dimensional space.”
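A table like this can be rebuilt in a few lines of R. The sketch below assumes the inputs run from 0 to 5 in steps of 0.1, as the displayed rows suggest. The names `t_vals`, `one_vec`, `identity_vec`, and `g_vec` are chosen here for illustration, partly to avoid masking the built-in R functions `t()` and `identity()`.

```r
# Inputs 0.0, 0.1, ..., 5.0, matching the first column of the table
t_vals <- seq(0, 5, by = 0.1)

one_vec      <- rep(1, length(t_vals))   # the constant function one(t)
identity_vec <- t_vals                   # the function identity(t)

# g(t) = 3 - 2t as a linear combination of the two vectors
g_vec <- 3 * one_vec - 2 * identity_vec

length(g_vec)   # number of components in each vector

# First few rows of the table
head(cbind(t = t_vals, one = one_vec, identity = identity_vec, g = g_vec), 5)
```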

The functions in the table are being represented as discrete values. Still, a table, combined with interpolation (Chapter 33) can produce a continuum.
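As a small illustration of the step from a table back to a function, the sketch below uses `approxfun()` from base R, which interpolates linearly between tabulated points. The choice of linear interpolation and the names `t_vals`, `g_vec`, and `g_fun` are assumptions made here; the interpolation chapter may use other methods.

```r
# Tabulate g(t) = 3 - 2t at the inputs 0.0, 0.1, ..., 5.0
t_vals <- seq(0, 5, by = 0.1)
g_vec  <- 3 - 2 * t_vals

# approxfun() returns a function that interpolates linearly between rows
g_fun <- approxfun(t_vals, g_vec)

g_fun(0.25)   # an input that lies between two rows of the table
```

By default the interpolating function returns `NA` for inputs outside the tabulated range, here below 0 or above 5.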

29.5 Matrices and linear combinations

A collection of vectors, such as the one displayed in the previous table, is called a matrix. Each of the vectors in a matrix must have the same number of components.

As mathematical notation, we will use bold-faced, capital letters to stand for matrices, for example \(\mathit{M}\). The symbol \(\rightleftharpoons\) is a reminder that a matrix can contain multiple vectors, just as the symbol \(\rightharpoonup\) in \(\vec{v}\) reminds us that the name “\(v\)” refers to a vector.

In the conventions for data, we give a name to each column of a data frame so that we can refer to it individually. In the conventions used in vector mathematics, single letters are used to refer to the individual vectors.

As a case in point, let’s look at a matrix \(\mathit{M}\) containing the two vectors which we’ve previously called \(\vec{\mathtt{one(t)}}\) and \(\vec{\mathtt{identity(t)}}\): \[\mathit{M} \equiv \left[\begin{array}{rr}1 & 0\\ 1 & 0.1\\ 1 & 0.2\\ 1 & 0.3\\ \vdots & \vdots\\ 1 & 4.9\\ 1 & 5.0\\ \end{array}\right]\ .\] The linear combination which we might previously have called \(3\cdot \vec{\mathtt{one(t)}} - 2\,\vec{\mathtt{identity(t)}}\) can be thought of as

$$\left[\overbrace{\begin{array}{r}1\\1\\1\\1\\\vdots\\1\\1\end{array}}^{3 \times} {\Large + \ } \overbrace{\begin{array}{r}0\\0.1\\0.2\\0.3\\\vdots\\4.9\\5.0\end{array}}^{-2 \times}\right] = \left[\begin{array}{r}3\\2.8\\2.6\\2.4\\\vdots\\-6.8\\-7.0\end{array}\right]\ ,$$ but this is not conventional notation. Instead, we would write this more concisely as
$$\stackrel{\Large\mathit{M}}{\left[\begin{array}{rr}1 & 0\\
1 & 0.1\\
1 & 0.2\\
1 & 0.3\\
\vdots & \vdots\\
1 & 4.9\\
1 & 5.0\\
\end{array}\right]} \ \stackrel{\Large\vec{w}}{\left[\begin{array}{r}3\\-2\end{array}\right]} = \left[\begin{array}{r}3\\2.8\\2.6\\2.4\\\vdots\\-6.8\\-7.0\end{array}\right]\ .$$

In symbolic form, the linear combination of the columns of \(\mathit{M}\) using respectively the scalars in \(\vec{w}\) is simply \(\mathit{M} \, \vec{w}\). This is called matrix multiplication.

Naturally, the operation only makes sense if there are as many components to \(\vec{w}\) as there are columns in \(\mathit{M}\).

“Matrix multiplication” might better have been called “\(\mathit{M}\) linearly combined by \(\vec{w}\).” But “matrix multiplication” is the standard term for such linear combinations.

In R, you can make vectors with the rbind() command, short for “bind rows,” as in

rbind(2, 5, -3)
##      [,1]
## [1,]    2
## [2,]    5
## [3,]   -3

with the components of the vector presented as successive arguments to the function.

One way to make a matrix is with the cbind() command, short for “bind columns”. The arguments to cbind() will typically be vectors created by rbind(). For instance, the matrix \[\mathit{A} \equiv \left[\vec{u}\ \ \vec{v}\right]\ \ \text{where}\ \ \vec{u} \equiv \left[\begin{array}{r}2\\5\\-3\end{array}\right]\ \ \text{and}\ \ \vec{v} \equiv \left[\begin{array}{r}1\\-4\\0\end{array}\right]\] can be constructed in R with these commands.

u <- rbind(2, 5, -3)
v <- rbind(1, -4, 0)
A <- cbind(u, v)
A
##      [,1] [,2]
## [1,]    2    1
## [2,]    5   -4
## [3,]   -3    0

To compute the linear combination \(3 \vec{u} + 1 \vec{v}\), that is, \(\mathit{A} \cdot \left[\begin{array}{r}3\\1\end{array}\right]\) you use the matrix multiplication operator %*%. For instance, the following defines a vector \[\vec{x} \equiv \left[\begin{array}{r}3\\1\end{array}\right]\] to do the job in a way that is easy to read:

x <- rbind(3, 1)
A %*% x
##      [,1]
## [1,]    7
## [2,]   11
## [3,]   -9
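To see that `%*%` really is a linear combination of the columns, the following sketch computes \(3 \vec{u} + 1 \vec{v}\) both ways and compares the results. The names `by_hand` and `by_matrix` are introduced here only for the comparison.

```r
u <- rbind(2, 5, -3)
v <- rbind(1, -4, 0)
A <- cbind(u, v)
x <- rbind(3, 1)

by_hand   <- 3 * u + 1 * v   # scale each column, then add
by_matrix <- A %*% x         # the same combination via %*%

all.equal(by_hand, by_matrix)
```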


It is a mistake to use * instead of %*% for matrix multiplication. Remember that * is for componentwise multiplication which is different from matrix multiplication. Componentwise multiplication with vectors and matrices will usually give an error message as with:

A * x
## Error in A * x: non-conformable arrays

The phrase “non-conformable arrays” is R-speak for saying “I don’t know how to do component-wise multiplication with two differently shaped objects.”
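One way to anticipate shape problems is to inspect the dimensions before multiplying. The sketch below is illustrative rather than a required workflow. It uses the base R functions `dim()`, `ncol()`, `nrow()`, and `try()`, with `result` as a name chosen here.

```r
A <- cbind(rbind(2, 5, -3), rbind(1, -4, 0))
x <- rbind(3, 1)

dim(A)   # number of rows, then number of columns
dim(x)

# %*% needs as many columns in A as there are components in x
ncol(A) == nrow(x)

# Reversing the order breaks that requirement, so R signals an error;
# try() captures the error so that a script can continue
result <- try(x %*% A, silent = TRUE)
inherits(result, "try-error")
```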

In chapters to come, we will sometimes make several different linear combinations of the vectors in a matrix. Of course the result of each individual linear combination will be a vector, so the “several different linear combinations” can be thought of as a collection of vectors, that is, a matrix.

For example, consider the possible linear combinations of the two vectors in a matrix \[\mathit{A} = \left[\begin{array}{r}2\\5\\-3\end{array}\ \begin{array}{r}1\\-4\\0\end{array}\right]\ .\]

The combinations we have in mind are: \[ \mathit{A}\left[\begin{array}{r}3\\1\end{array}\right]= \left[\begin{array}{r}7\\11\\-9\end{array}\right] \ \ \ \ \ \ \ \ \ \ \mathit{A} \left[\begin{array}{r}0\\2\end{array}\right]= \left[\begin{array}{r}2\\-8\\0\end{array}\right] \ \ \ \ \ \ \ \ \ \ \mathit{A} \left[\begin{array}{r}-1\\0\end{array}\right] = \left[\begin{array}{r}-2\\-5\\3\end{array}\right] \] A more concise way to write this collects the vectors with the values for the scalars into a matrix, which we will call \[\mathit{X} \equiv \left[\begin{array}{rrr}3 & 0 & -1\\1 & 2 & 0\end{array}\right]\ .\]

The three linear combinations are then the columns of the product: \[\mathit{A} \ \mathit{X} = \left[\begin{array}{rrr}7 &2 &-2\\11 & -8 & -5\\-9 & 0 & 3\end{array}\right]\ .\]

In R, to create the set of linear combinations, we create the matrices \(\mathit{A}\) and \(\mathit{X}\) and combine them with matrix multiplication.

A <- cbind(
       rbind( 2,  5, -3),
       rbind( 1, -4,  0)
)

A is a collection of two vectors. Therefore, each vector in X must have two components: one for each vector in A.

X <- cbind(
       rbind( 3, 1),
       rbind( 0, 2),
       rbind(-1, 0)
)

The overall result will be a new matrix, containing three vectors, one for each vector in X:

A %*% X
##      [,1] [,2] [,3]
## [1,]    7    2   -2
## [2,]   11   -8   -5
## [3,]   -9    0    3
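The claim that each column of the product is its own linear combination can be checked column by column. This is a small sketch. The loop and the names `AX` and `same` are illustrative and not part of the text's workflow.

```r
A <- cbind(rbind(2, 5, -3), rbind(1, -4, 0))
X <- cbind(rbind(3, 1), rbind(0, 2), rbind(-1, 0))

AX <- A %*% X

# Column j of A %*% X is A linearly combined by column j of X
for (j in seq_len(ncol(X))) {
  same <- all.equal(AX[, j], as.vector(A %*% X[, j]))
  cat("column", j, "matches:", isTRUE(same), "\n")
}
```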


29.6 Sub-spaces

Recall that a vector with \(n\) components can be said to be embedded in an \(n\)-dimensional space. You might like to think of the embedding space as a kind of club with restricted membership. A vector with 2 elements is entitled to join the 2-dimensional club, but a vector with more or fewer than 2 elements cannot be admitted to the club. Similarly, there are clubs for 3-component vectors, 4-component vectors, and so on.

The clubhouse itself is a kind of space, the space in which any and all of the vectors that are eligible for membership can be embedded.

Now imagine the clubhouse arranged into meeting rooms. Each meeting room is just part of the clubhouse space. Which part? That depends on a set of vectors who sponsor the meeting. For instance, in the ten-dimensional clubhouse, a few members, let’s say \(\color{blue}{\vec{u}}\) and \(\color{magenta}{\vec{v}}\), decide to sponsor a meeting. That meeting room, part of the whole clubhouse space, is called a subspace.

A subspace has its own rules for admission. Vectors belong to the subspace only if they can be constructed as a linear combination of the sponsoring members. Mathematically, although the subspace is defined by the founding vectors, the subspace itself consists of all the possible vectors that can be constructed by a linear combination of the sponsors.

As an example, consider the clubhouse that is open to any and all vectors with three components. The diagram in ?fig-two-vecs shows the clubhouse with just two members present, \(\color{blue}{\vec{u}}\) and \(\color{magenta}{\vec{v}}\).

Any vector can sponsor its own subspace. In ?fig-two-vecs the subspace sponsored by \(\color{blue}{\vec{u}}\) is the extended line through \(\color{blue}{\vec{u}}\), that is, all the possible scaled versions of \(\color{blue}{\vec{u}}\). Similarly, the subspace sponsored by \(\color{magenta}{\vec{v}}\) is the extended line through \(\color{magenta}{\vec{v}}\). Each of these subspaces is one-dimensional.

Multiple vectors can sponsor a subspace. The subspace sponsored by both \(\color{blue}{\vec{u}}\) and \(\color{magenta}{\vec{v}}\) contains all the vectors that can be constructed as linear combinations of \(\color{blue}{\vec{u}}\) and \(\color{magenta}{\vec{v}}\). In Figure 29.5, this subspace is shown in gray.

On the other hand, the subspace sponsored by \(\color{magenta}{\vec{v}}\) and \(\color{blue}{\vec{u}}\) is not the entire clubhouse. \(\color{magenta}{\vec{v}}\) and \(\color{blue}{\vec{u}}\) lie in a common plane, but not all the vectors in the 3-dimensional clubhouse lie in that plane. In fact, if you rotate ?fig-two-vecs-plane to “look down the barrel” of either \(\color{magenta}{\vec{v}}\) or \(\color{blue}{\vec{u}}\), the plane will entirely disappear from view. A subspace is an infinitesimal slice of the embedding space.

“Sponsoring a subspace” is metaphorical. In technical language we speak of the subspace spanned by a set of vectors in the same embedding space. Usually, we refer to a “set of vectors” as a matrix. For instance, letting \[\mathit{M} \equiv \left[{\Large \strut}\color{blue}{\vec{u}}\ \ \color{magenta}{\vec{v}}\right]\ ,\] the gray plane in Figure 29.5 is the subspace spanned by \(\mathit{M}\) or, more concisely, \(\text{span}(\mathit{M})\).
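Membership in a spanned subspace can be tested numerically. The sketch below is one possible approach, not a method taken from the text: appending a vector to the matrix leaves the rank unchanged exactly when the vector is already a linear combination of the columns. The helper `in_span()` is hypothetical, the values of `u` and `v` are borrowed from Section 29.5 as an assumption, and `qr()` reports a numerical rank subject to a tolerance.

```r
# Hypothetical helper: is b in the subspace spanned by the columns of M?
# Appending b to M leaves the rank unchanged exactly when b is already
# a linear combination of the columns of M.
in_span <- function(M, b) {
  qr(cbind(M, b))$rank == qr(M)$rank
}

u <- rbind(2, 5, -3)
v <- rbind(1, -4, 0)
M <- cbind(u, v)

in_span(M, 3 * u + 1 * v)    # built from u and v, so it is in the subspace
in_span(M, rbind(0, 0, 1))   # cannot be built from u and v
```

The second vector fails the test because no choice of two scalars matches all three of its components at once.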

For a more concrete, everyday representation of the subspace spanned by two vectors, a worthwhile experiment is to pick up two pencils pointing in different directions. Place the eraser ends together, pinched between thumb and forefinger. You can point the whole rigid assembly in any direction you like. The angle between them will remain the same.

Place a card on top of the pencils, slipping it between your pressed fingers to hold it tightly in place. The card is another kind of geometrical object: a planar surface. The orientation of the two vectors together determines the orientation of the surface. This simple fact will be extremely important later on.

You could replace the pencils with line segments drawn on the card underneath each pencil. Now you have the angle readily measurable in two dimensions. The angle between two vectors in three dimensions is the same as the angle drawn on the two-dimensional surface that rests on the vectors.

Notice that you can also lay a card along a single vector. What’s different here is that you can roll the card around the pencil; there are many different orientations for such a card even while the vector stays fixed. So a single fixed vector does not uniquely determine the orientation of a planar surface in which it resides. But with two fixed vectors pointing in different directions, there is only one such surface.

29.7 Exercises
