Chap 24 Exercises

Exercise 1

Consider the function $f (x) \equiv x^{3}$ . Confirm that the value of the derivative $\partial_{x} f (x = 0)$ is zero and so $x^{⋆} = 0$ is a critical point. Which sort of critical point is $x^{⋆} = 0$ ? (Hint: Draw the graph of $f (x)$ near $x = 0$ to see what’s going on.)

An argmax

An argmin

Neither

question id: calf-bring-pants-1

Still working with the function $f (x) \equiv x^{3}$ , find the value of the second-derivative $\partial_{x x} f (x^{⋆})$ evaluated at the critical point $x = x^{⋆} = 0$ . Which of these is $\partial_{x x} f (x = 0)$ ?

Negative

Positive

Zero

question id: calf-bring-pants-2

Exercise 2

We’d like to make a folded cardboard box in the most efficient way possible. As you know, cardboard boxes have four sides as well as eight flaps, four for the top and four for the bottom. The flaps are arranged to provide double coverage; you fold the flaps from one direction and then fold over them the flaps from the other direction.

The diagram depicts the box sides and flaps laid out on a flat piece of cardboard. The flaps are shaded with diagonal lines. The formulas give the areas of various rectangles on the sheet.

Suppose the height, width, and depth of the box are $h$ , $w$ , and $d$ respectively. The box volume is easy:

$V = h w d$

The area of cardboard consists of the four sides and the eight flaps. Each component’s area is a product of two edge lengths. For example, the box sides have area either $w h$ or $d h$ . The flaps, each of which extends halfway across the bottom or top, have area $w d / 2$ .

Which of these formulas gives the area of the cardboard making up the box?

A common size for a box is 1.3 cubic feet. We will use feet as the units for $w$ , $h$ , and $d$ .

The following formulas do not describe the area of the cardboard, but all except one of them are nonetheless formulas for something. The exception cannot be true. Which one is it?

As $w$ , $h$ , or $d$ are changed, the volume and surface area of the box are changed. Asking for the $w$ , $h$ , and $d$ that minimize the surface area of the box is not a complete statement of a problem. The minimum surface area will be zero whenever two of the three dimensions have length zero. In other words, we can minimize the surface area by making a box that is no box at all!

To complete the problem statement we need something else. Here, that something is a constraint: We demand that the box have a volume of $V = 1.3$ cubic feet.

Often, a constraint plays the role of a dimension reduction. With $w$ , $h$ , and $d$ , we have a 3-input optimization problem. But we can use the constraint equation to solve for one of the quantities as a function of the others and the (known) volume. For instance, we can find $h$ as $h = V / (d w)$ .
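The dimension reduction can be sketched in a few lines of base R. The function name h_from_constraint and the sample values of $w$ and $d$ are hypothetical, chosen only for illustration; they are not part of Active R chunk 1.

```r
# Hypothetical helper: solve the volume constraint V = h * w * d for h.
h_from_constraint <- function(w, d, V = 1.3) {
  V / (d * w)
}

# Any choice of w and d (in feet) now determines h (in feet).
w <- 1.0
d <- 0.8
h <- h_from_constraint(w, d)

# Multiplying the three lengths recovers the required 1.3 cubic feet.
h * w * d
```

Once $h$ is computed this way, only $w$ and $d$ remain as free inputs.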

Plug in the above expression for $h$ into the formula for the surface area of cardboard. Which of the following is the resulting formula in terms of $w$ , $d$ , and $V$ ?

Active R chunk 1 contains the formula for the surface area $A (w, d, V)$ of a box of volume $V$ . The graphics command draws a contour plot of $A ()$ as a function of $w$ and $d$ , holding $V = 1.3$ cubic feet.

Active R chunk 1

When you draw the contour plot, you will see a broad area near the center inside the contour at area = 9.5. Towards the upper-right and lower-left corners of the plot frame are contours at higher levels of area.

The spacing between the contours in the corners is tight, but there is no similarly spaced contour inside the region delimited by the contour at area=9.5. Why not?

We didn’t ask for contours inside 9.5.

The function shape inside the 9.5 contour is the top of a bowl, so it is pretty flat.

The function shape inside the 9.5 contour is the bottom of a bowl, so it is pretty flat.

All the points inside the 9.5 contour are at the same height.

question id: duck-tell-laundry-4

Use Active R chunk 1 to place contours at 10, 9.5, 9.4. You can do this by replacing the argument contours_at = NULL with this:

contours_at = c(10, 9.5, 9.4)

Add more contours to build a fence tighter and tighter around the argmin. When the fenced region is tiny, you can read off the min from the contour label. (Remember, the “argmin” is the value of the inputs $w$ and $d$ at which the function is minimized. The “min” is the value of the function at the argmin.) But watch out as you do this. If you ask for a contour at a level that is lower than the min, it will simply not be drawn. Or, more precisely, there are no inputs that produce an output that is lower than the min. So you may have to change the interval between levels (e.g. 10, 9.5, 9.4, …) to home in on the argmin.
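To keep “argmin” and “min” straight, here is a minimal sketch on a made-up one-input function (not the box-area function), using base R’s optimize(). The function g and the search interval are assumptions for illustration only.

```r
# Hypothetical one-input function whose lowest point is at x = 2.
g <- function(x) (x - 2)^2 + 3

# optimize() searches the given interval for a minimum.
result <- optimize(g, interval = c(0, 5))

result$minimum    # the argmin: the input x at which g is smallest
result$objective  # the min: the value of g at that input
```

Note that optimize() labels the argmin as `minimum`, so it is worth checking which of the two numbers you are reading.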

The following are values for the output of the function where you might be able to draw a contour. Which one of the values is the smallest for which a contour appears?

From your contour plot, read off the values of $w$ and $d$ that produce the minimum surface area for a 1.3 cubic-foot box. What are they? (Hint: You may need to zoom in on the domain to get the precision needed to answer the question.)

It is easy enough for a person to look at a contour plot and roughly locate the argmin. But this is not feasible if there are more than two inputs to the function being optimized. For such functions, a different set of numerical techniques is used, based on the gradient of the objective function. Remember that the gradient at any point is a vector that points in the uphill direction and whose length is proportional to the steepness of the slope. (Skiers, beware. In skiing, what people call the gradient is the steepest downhill direction. This might account for all the mathematicians learning to ski who point their skis uphill in response to the ski instructor’s instruction!)

You can display the gradient on the plot of the area function by piping (remember |>) the contour plot into the commented-out command in the sandbox. (Also, replace the text “#pipe to” with “|>”.)

Since the end of the term is coming, here is a question that might be a good review for the final. Which of these describes the relationship between the gradient vector and the contours?

On a contour the gradient vector is perpendicular (“orthogonal”) to the contour.

On a contour, the gradient vector has zero length.

There is no definite relationship; it depends on the function itself.

On a contour, the gradient vector has a length proportional to the contour level.

question id: duck-tell-laundry-7

Which of these best describes the gradient vector at the argmin?

The gradient points due North.

The length of the gradient vector is maximal.

The length of the gradient vector is minimal.

The length of the gradient vector is zero.

question id: duck-tell-laundry-8

Many numerical optimization techniques are based on an idea like this: treat the field of gradient vectors as a flow field in a differential equation. Starting at some initial value, follow the gradient vectors (as you did in sketching the trajectory in a flow field). If seeking a maximum, the flow will be in the direction of the gradient. If seeking a minimum, the flow will be opposite the direction of the gradient. It is not necessary to calculate the gradient everywhere; you just have to calculate it at the present point on your trajectory to know which way to go next.
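The idea of following the gradient can be sketched in base R. This is a bare-bones illustration on a made-up bowl-shaped function, not the box-area function and not a production optimizer; the objective f, its hand-computed gradient grad_f, the starting point, and the step size are all assumptions.

```r
# Hypothetical objective: a bowl whose lowest point is at (1, -0.5).
f <- function(p) (p[1] - 1)^2 + 2 * (p[2] + 0.5)^2

# Gradient of f, worked out by hand for this particular function.
grad_f <- function(p) c(2 * (p[1] - 1), 4 * (p[2] + 0.5))

p <- c(3, 2)       # initial point on the trajectory
step_size <- 0.1   # assumed step length; too large a step can overshoot

for (i in 1:200) {
  g <- grad_f(p)                     # gradient at the present point only
  if (sqrt(sum(g^2)) < 1e-8) break   # near-zero gradient: a critical point
  p <- p - step_size * g             # seeking a minimum: move against the gradient
}

p     # close to c(1, -0.5)
f(p)  # close to 0
```

To seek a maximum instead, the update would add the gradient step rather than subtract it.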

Occasionally, particularly in textbook problems, the argmin or argmax is found algebraically. This still involves calculating the gradient as a function of the inputs. Then, find the inputs that make all the components of the gradient vector zero.
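As a small worked illustration of the algebraic route, here is a sketch using a made-up function $g (x, y)$ that is not part of the exercise:

$$g (x, y) \equiv x^{2} + x y + y^{2} - 3 x$$

$$\partial_{x} g = 2 x + y - 3, \quad \partial_{y} g = x + 2 y$$

Setting both components to zero, the second equation gives $x = - 2 y$ . Substituting into the first gives $- 3 y - 3 = 0$ , so $y^{⋆} = - 1$ and $x^{⋆} = 2$ .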

Which of these formulas gives the gradient vector of $A (w, d)$ ?

$\partial_{w} A = - 2 \frac{V}{w^{2}} + 4 d, \partial_{d} A = - 2 \frac{V}{d^{2}} + 4 w$

$\partial_{w} A = - 2 \frac{V}{d^{2}} + 4 d, \partial_{d} A = - 2 \frac{V}{w^{2}} + 4 w$

$\partial_{w} A = - 2 \frac{V}{w^{2}} + 4 w, \partial_{d} A = - 2 \frac{V}{d^{2}} + 4 d$

$\partial_{w} A = - 2 \frac{V}{d w} + 4 w, \partial_{d} A = - 2 \frac{V}{d w} + 4 d$

question id: duck-tell-laundry-9

If the lengths $w$ , $d$ , $h$ are measured in feet, what unit will $\partial_{w} A$ be in?

For those of you who are pining for algebra problems, here you go.

Taking the gradient of $A (w, d)$ (given in a previous question), set both components to zero, giving you two equations in the two variables $w$ and $d$ . There is also a $V$ in the equations, but we’ve set up the problem saying that we already know $V$ . Numerically, we used $V = 1.3$ cubic feet, but in the algebraic solution we can just leave $V$ as a symbol, giving general formulas for $w$ and for $d$ in terms of $V$ .

Which of these is the correct formula for the optimal $w^{⋆}$ as a function of $V$ ? (Hint: You can weed out one of the choices by checking for dimensional consistency.)

$w^{⋆} = \frac{\sqrt[3]{V}}{\sqrt[3]{2}}$

$w^{⋆} = \frac{\sqrt[3]{V}}{\sqrt[3]{3}}$

$w^{⋆} = \frac{\sqrt[2]{V}}{\sqrt[2]{3}}$

question id: duck-tell-laundry-11

The solution for $d^{⋆}$ is the same as for $w^{⋆}$ . (An experienced algebraist would have noticed that in the formula for area, you can swap inputs $w$ and $d$ without changing the output.)

Now compute the formula for the optimal value $h^{⋆}$ . (Hint: Early in the section we gave a formula that involves $V$ , $h$ , $w$ , and $d$ .)

Which of these is the correct formula for the optimal $h^{⋆}$ as a function of $V$ ?

It turns out that $h^{⋆}$ is somewhat larger than either $w^{⋆}$ or $d^{⋆}$ ; the optimal box has a square top and bottom, but the sides are not square.

Which of these is an appropriate explanation for why $h^{⋆}$ is larger than $w^{⋆}$ or $d^{⋆}$ ?

People don’t like using boxes that are perfect cubes.

$h$ multiplies both $w$ and $d$ , but not vice versa, in the formula for surface area.

The flaps need to get longer as $h$ gets longer, so smaller $h$ helps to minimize the amount of cardboard.

The flap-length does not depend on $h$ , only on $w d$ . So we can make $h$ larger without contributing to the “wasted” area of the doubling over of flaps. The flaps get smaller as $w d$ gets smaller, so larger $h$ is preferred.

question id: duck-tell-laundry-13

Exercise 3

You and your pet dog Swimmer often go to the beach and walk along the water’s edge. You throw a ball down the beach, but at an angle so it lands in the water. Swimmer goes to work. She runs down the beach (fast) and then plunges into the water, heading toward the ball. She can run fast on the beach: 400 m/minute. But she swims rather slower: 50 m/min.

Suppose you threw the ball to a point about 50 meters down the beach and 10 meters out in the water. The overall distance to the ball is therefore $\sqrt{50^{2} + 10^{2}} \approx 51$ meters. If Swimmer entered the water immediately, she would take about a minute to reach the ball (51 m / 50 m/min). Swimmer can get to the ball faster by running down the beach a bit and then turning into the water. If Swimmer ran all 50 meters down the beach and then turned to swim the 10 meters, it would take her (50/400 + 10/50) minutes, about one-third of a minute.
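The arithmetic for these two extreme routes can be checked with a short base R sketch. The variable names are chosen here for illustration and do not come from Active R chunk 2.

```r
run_speed  <- 400  # m/min on the beach
swim_speed <- 50   # m/min in the water

# Route 1: enter the water immediately and swim the full diagonal.
diagonal <- sqrt(50^2 + 10^2)
time_swim_only <- diagonal / swim_speed

# Route 2: run all 50 m along the beach, then swim 10 m straight out.
time_run_then_swim <- 50 / run_speed + 10 / swim_speed

diagonal            # about 51 m
time_swim_only      # about 1.02 minutes
time_run_then_swim  # 0.325 minutes
```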

Figure 1: Swimmer’s optimal path to the ball consists of running $x$ meters along the shore, then swimming diagonally to the ball.

Can Swimmer do better? You can set up the calculation like this. Imagine $x$ to be the distance down the beach that Swimmer runs. The time to run this distance will be $x / 400$ . The distance remaining to the ball can be found by the Pythagorean theorem. One leg of the triangle has length $(50 - x)$ , the other has length 10 m. So, the length of the third side is $\sqrt{(50 - x)^{2} + 10^{2}}$ . For instance, if $x$ were 45, the distance to swim in the water would be $\sqrt{(50 - 45)^{2} + 10^{2}} = 11.2$ m. Divide this distance by 50 m/min to get the time spent in the water.
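Putting the two pieces together gives the total time as a function of $x$ . The name $T (x)$ is introduced here only for illustration; distances are in meters and times in minutes:

$$T (x) = \frac{x}{400} + \frac{\sqrt{(50 - x)^{2} + 10^{2}}}{50}$$

For the example $x = 45$ :

$$T (45) = \frac{45}{400} + \frac{11.2}{50} \approx 0.11 + 0.22 \approx 0.34 \ \text{minutes}$$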

Active R chunk 2

The function time_to_ball() takes one argument, the distance $x$ Swimmer runs down the beach before turning into the water. Complete and use Active R chunk 2 to find the distance that calculus-savvy Swimmer runs down the beach before turning into the water, if Swimmer’s goal is to get to the ball as fast as possible.

What’s the optimal running distance for Swimmer?

Here’s a news story about a mathematician’s dog on the shore of Lake Michigan. It is not plausible that Swimmer has been trained in calculus. Perhaps the way Swimmer solves the running distance problem is simply to graph time_to_ball(x) ~ x over a suitable domain and find the argmin by eye!

Activities

Exercise 10

The key steps in optimization are setting up the objective function(s) and setting constraints as needed to represent the problem at hand. There are many ways to perform the work to extract the argmax once the objective function and constraints are set.

Understandably, calculus textbooks tend to emphasize techniques based on finding an input where the derivative of the objective function is zero. For problems involving multiple inputs, the task is to find an input where the gradient vector is zero.

Contemporary work often involves problems with tens, hundreds, thousands, or even millions of inputs. Even in such large problems, the mechanics of finding the corresponding gradient vector are straightforward. Searching through a high-dimensional space, however, is not generally a task that can be accomplished using calculus tools. Instead, starting in the 1940s, great creativity has been applied to develop algorithms with names like linear programming, quadratic programming, and dynamic programming. Many of these are based on ideas from linear algebra, such as the qr.solve() algorithm for solving the target problem, or on ideas from statistics and statistical physics that incorporate randomness as an essential component. An entire field, operations research, focuses on setting up and solving such problems. Building appropriate algorithms requires deep understanding of several areas of mathematics. But using the methods is mainly a matter of knowing how to set up the problem and communicate the objective function, constraints, etc. to a computer.

Purely as an example, let’s examine the operation of an early algorithmic optimization method: Nelder-Mead, dating from the mid-1960s. (There are better, faster methods now, but they are harder to understand.)

Nelder-Mead is designed to search for maxima of objective functions with $n$ inputs. The video shows an example with $n = 2$ in the domain of a contour plot of the objective function. Of course, you can simply scan the contour plot by eye to find the maxima and minima. The point here is to demonstrate the Nelder-Mead algorithm.

Start by selecting $n + 1$ points on the domain that are not colinear. When $n = 2$ , the $2 + 1$ points are the vertices of a triangle. The set of points defines a simplex, which you can think of as a region of the domain that can be fenced off by connecting the vertices.

Evaluate the objective function at the vertices of the simplex. One of the vertices will have the lowest score for the output of the objective. From that vertex, project a line through the midpoint of the fence segment defined by the other $n$ vertices. In the video, this is drawn using dashes. Then try a handful of points along that line, indicated by the colored dots in the video. One of these will have a higher score for the objective function than the vertex used to define the line. Replace that vertex with the new, higher-scoring point. Now you have another simplex and can repeat the process. The actual algorithm has additional rules to handle special cases, but the gist of the algorithm is simple.
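In practice you rarely code these steps yourself. Base R’s optim() offers a Nelder-Mead method, and the sketch below applies it to a made-up objective with $n = 2$ inputs. The objective function and the starting guess are assumptions for illustration, and optim()’s implementation includes the additional rules mentioned above, so its steps will not match the video exactly.

```r
# Hypothetical objective with n = 2 inputs and a single peak at (1, 2).
objective <- function(p) -((p[1] - 1)^2 + (p[2] - 2)^2)

# optim() minimizes by default; fnscale = -1 asks it to maximize instead.
result <- optim(
  par     = c(-1, 0),             # starting guess
  fn      = objective,
  method  = "Nelder-Mead",
  control = list(fnscale = -1)
)

result$par          # the argmax, close to c(1, 2)
result$value        # the max, close to 0
result$convergence  # 0 indicates the search reported convergence
```

Only the objective function and a starting point are supplied; no gradient is needed, which is one reason the method is easy to apply.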

Exercise 11

When we locate an argmax, it’s helpful to have some guidance about how much precision makes practical sense. We can figure this out using the second derivative of the function at the argmax.

Consider a function $f (x)$ and its curvature at an argmax. The curvature is the reciprocal of the radius of a circle tangent to the function. That circle approximates the function itself. At an argmax, the highest point in the circle will be right on the function. But a few degrees to either side of the highest point will be very close to the highest point. How wide will the chord of the circle be that connects the points 5 degrees to either side of the highest point? Your answer will involve $\sin ()$ but also the curvature of the function.
