question id: calf-bring-pants-1
Chap 24 Exercises
Exercise 1
- Consider the function
. Confirm that the value of the derivative and so is a critical point. Which sort of critical point is ? (Hint: Draw the graph of near to see what’s going on.)
- Still working with the function
, find the value of the second-derivative evaluated at the critical point . Which of these is ?
question id: calf-bring-pants-2
Exercise 2
We’d like to make a folded cardboard box in the most efficient way possible. As you know, cardboard boxes have four sides as well as eight flaps, four for the top and four for the bottom. The flaps are arranged to provide double coverage; you fold the flaps from one direction and then fold over them the flaps from the other direction.
The diagram depicts the box sides and flaps laid out on a flat piece of cardboard. The flaps are shaded with diagonal lines. The formulas give the areas of various rectangles on the sheet.
Suppose the height, width, and depth of the box are
- Which of these formulas gives the area of the cardboard making up the box?
question id: duck-tell-laundry-1
A common size for a box is 1.3 cubic feet. We will use feet as the units for
- The following formulas do not describe the area of the cardboard, but they are nonetheless formulas for something. Except one of them, which cannot be true. Which one?
question id: duck-tell-laundry-2
As
To complete the problem statement we need something else. Here, that something is a constraint: We demand that the box have a volume of
Often, a constraint plays the role of a dimension reduction. With
- Plug in the above expression for
into the formula for the surface area of cardboard. Which of the following is the resulting formula in terms of , , and ?
question id: duck-tell-laundry-3
Active R chunk 1 contains the formula for the surface area
When you draw the contour plot, you will see a broad area near the center inside the contour at area = 9.5. Towards the upper-right and lower-left corners of the plot frame are contours at higher levels of area.
- The spacing between the contours in the corners is tight, but there is no similarly spaced contour inside the region delimited by the contour at area=9.5. Why not?
We didn’t ask for contours inside 9.5.
The function shape inside the 9.5 contour is the top of a bowl, so it is pretty flat.
The function shape inside the 9.5 contour is the bottom of a bowl, so it is pretty flat.
All the points inside the 9.5 contour are at the same height.
question id: duck-tell-laundry-4
Use Active R chunk 1 to place contours at 10, 9.5, 9.4. You can do this by replacing the argument contours_at = NULL with this:
contours_at = c(10, 9.5, 9.4))
Add more contours to build a fence tighter and tighter around the argmin. When the fenced region is tiny, you can read off the min from the contour label. (Remember, the “argmin” is the value of the inputs
- The following are values for the output of the function where you might be able to draw a contour. Which one of the values is the smallest for which a contour appears?
question id: duck-tell-laundry-5
- From your contour plot, read off the values of
and that produce the minimum surface area for a 1.3 cubic-foot box. What are they? (Hint: You may need to zoom in on the domain to get the precision needed to answer the question.)
question id: duck-tell-laundry-6
It is easy enough for a person to look at a contour plot and roughly locate the argmin. But this is not feasible if there are more than two inputs to the function being optimized. For such functions, another set of numerical techniques is used, based on the gradient of the objective function. Remember that the gradient at any point is a vector that points in the uphill direction and whose length is proportional to the steepness of the slope. (Skiers, beware. In skiing what people call the gradient is the steepest downhill direction. This might account for all the mathematicians learning to ski who point their skis uphill in response to the ski instructor’s instruction!)
You can display the gradient on the plot of the area function by piping (remember |>) the contour plot into the commented-out command in the sandbox. (Also, replace #pipe to with |>.)
- Since the end of the term is coming, here is a question that might be a good review for the final. Which of these describes the relationship between the gradient vector and the contours?
On a contour, the gradient vector is perpendicular (“orthogonal”) to the contour.
On a contour, the gradient vector has zero length.
There is no definite relationship; it depends on the function itself.
On a contour, the gradient vector has a length proportional to the contour level.
question id: duck-tell-laundry-7
- Which of these best describes the gradient vector at the argmin?
The gradient points due North.
The length of the gradient vector is maximal.
The length of the gradient vector is minimal.
The length of the gradient vector is zero.
question id: duck-tell-laundry-8
Many numerical optimization techniques are based on an idea like this: treat the field of gradient vectors as a flow field in a differential equation. Starting at some initial value, follow the gradient vectors (as you did in sketching the trajectory in a flow field). If seeking a maximum, the flow will be in the direction of the gradient. If seeking a minimum, the flow will be opposite the direction of the gradient. It is not necessary to calculate the gradient everywhere; you just have to calculate it at the present point on your trajectory to know which way to go next.
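As a rough illustration of following the flow toward a minimum, here is a minimal base R sketch. The objective f(), the starting point, and the step size are all hypothetical choices made for this example; they are not the box-area function from the exercise.

```r
# Hypothetical bowl-shaped objective (an assumption for illustration only).
f <- function(x, y) (x - 1)^2 + 2 * (y + 0.5)^2

# Numerical gradient at a single point, by centered finite differences.
grad_at <- function(x, y, h = 1e-6) {
  c((f(x + h, y) - f(x - h, y)) / (2 * h),
    (f(x, y + h) - f(x, y - h)) / (2 * h))
}

# Follow the flow: step opposite the gradient because we seek a minimum.
pt   <- c(3, 2)   # arbitrary starting point
step <- 0.1       # assumed step size; too large a value can overshoot
for (i in 1:200) {
  g  <- grad_at(pt[1], pt[2])   # gradient only at the present point
  pt <- pt - step * g
}
pt                                   # close to the argmin (1, -0.5)
sqrt(sum(grad_at(pt[1], pt[2])^2))   # length of the gradient at the end point
```

Note that the gradient is evaluated only at the current point of the trajectory, never over the whole domain. To seek a maximum instead, the update would add the step rather than subtract it.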
Occasionally, particularly in textbook problems, the argmin or argmax is found algebraically. This still involves calculating the gradient as a function of the inputs. Then, find the inputs that make all the components of the gradient vector zero.
- Which of these formulas give the gradient vector of
?
question id: duck-tell-laundry-9
- If the lengths
, , are measured in feet, what unit will be in?
question id: duck-tell-laundry-10
For those of you who are pining for algebra problems, here you go.
Taking the gradient of
- Which of these is the correct formula for the optimal
as a function of ? (Hint: You can weed out one of the choices by checking for dimensional consistency.)
question id: duck-tell-laundry-11
The solution for
Now compute the formula for the optimal value
- Which of these is the correct formula for the optimal
as a function of ?
question id: duck-tell-laundry-12
It turns out that
- Which of these is an appropriate explanation for why
is larger than or ?
People don’t like using boxes that are perfect cubes.
The flaps need to get longer as
The flap-length does not depend on
question id: duck-tell-laundry-13
Exercise 3
You and your pet dog Swimmer often go to the beach and walk along the water’s edge. You throw a ball down the beach, but at an angle so it lands in the water. Swimmer goes to work. She runs down the beach (fast) and then plunges into the water, heading toward the ball. She can run fast on the beach: 400 m/minute. But she swims rather slower: 50 m/min.
Suppose you threw the ball to a point about 50 meters down the beach and 10 meters out in the water. The overall distance to the ball is therefore

Can Swimmer do better? You can set up the calculation like this. Imagine
Time_to_ball()
takes one argument, the distance
What’s the optimal running distance for Swimmer?
question id: optim-violet-1
Here’s a news story about a mathematician’s dog on the shore of Lake Michigan. It is not plausible that Swimmer has been trained in calculus. Perhaps the way Swimmer solves the running distance problem is simply to graph time_to_ball(x) ~ x over a suitable domain and find the argmin by eye!
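One possible way to set up the calculation in base R is sketched below. It assumes that x is the distance Swimmer runs along the water's edge before swimming in a straight line to the ball, which sits 50 m down the beach and 10 m out; that geometry is an interpretation of the problem statement, not something taken from the sandbox code.

```r
# Assumed geometry: run x meters along the water's edge, then swim straight
# to the ball, which is 50 m down the beach and 10 m out in the water.
time_to_ball <- function(x) {
  run_time  <- x / 400                        # minutes, at 400 m/min
  swim_time <- sqrt((50 - x)^2 + 10^2) / 50   # minutes, at 50 m/min
  run_time + swim_time
}

# Base R's optimize() searches an interval for the minimum of a 1-D function.
best <- optimize(time_to_ball, interval = c(0, 50))
best$minimum      # argmin: the running distance x, in meters
best$objective    # min: the total time, in minutes
time_to_ball(0)   # for comparison: swimming directly from the starting point
```

Plotting time_to_ball() over the interval 0 to 50 and reading the low point by eye should agree with the value that optimize() reports.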
Exercise 4
If you’re skeptical that a dog might do a calculus problem (as in Exercise 3) before running to fetch a ball, consider the path taken by a photon. “Fermat’s Principle” is that light takes the path of least time. To illustrate, consider the problem of a photon traveling from a point A to a point B, as in the diagram. The shortest path between the two points is a straight line. Along this straight-line path, the time taken by the photon will be the distance divided by the speed of light.
The diagram shows another path consisting of two segments, one of length
The reason the indirect path might take less time is that the speed of light differs in different physical media. Light traveling in a vacuum famously has a speed of about 300,000 km per second. In air, the speed is smaller by a factor of 1/1.003. In water, the speed is smaller still: the factor is 1/1.3.
Imagine that the blue zone of the diagram is water and the clear zone air. The time for the photon to travel from point A to B is proportional to
To see the path taken by light, let’s imagine that point A is
- Which of these formulas gives the total time it takes for light to traverse the path from A to P at relative speed 1/1.003 and then the path from P to B at relative speed 1/1.3? A is located at
, B is located at , and P is located at
question id: optim-pink-1
Implement the calculation of total_time() in Active R chunk 3, then use a graph to find the argmin.
- What value of
(that is, the argmin) minimizes the travel time of light between points A and B? (Choose the best answer from the slice plot you made in Active R chunk 3)
question id: optim-pink-2
- Suppose that instead of being water, the blue area was glass. The speed of light in glass is roughly 1/1.5 times as big as in vacuum. What value of
minimizes the travel time of light between points A and B? (Choose the best answer)
question id: optim-pink-3
- At the argmin for
in the travel time, what will be the derivative of travel time with respect to ?
question id: optim-pink-4
- At the argmin for
in the travel time, what will be the second derivative of travel time with respect to ?
question id: optim-pink-5
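The following base R sketch shows one way such a travel-time function could be written and minimized. The coordinates of A and B and the placement of the water surface along y = 0 are hypothetical stand-ins, since the positions used in the exercise are not reproduced here; only the factors 1.003 and 1.3 come from the text.

```r
# Hypothetical coordinates (NOT the positions used in the exercise).
A <- c(0, 1)     # in air, above the water surface at y = 0
B <- c(3, -2)    # in water, below the surface
n_air   <- 1.003 # light speed in air is 1/1.003 of the vacuum speed
n_water <- 1.3   # light speed in water is 1/1.3 of the vacuum speed

# Time on each leg is proportional to (leg length) * n. P is at (x, 0).
total_time <- function(x) {
  n_air   * sqrt((x - A[1])^2 + A[2]^2) +
  n_water * sqrt((B[1] - x)^2 + B[2]^2)
}

best   <- optimize(total_time, interval = c(A[1], B[1]), tol = 1e-10)
x_star <- best$minimum

# Consistency check with Snell's law: n_air*sin(angle in air) should match
# n_water*sin(angle in water), with angles measured from the vertical.
sin_air   <- (x_star - A[1]) / sqrt((x_star - A[1])^2 + A[2]^2)
sin_water <- (B[1] - x_star) / sqrt((B[1] - x_star)^2 + B[2]^2)
c(n_air * sin_air, n_water * sin_water)
```

The agreement of the last two numbers is another way of saying that the derivative of travel time with respect to x is zero at the argmin. Changing n_water to 1.5 gives the corresponding calculation for glass.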
Exercise 5
The graphic drawn by Active R chunk 4 shows a lens together with a source and target point. Functions are used to define the top and bottom surface of the lens. Light passing through the lens is refracted. The path followed by the light will be the one with the shortest time of transit from source to target. But Active R chunk 4 initially is set up to trace a non-optimal, out-of-the-way path running through
Note: The plot_lens() function has been provided for you, but it’s too computer-ese to bother showing you how it’s defined.
The light enters the lens at some point
To do this, we find the distance from the source to
Of course, light also has to travel through the lens. We will make the lens out of glass with a high refractive index, so the transit time will be the distance from
The objective function will be the sum of the three legsβ transit times. It is already programmed for you in the sandbox. So is the command to make a contour plot of the output of the objective function over a domain of
- When the index of refraction of the lens is 1.80, what are the optimal values for
and ? (Choose the closest answer.)
question id: optim-purple-1
It is good practice to test software against situations where you know the right answer. A simple situation is when there is no lens at all. One way to do this is to change the middle line of transit_time() so that the index of refraction is 1.03, just like the surrounding air.
- When the index of refraction of the lens is 1.03, what are the optimal values for
and ? (Choose the best answer.)
question id: optim-purple-2
Let’s explore an extreme situation. Diamond is the transparent material that has the highest index of refraction, 2.417. Imagine a material with an index of refraction of 10. This means that light will travel very slowly within the lens.
When you examine the contour plot of transit_time() for this high index of refraction, there will be two widely separated local minima. Explain briefly which part of the lens these two minima correspond to. Hint: High index means slow speed of light. Sometimes it is worthwhile to go out of your way to avoid slowdowns.
Exercise 6
Your uncle Bob is writing a business plan for a tree farm for lumber. Having heard that you are taking Math 141Z, he emails you, giving some information and asking for some numbers. In particular, Bob sends you a report saying that, for the species of tree he plans to plant, the amount of usable lumber is a function of growth time lumber()
:
Bob has heard that the time to harvest is when the tree is growing fastest.
- What is the value of t (in years) at which lumber(t) is largest?
question id: optim-green-A
You patiently explain to your uncle that you certainly do not want to harvest trees when they are growing the fastest. You say, “You want to wait until the average growth rate up to that point is fastest. That will be a little while before the tree reaches its adult volume.”
In Active R chunk 7, give the expression for the average growth rate function in terms of lumber() and t. Keep in mind the starting point is 0 lumber at time 0.
- What is the value of t (in years) for which the average growth rate, up to that time, is fastest?
question id: optim-green-B
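To make the distinction between "growing fastest" and "fastest average growth" concrete, here is a base R sketch. The lumber() defined below is a made-up S-shaped stand-in, not the function from Bob's report, so the numbers it produces are not the answers to the questions above. The average growth rate is taken here as the total lumber divided by the elapsed time, starting from 0 lumber at time 0.

```r
# Hypothetical stand-in for lumber(); the function in Bob's report is not
# shown here. It has lumber(0) = 0 and levels off at an adult volume of 100.
lumber <- function(t) 100 * t^3 / (t^3 + 20^3)

# Instantaneous growth rate, by a centered finite difference.
growth_rate <- function(t, h = 1e-4) {
  (lumber(t + h) - lumber(t - h)) / (2 * h)
}

# Average growth rate from time 0 (with 0 lumber) up to time t.
ave_growth <- function(t) lumber(t) / t

t_fastest <- optimize(growth_rate, interval = c(1, 60), maximum = TRUE)$maximum
t_harvest <- optimize(ave_growth,  interval = c(1, 60), maximum = TRUE)$maximum
c(fastest_growth = t_fastest, best_average = t_harvest)
```

For this stand-in, the best average comes later than the moment of fastest growth, which is the point of the explanation to Uncle Bob.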
Exercise 7
It often happens in decision making that there are multiple criteria. For instance, in selecting cadets for pilot training, two obvious criteria are the cadet’s demonstrated flying aptitude and the leadership potential of the cadet. Let’s assume that the merit
Currently, the merit score is a simple function of the
The general in charge of the training program is not satisfied with the current merit function. “I’m getting too many cadets who are great leaders but poor pilots, and too many pilot hot-shots who are poor leaders. I would rather have a good pilot who is a good leader than have a great pilot who is a poor leader or a poor pilot who is a great leader.” (You might reasonably agree or disagree with this point of view, but the general is in charge.)
You’ve been tasked to develop an improved formula for the merit score
A low-order polynomial model, without quadratic terms, is
You are trying to decide whether
Take a few minutes now to think about how you would decide whether
Using the low-order polynomial model, find algebraically these two partial derivatives
and
- Which of these possibilities is true about
as a function of ?
question id: rooster-red-1
- Which of these possibilities is true about
as a function of ?
question id: rooster-red-2
- Which one of these describes the relationship between
as a function of and as a function of ? (Hint: Remember that and are specified as being positive.)
If
If
Neither of the above needs to be true.
question id: rooster-red-3
Think now about the general’s statement and how to translate it into mathematical terms. Here’s a hint: Imagine Drew has very high
- Based on the general’s statement, do you want
to be positive, negative, or zero?
Positive
Zero
Negative
It has nothing to do with the general’s statement.
question id: rooster-red-4
Exercise 8
Based on an extensive but fictive observation of activity and grades of college students, the model shown in Figure 4 was constructed to give GPA as a function of the number of hours each weekday (Monday-Friday) spent studying and spent in social activity and play. (Activity during the weekend was not monitored.)
Several points in the graphic frame have been marked with letters. Refer to these letters when answering the following questions.
- According to the model, what’s the optimal combination of Study and Play to achieve a high GPA?
question id: sh1-1
- Which of these letters marks a place on the graph where the partial derivative of GPA with respect to Play is positive?
question id: sh1-2
- Which of these letters marks a place on the graph where the partial derivative of GPA with respect to Play is negative?
question id: sh1-3
- Where is the partial derivative with respect to Study negative?
question id: sh1-4
Study and Play are not the only activities possible. Sleep is important, too, as are meals, personal care, etc. In the study, students were observed who spent up to 22 hours per day in Study or Play. Presumably, such students crashed on the weekend.
- Suppose you decide to budget 12 hours each weekday in activities other than Study and Play. Which letter labels the constrained optimal mix (argmax) of Study and Play?
question id: sh1-5
- What is the “shadow price” of GPA with respect to the budget for a budget constraint of 12 hours? Give both an estimated numerical value and the units.
-0.8 hour/gradepoints
0.3 gradepoints/hour
0.9 gradepoints/hour
1.3 hour/gradepoints
question id: sh1-6
- Consider a student who budgets 22 hours per day for Study and Play. Which letter is closest to the constrained argmax with a 22-hour constraint?
question id: sh1-7
- What is the “shadow price” of GPA with respect to the budget constraint of 22 hours? Give the estimated numerical value.
-0.4 gradepoints/hour
0 gradepoints/hour
0.4 gradepoints/hour
1.0 gradepoints/hour
question id: sh1-8
- Based on the shadow price from the previous question, which of these is the best advice to give the student (who seeks to maximize GPA)?
You’re hopeless. There aren’t enough hours in the day for you to get a good GPA.
You’ve got to squeeze out more effort studying. Give it your all!
Play more, study less!
Study less
Study less, play less. Sleep!
question id: sh1-9
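The idea of a shadow price can be sketched numerically: it is the rate at which the constrained maximum changes as the budget is loosened. The base R code below uses a made-up gpa() function and made-up budgets of 10 and 14 hours; neither comes from Figure 4, so the results say nothing about the answers above.

```r
# Hypothetical GPA model (an assumption; NOT the model drawn in the figure).
gpa <- function(study, play) 4 - 0.05 * (study - 8)^2 - 0.03 * (play - 4)^2

# Constrained maximum GPA when study + play must equal the budget (hours/day).
best_gpa <- function(budget) {
  on_constraint <- function(study) gpa(study, budget - study)
  optimize(on_constraint, interval = c(0, budget), maximum = TRUE)$objective
}

# Shadow price: change in the constrained maximum per extra hour of budget,
# estimated with a centered finite difference.
shadow_price <- function(budget, h = 0.01) {
  (best_gpa(budget + h) - best_gpa(budget - h)) / (2 * h)
}

shadow_price(10)   # gradepoints per hour; positive here
shadow_price(14)   # gradepoints per hour; negative here
```

The units follow from the definition: output units (gradepoints) divided by budget units (hours). A negative shadow price signals that the budget itself is too large, so that shrinking it would raise the constrained maximum.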
Exercise 9
Consider the problem faced by a propane distribution company manager: find the optimal radius of a cylindrical tank with spherical ends. The point is to choose the sphere radius
- Which of these is correct? (Hint: Only one of the answers is dimensionally consistent.)
question id: tb1-1
- Which of these is the correct expression for
question id: tb1-2
- Find
and set to zero. Solve for in terms of . Which of these is correct?
question id: tb1-3
- What is the optimal value of
in cm to a precision of one micron?
question id: tb1-4
- Find the optimum value of
to minimize when liters.
question id: tb1-5
Use Active R chunk 8 to plot a graph of
- From the graph of
versus at liters, read off a range of that produces no worse than 1% greater than the minimum. How wide is that range, approximately?
question id: tb1-6
Activities
Exercise 10
The key steps in optimization are setting up the objective function(s) and setting constraints as needed to represent the problem at hand. There are many ways to perform the work to extract the argmax once the objective function and constraints are set.
Understandably, calculus textbooks tend to emphasize techniques based on finding an input where the derivative of the objective function is zero. For problems involving multiple inputs, the task is to find an input where the gradient vector is zero.
Contemporary work often involves problems with tens, hundreds, thousands, or even millions of inputs. Even in such large problems, the mechanics of finding the corresponding gradient vector are straightforward. Searching through a high-dimensional space, however, is not generally a task that can be accomplished using calculus tools. Instead, starting in the 1940s, great creativity has been applied to develop algorithms with names like linear programming, quadratic programming, dynamic programming, etc., many of which are based on ideas from linear algebra such as the qr.solve() algorithm for solving the target problem, or ideas from statistics and statistical physics that incorporate randomness as an essential component. An entire field, operations research, focuses on setting up and solving such problems. Building appropriate algorithms requires deep understanding of several areas of mathematics. But using the methods is mainly a matter of knowing how to set up the problem and communicate the objective function, constraints, etc. to a computer.
Purely as an example, let’s examine the operation of an early algorithmic optimization method: Nelder-Mead, dating from the mid-1960s. (There are better, faster methods now, but they are harder to understand.)
Nelder-Mead is designed to search for maxima of objective functions with
Start by selecting
Evaluate the objective function at the vertices of the simplex. One of the vertices will have the lowest score for the output of the objective. From that vertex, project a line through the midpoint of the fence segment defined by the other
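Base R's optim() function includes a Nelder-Mead implementation, so the method can be tried without programming the simplex steps by hand. The sketch below uses a hypothetical two-input objective with a known peak; the objective and the starting point are assumptions chosen so that the result is easy to check.

```r
# Hypothetical objective with a single peak at (1, -2); for illustration only.
objective <- function(p) -((p[1] - 1)^2 + (p[2] + 2)^2)

# optim() minimizes by default; fnscale = -1 turns the search into a
# maximization, matching the "search for maxima" description.
res <- optim(par = c(0, 0), fn = objective, method = "Nelder-Mead",
             control = list(fnscale = -1))
res$par      # argmax located by the simplex search
res$value    # value of the objective at that point
res$counts   # how many times the objective function was evaluated
```

Notice that only values of the objective function are used; no derivative or gradient is supplied anywhere in the call.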
Exercise 11
When we locate an argmax, it’s helpful to have some guidance about how much precision makes practical sense. We can figure this out using the second derivative of the function at the argmax.
Consider a function f(x) and its curvature at an argmax. The curvature is the reciprocal of the radius of a circle tangent to the function. That circle approximates the function itself. At an argmax, the highest point in the circle will be right on the function. But a few degrees to either side of the highest point will be very close to the highest point. How wide will the chord of the circle be that connects the points 5 degrees to either side of the highest point? Your answer will involve sin() but also the curvature of the function.
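The setup can be explored numerically with a short base R sketch. The function f() below is a hypothetical example with a known argmax. The calculation relies on two facts: at an argmax the first derivative is zero, so the curvature equals the absolute value of the second derivative; and a chord spanning 5 degrees to either side of the top of a circle has length 2 × radius × sin(5°).

```r
# Hypothetical function with its argmax at x = 3 (for illustration only).
f <- function(x) 5 - (x - 3)^2
x_star <- 3

# Second derivative at the argmax, by a centered finite difference.
h  <- 1e-4
f2 <- (f(x_star + h) - 2 * f(x_star) + f(x_star - h)) / h^2

# At an argmax f'(x) = 0, so curvature = |f''| and radius = 1 / curvature.
radius <- 1 / abs(f2)

# Chord joining the points 5 degrees to either side of the top of the circle.
chord <- 2 * radius * sin(5 * pi / 180)
chord

# How far below the maximum is f at the ends of that chord?
f(x_star) - f(x_star + chord / 2)
```

The last line shows how little the output changes across the whole width of the chord, which is why locating the argmax much more precisely than that width rarely matters in practice.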