Now comes one of the cornerstones of calculus: the Mean Value Theorem. It connects the local picture (slope at a point) to the global picture (average slope across an interval). In other words, it relates dƒ/dx to Δƒ/Δx. Calculus depends on this connection, which we saw first for velocities. If the average velocity is 75, is there a moment when the instantaneous velocity is 75?
Without more information, the answer to that question is no. The velocity could be 100 and then 50—averaging 75 but never equal to 75. If we allow a jump in velocity, it can jump right over its average. At that moment the velocity does not exist. (The distance function in Figure 3.26a has no derivative at x = 1.) We will take away this cheap escape by requiring a derivative at all points inside the interval.
In Figure 3.26b the distance increases by 150 when t increases by 2. There is a derivative df/dt at all interior points (but an infinite slope at t = 0). The average velocity is
Δƒ/Δt = (ƒ(2) − ƒ(0))/(2 − 0) = 150/2 = 75.
[Fig. 3.26]
The conclusion of the theorem is that dƒ/dt = 75 at some point inside the interval. There is at least one point where ƒ'(c) = 75.
This is not a constructive theorem. The value of c is not known. We don't find c, we just claim (with proof) that such a point exists.
3M Mean Value Theorem Suppose ƒ(x) is continuous in the closed interval
a ≤ x ≤ b and has a derivative everywhere in the open interval a < x < b. Then
(ƒ(b) − ƒ(a)) / (b − a) = ƒ'(c) at some point a < c < b. (1)
The left side is the average slope Δƒ/Δx. It equals dƒ/dx at c. The notation for a closed interval (with endpoints) is [a, b]. For an open interval (without endpoints) we write (a, b). Thus ƒ' is defined in (a, b), and ƒ remains continuous at a and b. A derivative is allowed at those endpoints too, but the theorem doesn't require it.
The proof is based on a special case—when ƒ(a) = 0 and ƒ(b) = 0. Suppose the function starts at zero and returns to zero. The average slope or velocity is zero. We have to prove that ƒ'(c)= 0 at a point in between. This special case (keeping the assumptions on ƒ(x)) is called Rolle's theorem.
Geometrically, if ƒ goes away from zero and comes back, then ƒ' = 0 at the turn.
3N Rolle's theorem Suppose ƒ(a) = ƒ(b) = 0 (zero at the ends). Then ƒ'(c) =0 at some point with a < c < b.
Proof At a point inside the interval where ƒ(x) reaches its maximum or minimum, df/dx must be zero. That is an acceptable point c. Figure 3.27a shows the difference between ƒ = 0 (assumed at a and b) and ƒ' = 0 (proved at c).
Small problem: The maximum could be reached at the ends a and b, if ƒ(x) < 0 in between. At those endpoints dƒ/dx might not be zero. But in that case the minimum is reached at an interior point c, which is equally acceptable. The key to our proof is that a continuous function on [a, b] reaches its maximum and minimum. This is the Extreme Value Theorem.*
It is ironic that Rolle himself did not believe the logic behind calculus. He may not have believed his own theorem! Probably he didn't know what it meant: the language of "evanescent quantities" (Newton) and "infinitesimals" (Leibniz) was exciting but frustrating. Limits were close but never reached. Curves had infinitely many flat sides. Rolle didn't accept that reasoning, and what was really serious, he didn't accept the conclusions. The Académie des Sciences had to stop his battles (he fought against ordinary mathematicians, not Newton and Leibniz). So he went back to number theory, but his special case when ƒ(a) = ƒ(b) = 0 leads directly to the big one.
[Fig. 3.27]
Proof of the Mean Value Theorem We are looking for a point where dƒ/dx equals Δƒ/Δx. The idea is to tilt the graph back to Rolle's special case (when Δƒ was zero). In Figure 3.27b, the distance F(x) between the curve and the dotted secant line comes from subtraction:
F(x) = ƒ(x) − [ƒ(a) + (Δƒ/Δx)(x − a)]. (2)
At a and b, this distance is F(a) = F(b) = 0. Rolle's theorem applies to F(x). There is an interior point where F'(c) = 0. At that point take the derivative of equation (2):
0 = ƒ'(c) − (Δƒ/Δx). The desired point c is found, proving the theorem.
EXAMPLE 1 The function ƒ(x) = √x goes from zero at x = 0 to ten at x = 100. Its average slope is Δƒ/Δx = 10/100. The derivative ƒ'(x) = 1/(2√x) exists in the open interval (0, 100), even though it blows up at the end x = 0. By the Mean Value Theorem there must be a point where 10/100 = ƒ'(c) = 1/(2√c). That point is c = 25.
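As a numerical check of Example 1, the sketch below locates c by bisection on ƒ'(c) − Δƒ/Δx. The helper name `find_mvt_point` and the starting bracket are assumptions made for illustration; the theorem itself only guarantees that c exists.

```python
import math

def find_mvt_point(fprime, avg_slope, lo, hi, tol=1e-12):
    """Bisection for fprime(c) = avg_slope, assuming fprime(x) - avg_slope
    changes sign between lo and hi (true here for 1/(2*sqrt(x)))."""
    g = lambda x: fprime(x) - avg_slope
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid          # the root lies in [lo, mid]
        else:
            lo = mid          # the root lies in [mid, hi]
    return 0.5 * (lo + hi)

f = math.sqrt
fprime = lambda x: 1.0 / (2.0 * math.sqrt(x))
a, b = 0.0, 100.0
avg_slope = (f(b) - f(a)) / (b - a)      # 10/100

# Start the bracket slightly inside (0, 100) because f' blows up at x = 0.
c = find_mvt_point(fprime, avg_slope, 1e-6, b)
print(c)
```

The printed value should agree with c = 25 to many digits. Bisection works here only because ƒ' is decreasing on the interval; for a general ƒ there may be several valid points c.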
The truth is that nobody cares about the exact value of c. Its existence is what matters. Notice how it affects the linear approximation ƒ(x) ≈ ƒ(a) + ƒ'(a)(x − a), which was basic to this chapter. Close becomes exact ( ≈ becomes = ) when ƒ' is computed at c instead of a:
3O The derivative at c gives an exact prediction of ƒ(x):
ƒ(x) = ƒ(a) + ƒ'(c)(x − a). (3)
The Mean Value Theorem is rewritten here as Δƒ = ƒ'(c)Δx. Now a < c < x.
EXAMPLE 2 The function ƒ(x)= sin x starts from ƒ(0)= 0. The linear prediction (tangent line) uses the slope cos 0 = 1. The exact prediction uses the slope cos c at an unknown point between 0 and x:
(approximate) sin x ≈ x (exact) sin x = (cos c)x. (4)
The approximation is useful, because everything is computed at x = a = 0. The exact formula is interesting, because cos c ≤ 1 proves again that sin x ≤ x. The slope is below 1, so the sine graph stays below the 45° line.
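For the sine function the unknown point c can be computed, which is unusual. The sketch below solves sin x = (cos c)x for c and confirms that c lies between 0 and x. The function name `mvt_point_for_sine` and the restriction 0 < x ≤ π are assumptions chosen so that the inverse cosine returns the point inside the interval.

```python
import math

def mvt_point_for_sine(x):
    """Solve sin x = (cos c) * x for c, assuming 0 < x <= pi."""
    return math.acos(math.sin(x) / x)

for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    c = mvt_point_for_sine(x)
    # The exact prediction (cos c) * x should reproduce sin x.
    print(f"x = {x}: c = {c:.6f}, (cos c)x = {math.cos(c) * x:.6f}, sin x = {math.sin(x):.6f}")
```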
EXAMPLE 3 If ƒ'(c) = 0 at all points in an interval then ƒ(x) is constant.
Proof When ƒ' is everywhere zero, the theorem gives Δƒ = 0. Every pair of points has ƒ(b) = ƒ(a). The graph is a horizontal line. That deceptively simple case is a key to the Fundamental Theorem of Calculus.
Most applications of Δƒ = ƒ'(c)Δx do not end up with a number. They end up with another theorem (like this one). The goal is to connect derivatives (local) to differences (global). But the next application—l'Hôpital's Rule—manages to produce a number out of 0/0.
When ƒ(x) and g(x) both approach zero, what happens to their ratio ƒ(x)/g(x)?
ƒ(x)/g(x) = x²/x or (sin x)/x or (x − sin x)/(1 − cos x) all become 0/0 at x = 0.
Since 0/0 is meaningless, we cannot work separately with ƒ(x) and g(x). This is a "race toward zero," in which two functions become small while their ratio might do anything. The problem is to find the limit of ƒ(x)/g(x).
One such limit is already studied. It is the derivative! Δƒ/Δx automatically builds in a race toward zero, whose limit is dƒ/dx:
ƒ(x) − ƒ(a) → 0 and x − a → 0, but lim_{x→a} (ƒ(x) − ƒ(a))/(x − a) = ƒ'(a). (5)
The idea of l'Hôpital is to use ƒ'/g' to handle ƒ/g. The derivative is the special case g(x) = x − a, with g' = 1. The Rule is followed by examples and proofs.
3P l'Hôpital's Rule Suppose ƒ(x) and g(x) both approach zero as x → a. Then ƒ(x)/g(x) approaches the same limit as ƒ'(x)/g'(x), if that second limit exists:
lim_{x→a} ƒ(x)/g(x) = lim_{x→a} ƒ'(x)/g'(x). Normally this limit is ƒ'(a)/g'(a). (6)
This is not the quotient rule! The derivatives of ƒ(x) and g(x) are taken separately. Geometrically, l'Hôpital is saying that when functions go to zero their slopes control their size. An easy case is ƒ = 6(x − a) and g = 2(x − a). The ratio ƒ/g is exactly 6/2, the ratio of their slopes. Figure 3.28 shows these straight lines dropping to zero, controlled by 6 and 2.
[Fig. 3.28]
The next figure shows the same limit 6/2, when the curves are tangent to the lines. That picture is the key to l'Hôpital's rule.
Generally the limit of ƒ/g can be a finite number L or +∞ or −∞. (Also the limit point x = a can represent a finite number or +∞ or −∞. We keep it finite.) The one absolute requirement is that ƒ(x) and g(x) must separately approach zero; we insist on 0/0. Otherwise there is no reason why equation (6) should be true. With ƒ(x) = x and g(x) = x − 1, don't use l'Hôpital:
ƒ(x)/g(x) → a/(a − 1) but ƒ'(x)/g'(x) = 1/1.
Ordinary ratios approach lim ƒ(x) divided by lim g(x). l'Hôpital enters only for 0/0.
EXAMPLE 4 (an old friend) lim_{x→0} (1 − cos x)/x equals lim_{x→0} (sin x)/1. This equals zero.
EXAMPLE 5 ƒ/g = tan x / sin x leads to ƒ'/g' = sec² x / cos x. At x = 0 the limit is 1/1.
EXAMPLE 6 ƒ/g = (x − sin x) / (1 − cos x) leads to ƒ'/g' = (1 − cos x) / sin x. At x = 0 this is still 0/0.
Solution Apply the Rule to ƒ'/g'. It has the same limit as ƒ"/g": if ƒ/g → 0/0 and ƒ'/g' → 0/0 then compute ƒ"/g" = sin x /cos x → 0/1 = 0.
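The limits in Examples 4 to 6 can be checked by evaluating each ratio ƒ/g at points approaching zero. This is only a numerical sanity check under the assumption that double precision is adequate at these step sizes; much smaller steps would suffer from cancellation in 1 − cos x and x − sin x.

```python
import math

# Each entry pairs the ratio f/g with the limit found by l'Hopital's Rule.
examples = {
    "Example 4": (lambda x: (1 - math.cos(x)) / x, 0.0),
    "Example 5": (lambda x: math.tan(x) / math.sin(x), 1.0),
    "Example 6": (lambda x: (x - math.sin(x)) / (1 - math.cos(x)), 0.0),
}

for name, (ratio, limit) in examples.items():
    values = [ratio(h) for h in (1e-1, 1e-2, 1e-3)]
    print(name, values, "expected limit:", limit)
```

In Examples 4 and 6 the printed values shrink roughly in proportion to the step, consistent with a limit of zero.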
The reason behind l'Hôpital's Rule is that the following fractions are the same:
ƒ(x)/g(x) = [(ƒ(x) − ƒ(a))/(x − a)] / [(g(x) − g(a))/(x − a)]. (7)
That is just algebra; the limit hasn't happened yet. The factors x − a cancel, and the numbers ƒ(a) and g(a) are zero by assumption. Now take the limit on the right side of (7) as x approaches a.
What normally happens is that one part approaches ƒ' at x = a. The other part approaches g'(a). We hope g'(a) is not zero. In this case we can divide one limit by the other limit. That gives the "normal" answer
lim_{x→a} ƒ(x)/g(x) = limit of (7) = ƒ'(a)/g'(a). (8)
This is also l'Hôpital's answer. When ƒ'(x) → ƒ'(a) and separately g'(x) → g'(a), his overall limit is ƒ'(a)/g'(a). He published this rule in the first textbook ever written on differential calculus. (That was in 1696—the limit was actually discovered by his teacher Bernoulli.) Three hundred years later we apply his name to other cases permitted in (6), when ƒ'/g' might approach a limit even if the separate parts do not.
To prove this more general form of l'Hôpital's Rule, we need a more general Mean Value Theorem. I regard the discussion below as optional in a calculus course (but required in a calculus book). The important idea already came in equation (8).
Remark The basic "indeterminate" is ∞ − ∞. If ƒ(x) and g(x) approach infinity, anything is possible for ƒ(x) − g(x). We could have x² − x or x − x² or (x + 2) − x. Their limits are ∞ and -∞ and 2.
At the next level are 0/0 and ∞/∞ and 0⋅∞. To find the limit in these cases, try l'Hôpital's Rule. See Problem 24 when ƒ(x)/g(x) approaches ∞/∞. When ƒ(x) → 0 and g(x) → ∞, apply the 0/0 rule to ƒ(x)/(1/g(x)).
The next level has 0⁰ and 1^∞ and ∞⁰. Those come from limits of ƒ(x)^g(x). If ƒ(x) approaches 0, 1, or ∞ while g(x) approaches 0, ∞, or 0, we need more information. A really curious example is x^(1/ln x), which shows all three possibilities 0⁰ and 1^∞ and ∞⁰. This function is actually a constant! It equals e.
To go back down a level, take logarithms. Then g(x) ln ƒ(x) returns to 0/0 and 0⋅∞ and l'Hôpital's Rule. But logarithms and e have to wait for Chapter 6.
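The claim that x^(1/ln x) is constant is easy to test numerically. The sketch below evaluates it near 0 (the 0⁰ case), near 1 (the 1^∞ case), and for large x (the ∞⁰ case). The function name `curious` is an assumption for illustration, and x = 1 is excluded because ln 1 = 0.

```python
import math

def curious(x):
    """x**(1/ln x), defined for x > 0 with x != 1."""
    return x ** (1.0 / math.log(x))

for x in (1e-6, 0.5, 1.001, 10.0, 1e6):
    print(x, curious(x), math.e)
```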
The MVT can be extended to two functions. The extension is due to Cauchy, who cleared up the whole idea of limits. You will recognize the special case g = x as the ordinary Mean Value Theorem.
3Q Generalized MVT If ƒ(x) and g(x) are continuous on [a, b] and differentiable on (a, b), there is a point a < c < b where
[ƒ(b) − ƒ(a)]g'(c) = [g(b) − g(a)]ƒ'(c). (9)
The proof comes by constructing a new function that has F(a) = F(b):
F(x) = [ƒ(b) − ƒ(a)]g(x) − [g(b)− g(a)]ƒ(x).
The ordinary Mean Value Theorem leads to F'(c)= 0 —which is equation (9).
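A small worked instance may make equation (9) concrete. The choice ƒ = x², g = x³ on [1, 2] is an assumption picked so that c can be solved by hand: equation (9) becomes 3 · 3c² = 7 · 2c, giving c = 14/9.

```python
f, fprime = (lambda x: x**2), (lambda x: 2 * x)
g, gprime = (lambda x: x**3), (lambda x: 3 * x**2)
a, b = 1.0, 2.0

# Equation (9): [f(b) - f(a)] g'(c) = [g(b) - g(a)] f'(c)
# becomes 3 * 3c^2 = 7 * 2c, so c = 14/9 (the root c = 0 lies outside (1, 2)).
c = 14.0 / 9.0
lhs = (f(b) - f(a)) * gprime(c)
rhs = (g(b) - g(a)) * fprime(c)
print(c, lhs, rhs)
```

Both sides agree, and c falls strictly between a and b as the theorem requires.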
Application 1 (Proof of l'Hôpital's Rule) The rule deals with ƒ(a)/g(a) = 0/0. Inserting those zeros into equation (9) leaves ƒ(b)g'(c) = g(b)ƒ'(c). Therefore ƒ(b)/g(b) = ƒ'(c)/g'(c). (10)
As b approaches a, so does c. The point c is squeezed between a and b. The limit of equation (10) as b → a and c → a is l'Hôpital's Rule.
Application 2 (Error in linear approximation) Section 3.2 stated that the distance between a curve and its tangent line grows like (x − a)². Now we can prove this, and find out more. Linear approximation is
ƒ(x) = ƒ(a) + ƒ'(a)(x − a) + error e(x). (11)
The pattern suggests an error involving ƒ"(x) and (x − a)². The key example ƒ = x² shows the need for a factor ½ (to cancel ƒ" = 2). The error in linear approximation is
e(x) = ½ƒ"(c)(x − a)² with a<c<x. (12)
Key idea Compare the error e(x) to (x − a)². Both are zero at x = a:
e = ƒ(x) − ƒ(a) − ƒ'(a)(x − a) e' = ƒ'(x) − ƒ'(a) e" = ƒ"(x)
g = (x − a)² g' = 2(x − a) g" = 2
The Generalized Mean Value Theorem finds a point C between a and x where e(x)/g(x) = e'(C)/g'(C). This is equation (10) with different letters. After checking e'(a) = g'(a) = 0, apply the same theorem to e'(x) and g'(x). It produces a point c between a and C—certainly between a and x—where
e'(C)/g'(C) = e"(c)/g"(c) and therefore e(x)/g(x) = e"(c)/g"(c).
With g = (x − a)² and g" = 2 and e" = ƒ", the equation on the right is e(x)= ½ƒ"(c)(x − a)². The error formula is proved. A very good approximation is ½ƒ"(a)(x − a)².
EXAMPLE 7 ƒ(x) = √x near a = 100: √102 ≈ 10 + 2(1/20) + ½(-1/4000)2².
That last term predicts e = -.0005. The actual error is √102 − 10.1 = -.000495.
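The arithmetic of Example 7 can be reproduced directly. This sketch compares the actual error of the linear approximation with the prediction ½ƒ"(a)(x − a)²; the variable names are assumptions for illustration.

```python
import math

a, x = 100.0, 102.0
f = math.sqrt
fprime = lambda t: 1.0 / (2.0 * math.sqrt(t))        # 1/20 at t = 100
fsecond = lambda t: -1.0 / (4.0 * t ** 1.5)          # -1/4000 at t = 100

linear = f(a) + fprime(a) * (x - a)                  # tangent-line value 10.1
actual_error = f(x) - linear                         # e(x) in equation (11)
predicted_error = 0.5 * fsecond(a) * (x - a) ** 2    # (1/2) f''(a) (x - a)^2

print(actual_error, predicted_error)
```

The two numbers differ slightly because the exact formula (12) uses ƒ" at the unknown point c, not at a.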