
The previous section reached the definition of ∫_{a}^{b} v(x) dx. But the subject cannot stop there. The integral was defined in order to be used. Its properties are important, and its applications are even more important. The definition was chosen so that the integral has properties that make the applications possible.

One direct application is to the **average value** of v(x). The average of n numbers is
clear, and the integral extends that idea—it produces the average of a whole continuum of numbers v(x). This develops from the last rule in the following list (Property 7). We now collect together seven basic properties of definite integrals.

The addition rule for ∫ [v(x) + w(x)] dx will not be repeated—even though this property of linearity is the most fundamental. We start instead with a different kind of addition. There is only one function v(x), but now there are two intervals.

**The integral from a to b is added to its neighbor from b to c.** **Their sum is the integral from a to c.** That is the first (not surprising) property in the list.

*Property 1* Areas over neighboring intervals add to the area over the combined interval:

∫_{a}^{b} v(x) dx + ∫_{b}^{c} v(x) dx = ∫_{a}^{c} v(x) dx. (1)

This sum of areas is graphically obvious (Figure 5.11a). It also comes from the formal definition of the integral. Rectangular areas obey (1)—with a meshpoint at x = b to make sure. When Δx_{max} approaches zero, their limits also obey (1). All the normal rules for rectangular areas are obeyed in the limit by integrals.
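Those rectangular sums can be watched obeying (1) directly. Below is a minimal numerical sketch; the choices v(x) = x², the interval [0, 2], and the meshpoint b = 1 are illustrative, not from the text:

```python
# Sketch: left-endpoint Riemann sums obey Property 1 when x = b is a
# meshpoint, so their limits -- the integrals -- obey it too.
def riemann_sum(v, a, b, n):
    """Left-endpoint Riemann sum of v over [a, b] with n equal steps."""
    dx = (b - a) / n
    return sum(v(a + i * dx) for i in range(n)) * dx

v = lambda x: x ** 2                      # illustrative choice of v(x)
left  = riemann_sum(v, 0.0, 1.0, 1000)    # from a = 0 to b = 1
right = riemann_sum(v, 1.0, 2.0, 1000)    # from b = 1 to c = 2
whole = riemann_sum(v, 0.0, 2.0, 2000)    # from a = 0 to c = 2, meshpoint at 1
assert abs((left + right) - whole) < 1e-9
```

With the meshpoint at b, the two short sums use exactly the rectangles of the long one, so the agreement is exact up to rounding.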

Property 1 is worth pursuing. It indicates how to define the integral when a = b. The integral "from b to b" is the area over a point, which we expect to be zero. It is.

*Property 2* ∫_{b}^{b} v(x) dx = 0

That comes from Property 1 when c = b. Equation (1) has two identical integrals, so the one from b to b must be zero. Next we see what happens if c = a—which makes the second integral go from b to a.

What happens when **an integral goes backward**? The "lower limit" is now the larger number b. The "upper limit" a is smaller. Going backward reverses the sign:

*Property 3* ∫_{b}^{a} v(x) dx = −∫_{a}^{b} v(x) dx = ƒ(a) − ƒ(b).

Proof When c = a the right side of (1) is zero. Then the integrals on the left side must cancel, which is Property 3. In going from b to a the steps Δx are negative. That justifies a minus sign on the rectangular areas, and a minus sign on the integral (Figure 5.11b). Conclusion: Property 1 holds for any ordering of a, b, c.

EXAMPLE ∫_{x}^{0} t² dt = −x³/3   ∫_{1}^{0} dt = −1   ∫_{2}^{2} dt/t = 0

*Property 4* For odd functions ∫_{-a}^{a} v(x) dx = 0. "Odd" means that v(−x) = −v(x).

For even functions ∫_{-a}^{a} v(x) dx = 2 ∫_{0}^{a} v(x) dx. "Even" means that v(−x) = +v(x).

The functions x, x³, x⁵, … are odd. If x changes sign, these powers change sign. The functions sin x and tan x are also odd, together with their inverses. This is an important family of functions, and **the integral of an odd function from −a to a equals zero**. Areas cancel:

∫_{-a}^{a} 6x⁵ dx = [x⁶]_{-a}^{a} = a⁶ − (−a)⁶ = 0.

If v(x) is odd then ƒ(x) is even! All powers 1, x², x⁴, … are even functions. Curious fact: Odd function times even function is odd, but odd number times even number is even.

For even functions, areas add: ∫_{-a}^{a} cos x dx = sin a − sin(−a) = 2 sin a.
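Both halves of Property 4 can be spot-checked with a midpoint rule. In this sketch the test functions x⁵ (odd) and x⁴ (even) and the interval [−1, 1] are arbitrary choices:

```python
# Sketch: for an odd function the integral from -a to a cancels to zero;
# for an even function it doubles the integral from 0 to a.
def midpoint(v, a, b, n=10000):
    """Composite midpoint rule for the integral of v over [a, b]."""
    dx = (b - a) / n
    return sum(v(a + (i + 0.5) * dx) for i in range(n)) * dx

odd_part  = midpoint(lambda x: x ** 5, -1.0, 1.0)   # odd: areas cancel
even_part = midpoint(lambda x: x ** 4, -1.0, 1.0)   # even: areas add
half      = midpoint(lambda x: x ** 4,  0.0, 1.0)
assert abs(odd_part) < 1e-9
assert abs(even_part - 2 * half) < 1e-6
```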

The next properties involve inequalities. If v(x) is positive, the area under its graph is positive (not surprising). Now we have a proof: The lower sums s are positive and they increase toward the area integral. So the integral is positive:

*Property 5* If v(x) > 0 for a < x < b then ∫_{a}^{b} v(x) dx > 0.

A positive velocity means a positive distance. A positive v lies above a positive area. A more general statement is true. Suppose v(x) stays between a lower function l(x) and an upper function u(x). Then the rectangles for v stay between the rectangles for l and u. In the limit, the area under v (Figure 5.12) is between the areas under l and u:

*Property 6* If l(x) ≤ v(x) ≤ u(x) for a ≤ x ≤ b then ∫_{a}^{b} l(x) dx ≤ ∫_{a}^{b} v(x) dx ≤ ∫_{a}^{b} u(x) dx.

EXAMPLE 1 cos t ≤ 1 ⇒ ∫_{0}^{x} cos t dt ≤ ∫_{0}^{x} 1 dt ⇒ sin x ≤ x

EXAMPLE 2 1 ≤ sec² t ⇒ ∫_{0}^{x} 1 dt ≤ ∫_{0}^{x} sec² t dt ⇒ x ≤ tan x

EXAMPLE 3 Integrating 1/(1 + x²) ≤ 1 leads to tan^{-1} x ≤ x.

All those examples are for x > 0. You may remember that Section 2.4 used geometry to prove sin h < h < tan h. Examples 1-2 seem to give new and shorter proofs. But I think the reasoning is doubtful. The inequalities were needed to compute the derivatives (therefore the integrals) in the first place.
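Still, the inequalities themselves are easy to test numerically. This spot-check (not a proof, for exactly the reason above) samples points in (0, π/2):

```python
import math

# Spot-check of Examples 1-2: sin x <= x <= tan x on (0, pi/2).
for k in range(1, 100):
    x = k * (math.pi / 2) / 100       # sample points strictly inside (0, pi/2)
    assert math.sin(x) <= x <= math.tan(x)
```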

*Property 7* (**Mean Value Theorem for Integrals**) If v(x) is continuous, there is a point c between a and b where v(c) equals the average value of v(x):

v(c) = (1/(b−a)) ∫_{a}^{b} v(x) dx = "average value of v(x)." (3)

This is the same as the ordinary Mean Value Theorem (for the derivative of ƒ(x)):

ƒ'(c) = (ƒ(b) − ƒ(a))/(b − a) = "average slope of ƒ." (4)

With ƒ' = v, (3) and (4) are the same equation. But honesty makes me admit to a flaw in the logic. We need the Fundamental Theorem of Calculus to guarantee that ƒ(x) = ∫_{a}^{x} v(t) dt really gives ƒ' = v.

A direct proof of (3) places one rectangle across the interval from a to b. Now raise the top of that rectangle, starting at v_{min} (the bottom of the curve) and moving up to v_{max} (the top of the curve). At some height the area will be just right—equal to the area under the curve. Then the rectangular area, which is (b − a) times v(c), equals the curved area ∫_{a}^{b} v(x) dx. This is equation (3).

That direct proof uses the **intermediate value theorem**: A continuous function v(x) takes on every height between v_{min} and v_{max}. At some point (at two points in Figure 5.12c) the function v(x) equals its own average value.
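Because v is continuous, bisection will actually locate such a point c. In this sketch the example v(x) = x² on [0, 1] and the helper bisect are my illustrative choices; the average there is 1/3 and c = 1/√3:

```python
# Sketch: locate c with v(c) = v_ave by bisection, relying on the
# intermediate value theorem for the continuous function v(x) = x**2.
def bisect(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g changes sign there."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid                  # sign change in the left half
        else:
            lo = mid                  # sign change in the right half
    return 0.5 * (lo + hi)

v_ave = 1.0 / 3.0                     # average of x**2 over [0, 1]
c = bisect(lambda x: x ** 2 - v_ave, 0.0, 1.0)
assert abs(c - 3 ** -0.5) < 1e-9      # c = 1/sqrt(3)
```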

**Figure 5.13 shows equal areas above and below the average height v(c) = v_{ave}.**

*EXAMPLE 4* The average value of an odd function is zero (between -1 and 1):

(1/2) ∫_{-1}^{1} x dx = [x²/4]_{-1}^{1} = 1/4 − 1/4 = 0 (note 1/(b−a) = 1/2)

For once we know c. It is the center point x = 0, where v(c) = v_{ave} = 0.

*EXAMPLE 5* The average value of x² is 1/3 (between −1 and 1):

(1/2) ∫_{-1}^{1} x² dx = [x³/6]_{-1}^{1} = 1/6 − (−1/6) = 1/3 (note 1/(b−a) = 1/2)

Where does this function x² equal its average value 1/3? That happens when c² = 1/3, so c can be either of the points 1/√3 and −1/√3 in Figure 5.13b. Those are the **Gauss points**, which are terrific for numerical integration, as Section 5.8 will show.

*EXAMPLE 6* The average value of sin² x over a period (zero to π) is ½:

(1/π) ∫_{0}^{π} sin² x dx = [(x − sin x cos x)/(2π)]_{0}^{π} = 1/2 (note 1/(b−a) = 1/π)

The point c is π/4 or 3π/4, where sin² c = ½. **The graph of sin² x oscillates around its average value ½**. See the figure or the formula:

sin² x = ½ − ½ cos 2x. (5)

The steady term is ½, the oscillation is −½ cos 2x. The integral is ƒ(x) = ½x − ¼ sin 2x, which is the same as ½x − ½ sin x cos x. This integral of sin² x will be seen again. Please verify that dƒ/dx = sin² x.
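That verification can also be done numerically. A sketch (the centered differences and the test points are just a convenient checking device):

```python
import math

# Sketch: check df/dx = sin(x)**2 for f(x) = x/2 - (sin x cos x)/2,
# and that the average of sin(x)**2 over [0, pi] is 1/2.
f = lambda x: 0.5 * x - 0.5 * math.sin(x) * math.cos(x)

for x in (0.3, 1.0, 2.5):                          # arbitrary test points
    deriv = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6     # centered difference
    assert abs(deriv - math.sin(x) ** 2) < 1e-8

average = (f(math.pi) - f(0.0)) / math.pi          # (1/(b-a)) * integral
assert abs(average - 0.5) < 1e-12
```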

The "average value" from a to b is the integral divided by the length b−a. This was computed for x and x² and sin² x, but not explained. It is a major application of the integral, and it is guided by the ordinary average of n numbers:

v_{ave} = (1/(b−a)) ∫_{a}^{b} v(x) dx comes from v_{ave} = (1/n)(v_{1} + v_{2} + … + v_{n}). (6)

**Integration is parallel to summation!** Sums approach integrals. Discrete averages approach continuous averages. The average of 1/3, 2/3, 3/3 is 2/3. The average of 1/5, 2/5, 3/5, 4/5, 5/5 is 3/5. The average of n numbers from 1/n to n/n is

v_{ave} = (1/n)(1/n + 2/n + … + n/n) = (n + 1)/(2n). (7)

The middle term gives the average, when n is odd. Or we can do the addition. As n → ∞ the sum approaches an integral (do you see the rectangles?). The ordinary average of numbers becomes the continuous average of v(x) = x:

(n + 1)/(2n) → 1/2 and ∫_{0}^{1} x dx = 1/2 (note 1/(b−a) = 1)
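Watching that limit happen is a one-liner. A sketch (the sample values of n are arbitrary):

```python
# Sketch: the average of 1/n, 2/n, ..., n/n equals (n+1)/(2n),
# which approaches the continuous average 1/2 as n grows.
def discrete_average(n):
    """Ordinary average of the n numbers k/n for k = 1, ..., n."""
    return sum(k / n for k in range(1, n + 1)) / n

for n in (3, 10, 1000):
    assert abs(discrete_average(n) - (n + 1) / (2 * n)) < 1e-12
assert abs(discrete_average(10 ** 6) - 0.5) < 1e-6   # approaching 1/2
```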

In ordinary language: "The average value of the numbers between 0 and 1 is ½." Since a whole continuum of numbers lies between 0 and 1, that statement is meaningless until we have integration.

The average value of the squares of those numbers is (x²)_{ave} = ∫ x² dx / (b − a) = 1/3. **If you pick a number randomly between 0 and 1, its expected value is ½ and its expected square is 1/3.**

To me that sentence is a puzzle. First, we don't expect the number to be exactly ½—so we need to define "expected value." Second, if the expected value is ½, why is the expected square equal to 1/3 instead of 1/4? The ideas come from probability theory, and calculus is leading us to **continuous probability**. We introduce it briefly here, and come back to it in Chapter 8.

Suppose you throw a pair of dice. The outcome is not predictable. Otherwise why throw them? But the average over more and more throws is totally predictable. We don't know what will happen, but we know its probability.

For dice, we are adding two numbers between 1 and 6. The outcome is between 2 and 12. The probability of 2 is the chance of two ones: (1/6)(1/6) = 1/36. Beside each outcome we can write its probability:

2 (1/36)   3 (2/36)   4 (3/36)   5 (4/36)   6 (5/36)   7 (6/36)   8 (5/36)   9 (4/36)   10 (3/36)   11 (2/36)   12 (1/36)

To repeat, one roll is unpredictable. Only the probabilities are known, and they add to 1. (Those fractions add to 36/36; all possibilities are covered.) The total from a million rolls is even more unpredictable—it can be anywhere between 2,000,000 and 12,000,000. Nevertheless the average of those million outcomes is almost completely predictable. This expected value is found by adding the products in that line above:

**Expected value: multiply (outcome) times (probability of outcome) and add**:

2/36 + 6/36 + 12/36 + 20/36 + 30/36 + 42/36 + 40/36 + 36/36 + 30/36 + 22/36 + 12/36 = 7.

If you throw the dice 1000 times, and the average is not between 6.9 and 7.1, you get an A. Use the random number generator on a computer and round off to integers.
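That experiment is easy to run in software. A sketch (the seed and the number of throws are arbitrary choices):

```python
import random

# Sketch: the average of many throws of a pair of dice settles near 7.
random.seed(0)                            # fixed seed for reproducibility
throws = 100000
total = sum(random.randint(1, 6) + random.randint(1, 6) for _ in range(throws))
average = total / throws
assert 6.9 < average < 7.1                # "you get an A"
```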

Now comes **continuous probability**. Suppose all numbers between 2 and 12 are equally probable. This means all numbers—not just integers. What is the probability of hitting the particular number x = π? It is zero! By any reasonable measure, π has no chance to occur. In the continuous case, every x has probability zero. But an interval of x's has a nonzero probability:

- the probability of an outcome between 2 and 3 is 1/10
- the probability of an outcome between x and x + Δx is Δx/10

To find the average, add up each outcome times the probability of that outcome. First divide 2 to 12 into intervals of length Δx = 1 and probability p = 1/10. If we round off x, the average is 6½:

2(1/10) + 3(1/10) + … + 11(1/10) = 6.5.

Here all outcomes are integers (as with dice). It is more accurate to use 20 intervals of length 1/2 and probability 1/20. The average is 6¾, and you see what is coming. These are rectangular areas (Riemann sums). As Δx → 0 they approach an integral. The probability of an outcome between x and x + dx is p(x) dx, and this problem has p(x) = 1/10. **The average outcome in the continuous case is not a sum but an integral**:

expected value E(x) = ∫_{2}^{12} x p(x) dx = ∫_{2}^{12} (x/10) dx = [x²/20]_{2}^{12} = 7.

That is a big jump. From the point of view of integration, it is a limit of sums. From the point of view of probability, the chance of each outcome is zero but the **probability density** at x is p(x) = 1/10. The integral of p(x) is 1, because some outcome must happen. The integral of xp(x) is x_{ave} = 7, the expected value. Each choice of x is random, but the average is predictable.
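The discrete averages 6.5 and 6.75 are exactly such sums. A sketch (the helper name rounded_average is mine) recovers them and watches Δx → 0:

```python
# Sketch: Riemann sums of x * p(x) over [2, 12] with p(x) = 1/10.
# Left endpoints reproduce "rounding off x" to the left end of each interval.
def rounded_average(dx):
    n = round(10 / dx)                    # number of intervals of length dx
    return sum((2 + i * dx) * (dx / 10) for i in range(n))

assert abs(rounded_average(1.0) - 6.5) < 1e-12     # intervals of length 1
assert abs(rounded_average(0.5) - 6.75) < 1e-12    # intervals of length 1/2
assert abs(rounded_average(0.0001) - 7.0) < 0.001  # approaching E(x) = 7
```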

This completes a first step in probability theory. The second step comes after more calculus. Decaying probabilities use e^{-x} and e^{-x²}—then the chance of a large x is very small. Here we end with the expected values of x^{n} and 1/√x and 1/x, for a random choice between 0 and 1 (so p(x) = 1):

E(x^{n}) = ∫_{0}^{1} x^{n} dx = 1/(n + 1)   E(1/√x) = ∫_{0}^{1} dx/√x = 2   E(1/x) = ∫_{0}^{1} dx/x = ∞ (!)
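The first two values can be confirmed by a midpoint rule, which never touches x = 0 and so tolerates the improper integrand 1/√x (a sketch; the tolerances are rough):

```python
# Sketch: midpoint-rule check of E(x**n) = 1/(n+1) and E(1/sqrt(x)) = 2
# for a random choice between 0 and 1 (so p(x) = 1).
def midpoint01(v, n=100000):
    """Composite midpoint rule for the integral of v over [0, 1]."""
    dx = 1.0 / n
    return sum(v((i + 0.5) * dx) for i in range(n)) * dx

for p in (1, 2, 3):
    assert abs(midpoint01(lambda x: x ** p) - 1 / (p + 1)) < 1e-8
assert abs(midpoint01(lambda x: x ** -0.5) - 2.0) < 0.01
# E(1/x) is genuinely infinite: these sums grow without bound as n increases.
```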

A college can advertise an average class size of 29, while most students are in large classes most of the time. I will show quickly how that happens.

Suppose there are 95 classes of 20 students and 5 classes of 200 students. The total enrollment in 100 classes is 1900 + 1000 = 2900. A random professor has expected class size 29. But a random student sees it differently. The probability is 1900/2900 of being in a small class and 1000/2900 of being in a large class. Adding class size times probability gives the expected class size for the student:

(20)(1900/2900) + (200)(1000/2900) ≈ 82 students in the class.
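The same arithmetic as code (numbers from the text; the student's average weights each class size by the fraction of students who sit in classes of that size):

```python
# Sketch: 95 classes of 20 and 5 classes of 200, as in the text.
classes = [(20, 95), (200, 5)]            # (class size, number of classes)
total_students = sum(size * count for size, count in classes)       # 2900
professor_view = total_students / sum(count for _, count in classes)
student_view = sum(size * (size * count / total_students)
                   for size, count in classes)
assert professor_view == 29               # random professor: 2900 / 100
assert round(student_view) == 82          # random student: about 82
```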

Similarly, the average waiting time at a restaurant seems like 40 minutes (to the customer). To the hostess, who averages over the whole day, it is 10 minutes. If you came at a random time it would be 10, but if you are a random customer it is 40.

Traffic problems could be eliminated by raising the average number of people per car to 2.5, or even 2. But that is virtually impossible. Part of the problem is the difference between (a) the percentage of cars with one person and (b) the percentage of people alone in a car. Percentage (b) is smaller. In practice, most people would be in crowded cars. See Problems 37-38.