
**Iteration means repeating the same function.** Suppose the function is F(x) = cos x. Choose any starting value, say x₀ = 1. Take its cosine: x₁ = cos x₀ = .54. **Then take the cosine of x₁.** That produces x₂ = cos .54 = .86. The iteration is x_{n+1} = cos x_{n}. I am in radian mode on a calculator, pressing "cos" each time. The early numbers are not important; what is important is the output after 12 or 30 or 100 steps:

*EXAMPLE 1* x₁₂ = .75, x₁₃ = .73, x₁₄ = .74, …, x₂₉ = .7391, x₃₀ = .7391.

The goal is to explain why the x's approach x* = .739085 …. Every starting value x₀ leads to this same number x*. **What is special about .7391?**
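This convergence is easy to check numerically. A minimal sketch in Python (not part of the text; the helper name `iterate` is mine):

```python
import math

def iterate(F, x, steps):
    """Apply x_{n+1} = F(x_n) repeatedly and return the final x."""
    for _ in range(steps):
        x = F(x)
    return x

# Every starting value is pulled to the same fixed point x* = .739085...
for x0 in (1.0, 0.0, -3.0, 10.0):
    print(x0, "->", round(iterate(math.cos, x0, 100), 6))
```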

Note on iterations Do x₁ = cos x₀ and x₂ = cos x₁ mean that x₂ = cos² x₀? Absolutely not! Iteration creates a new and different function cos (cos x). It uses the cos button, not the squaring button. The third step creates F(F(F(x))). As soon as you can, iterate with x_{n+1} = ½cos x_{n}. What limit do the x's approach? Is it ½(.7391)?

Let me slow down to understand these questions. **The central idea is expressed by the equation x _{n+1} = F(x_{n}).** Substituting x₀ into F gives x₁. This output x₁ is the input that leads to x₂. In its turn, x₂ is the input and out comes x₃ = F(x₂). This is iteration, and it produces the sequence x₀, x₁, x₂, ….

The x's may approach a limit x*, depending on the function F. Sometimes x* also depends on the starting value x₀. Sometimes there is no limit. Look at a second example, which does not need a calculator.

*EXAMPLE 2* x_{n+1} = F(x_{n}) = ½x_{n} + 4. Starting from x₀ = 0 the sequence is x₁ = ½⋅0 + 4 = 4, x₂ = ½⋅4 + 4 = 6 , x₃ = ½⋅6 + 4 = 7, x₄ = ½⋅7 + 4 = 7½, ….

Those numbers 0, 4, 6, 7, 7½, … seem to be approaching x* = 8. A computer would convince us. So will mathematics, when we see what is special about 8:

When the x's approach x*, the limit of x_{n+1} = ½x_{n} + 4 is x* = ½x* + 4. This limiting equation yields x* = 8.

8 is the "steady state" where input equals output: 8 = F(8). It is the fixed point.

If we start at x₀ = 8, the sequence is 8, 8, 8, …. When we start at x₀ = 12, the sequence goes back toward 8:

x₁ = ½⋅12 + 4 = 10, x₂ = ½⋅10 + 4 = 9 , x₃ = ½⋅9 + 4 = 8.5, …
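In code the pattern is identical for every example; only F changes. A small sketch (Python; `orbit` is my own helper name):

```python
def F(x):
    return 0.5 * x + 4        # Example 2: F(x) = x/2 + 4, fixed point x* = 8

def orbit(x, steps):
    """Return the list x0, x1, ..., x_steps under x_{n+1} = F(x_n)."""
    seq = [x]
    for _ in range(steps):
        x = F(x)
        seq.append(x)
    return seq

print(orbit(0, 4))    # [0, 4.0, 6.0, 7.0, 7.5] -- climbing toward 8
print(orbit(12, 3))   # [12, 10.0, 9.0, 8.5] -- coming back down to 8
```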

**Equation for limit: If the iterations x _{n+1} = F(x_{n}) converge to x*, then x* = F(x*).** To repeat: 8 is special because it equals ½⋅8 + 4. The number .7391 … is special because it equals cos .7391 …. The graphs of y = x and y = F(x) intersect at x*. To explain why the x's converge (or why they don't) is the job of calculus.

*EXAMPLE 3* x_{n+1} = x_{n}² has two fixed points: 0 = 0² and 1 = 1². Here F(x) = x². Starting from x₀ = 1/2 the sequence 1/4, 1/16, 1/256, … goes quickly to x* = 0. The only approaches to x* = 1 are from x₀ = 1 (of course) and from x₀ = -1. Starting from x₀ = 2 we get 4, 16, 256, … and the sequence diverges to +∞.

Each limit x* has a "**basin of attraction**." The basin contains all starting points x₀ that lead to x*. For Examples 1 and 2, every x₀ led to .7391 and 8. The basins were the whole line (that is still to be proved). Example 3 had three basins—the interval -1 < x₀ < 1, the two points x₀ = ±1, and all the rest. The outer basin |x₀| > 1 led to +∞. I challenge you to find the limits and the basins of attraction (by calculator) for F(x) = x − tan x.
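The three basins of Example 3 can be confirmed by brute force. A quick sketch (Python; the cutoffs 1e-9 and 1e100 are my arbitrary thresholds, not from the text):

```python
def basin(x0, steps=60):
    """Classify where x_{n+1} = x_n**2 sends x0: toward 0, 1, or infinity."""
    x = x0
    for _ in range(steps):
        x = x * x
        if abs(x) > 1e100:        # far beyond any return: diverging
            return "inf"
    return 0 if abs(x) < 1e-9 else 1

print(basin(0.5), basin(-0.999))   # inside |x0| < 1: both go to 0
print(basin(1.0), basin(-1.0))     # only x0 = +1 and -1 reach the repelling point
print(basin(2.0))                  # |x0| > 1 escapes to infinity
```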

**In Example 3, x* = 0 is attracting.** Points near x* move toward x*. The fixed point x* = 1 is **repelling**. Points near 1 move away. We now find the rule that decides whether x* is attracting or repelling. **The key is the slope dF/dx at x*.**

*3J* Start from any x₀ near a fixed point x* = F(x*):

x* is **attracting** if |dF/dx| is below 1 at x*

x* is **repelling** if |dF/dx| is above 1 at x*.

First I will give a calculus proof. Then comes a picture of convergence, by "cobwebs." Both methods throw light on this crucial test for attraction: |dF/dx| < 1.

First proof: Subtract x* = F(x*) from x_{n+1} = F(x_{n}). The difference x_{n+1} − x* is the same as F(x_{n}) − F(x*). This is ΔF. The basic idea of calculus is that ΔF is close to F'Δx:

x_{n+1} − x* = F(x_{n}) − F(x*) ≈ F'(x*)(x_{n} − x*). (1)

The "error" x_{n} − x* is multiplied by the slope dF/dx. The next error x_{n+1} − x* is smaller or larger, depending on whether |F'| < 1 or |F'| > 1 at x*. Every step multiplies approximately by F'(x*). **Its size controls the speed of convergence.**

In Example 1, F(x) is cos x and F'(x) is -sin x. There is attraction to .7391 because |sin x*| < 1. In Example 2, F is ½x + 4 and F' is ½. There is attraction to 8. In Example 3, F is x² and F' is 2x. There is superattraction to x* = 0 (where F' = 0). There is repulsion from x* = 1 (where F' = 2).
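Equation (1) can be watched in action. For Example 1 the successive error ratios settle down to F'(x*) = -sin x* ≈ -.6736 (a Python sketch; x* is found by iterating, not assumed):

```python
import math

# Find x* to machine precision by iterating cos many times
xstar = 1.0
for _ in range(200):
    xstar = math.cos(xstar)

x = 1.0
ratios = []
for _ in range(8):
    x_next = math.cos(x)
    ratios.append((x_next - xstar) / (x - xstar))   # (error)_{n+1}/(error)_n
    x = x_next

print(ratios[-1], -math.sin(xstar))   # both near -0.6736
```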

I admit one major difficulty. The approximation in equation (1) only holds near x*. If x₀ is far away, does the sequence still approach x*? When there are several attracting points, which x* do we reach? This section starts with good iterations, which solve the equation x* = F(x*) or ƒ(x) = 0. At the end we discover **Newton's method**. The next section produces crazy but wonderful iterations, not converging and not blowing up. They lead to "fractals" and "Cantor sets" and "chaos."

The mathematics of iterations is not finished. It may never be finished, but we are converging on the answers. Please choose a function and join in.

The iteration x_{n+1} = F(x_{n}) involves two graphs at the same time. One is the graph of y = F(x). The other is the graph of y = x (the 45° line). The iteration jumps back and forth between these graphs. It is a very convenient way to see the whole process.

Example 1 was x_{n+1} = cos x_{n}. Figure 3.19 shows the graph of cos x and the "cobweb." Starting at (x₀, x₀) on the 45° line, the rule is based on x₁ = F(x₀):

From (x₀, x₀) go up or down to (x₀, x₁) **on the curve**.

From (x₀, x₁) go across to (x₁, x₁) **on the 45° line**.

These steps are repeated forever. From x₁ go up to the curve at F(x₁). That height is x₂. Now cross to the 45° line at (x₂, x₂). The iterations are aiming for (x*, x*) = (.7391, .7391). This is the crossing point of the two graphs y = F(x) and y = x.

[Fig. 3.19]

Example 2 was x_{n+1} = ½x_{n} + 4. Both graphs are straight lines. The cobweb is one-sided, from (0,0) to (0,4) to (4,4) to (4,6) to (6,6). Notice how y changes (vertical line) and then x changes (horizontal line). The slope of F(x) is ½, so the distance to 8 is multiplied by ½ at every step.

Example 3 was x_{n+1} = x_{n}². The graph of y = x² crosses the 45° line at two fixed points: 0² = 0 and 1² = 1. Figure 3.20a starts the iteration close to 1, but it quickly goes away. This fixed point is repelling because F'(1) = 2. Distance from x* = 1 is doubled (at the start). One path moves down to x* = 0 —which is superattractive because F' = 0. The path from x₀ > 1 diverges to infinity.

*EXAMPLE 4* F(x) has two attracting points x* (a repelling x* is always between).

Figure 3.20b shows two crossings with slope zero. The iterations and cobwebs converge quickly. In between, the graph of F(x) must cross the 45° line from below. That requires a slope greater than one. Cobwebs diverge from this unstable point, which separates the basins of attraction. The fixed point x = π is in a basin by itself!

Note 1 To draw cobwebs on a calculator, graph y = F(x) on top of y = x. On a Casio, one way is to plot (x₀, x₀) and give the command `LINE: PLOT X,Y` followed by `EXE`. Now move the cursor vertically to y = F(x) and press `EXE`. Then move horizontally to y = x and press `EXE`. Continue. Each step draws a line.

[Fig. 3.20]

For the TI-81 (and also the Casio) a short program produces a cobweb. Store F(x) in the Y = function slot Y1. Set the range (square window or autoscaling). Run the program and answer the prompt with x₀:

```
PrgmC:COBWEB
:Disp "INITIAL X0"
:Input X
:All-Off
:Y1-On
:"X"→Y4
:Lbl 1
:X→S
:Y1→T
:Line(S,S,S,T)
:Line(S,T,T,T)
:T→X
:Pause
:Goto 1
```
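The same cobweb can be generated in Python: each iteration contributes one vertical segment to the curve and one horizontal segment to the 45° line. A sketch (the name `cobweb_segments` is mine):

```python
def cobweb_segments(F, x0, steps):
    """List the cobweb's line segments for x_{n+1} = F(x_n)."""
    segs = []
    x = x0
    for _ in range(steps):
        y = F(x)
        segs.append(((x, x), (x, y)))   # up or down to the curve y = F(x)
        segs.append(((x, y), (y, y)))   # across to the 45-degree line y = x
        x = y
    return segs

# Example 2's one-sided cobweb: (0,0) -> (0,4) -> (4,4) -> (4,6) -> (6,6) ...
for seg in cobweb_segments(lambda x: 0.5 * x + 4, 0, 2):
    print(seg)
```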

Note 2 The x's approach x* from one side when 0 < dF/dx < 1.

Note 3 A basin of attraction can include faraway x₀'s (basins can come in infinitely many pieces). This makes the problem interesting. If no fixed points are attracting, see Section 3.7 for "cycles" and "chaos."

At this point we offer the reader a choice. One possibility is to jump ahead to the next section on "Newton's Method." That method is an iteration to solve ƒ(x) = 0. The function F(x) combines x_{n} and ƒ(x_{n}) and ƒ'(x_{n}) into an optimal formula for x_{n+1}. We will see how quickly Newton's method works (when it works). It is the outstanding algorithm to solve equations, and it is totally built on tangent approximations.

The other possibility is to understand (through calculus) a whole family of iterations. This family depends on a number c, which is at our disposal. **The best choice of c produces Newton's method.** I emphasize that iteration is by no means a new and peculiar idea. It is a fundamental technique in scientific computing.

We start by recognizing that there are many ways to reach ƒ(x*) = 0. (I write x* for the solution.) A good algorithm may switch to Newton as it gets close. The iterations use ƒ(x_{n}) to decide on the next point x_{n+1}:

x_{n+1} = F(x_{n}) = x_{n} − cƒ(x_{n}). (2)

Notice how F(x) is constructed from ƒ(x)—they are different! We move ƒ to the right side and multiply by a "preconditioner" c. The choice of c (or c_{n}, if it changes from step to step) is absolutely critical. The starting guess x₀ is also important—but its accuracy is not always under our control.

Suppose the x_{n} converge to x*. Then the limit of equation (2) is

x* = x* − cƒ(x*). (3)

That gives ƒ(x*) = 0. If the x_{n}'s have a limit, it solves the right equation. It is a fixed point of F (we can assume c_{n} → c ≠ 0 and ƒ(x_{n}) → ƒ(x*)). There are two key questions, and both of them are answered by the slope F'(x*):

- How quickly does x_{n} approach x* (or do the x_{n} diverge)?
- What is a good choice of c (or c_{n})?
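To see the family in action, here is a sketch with a hypothetical ƒ of my own choosing (not the text's): ƒ(x) = x² − 4, with root x* = 2 and slope A = ƒ'(2) = 4.

```python
def solve(f, c, x0, steps=50):
    """Iterate x_{n+1} = x_n - c*f(x_n), equation (2), with c held fixed."""
    x = x0
    for _ in range(steps):
        x = x - c * f(x)
    return x

f = lambda x: x * x - 4          # my example: root x* = 2, slope A = f'(2) = 4

print(solve(f, 0.25, 3.0))       # c = 1/A: multiplier 1 - cA = 0, fastest
print(solve(f, 0.10, 3.0))       # multiplier 1 - 0.4 = 0.6: slower convergence
print(solve(f, 0.60, 3.0, 8))    # |1 - cA| = 1.4 > 1: repelled from x* = 2
```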

*EXAMPLE 5* ƒ(x) = ax − b is zero at x* = b/a. The iteration x_{n+1} = x_{n} − c(ax_{n} − b) intends to find b/a without actually dividing. (Early computers could not divide; they used iteration.) Subtracting x* from both sides leaves an equation for the error:

x_{n+1} − x* = x_{n} − x* − c(ax_{n} − b).

Replace b by ax*. The right side is (1 − ca)(x_{n} − x*). This "error equation" is

(error)_{n+1} = (1 − ca)(error)_{n}. (4)

**At every step the error is multiplied by (1 − ca)**, which is F'. The error goes to zero if |F'| is less than 1. The absolute value |1 − ca| decides everything:

**x _{n} converges to x* if and only if -1 < 1 − ca < 1.** (5)

The perfect choice (if we knew it) is c = 1/a, which turns the multiplier 1 − ca into zero. Then one iteration gives the exact answer: x₁ = x₀ − (1/a)(ax₀ − b) = b/a. That is the horizontal line in Figure 3.21a, converging in one step. But look at the other lines.
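The error equation (4) is worth seeing numerically. A sketch of division-free division (the numbers a = 7, b = 3, c = .12 are my own choices; any c with |1 − ca| < 1 works):

```python
a, b, c = 7.0, 3.0, 0.12      # multiplier 1 - c*a = 0.16, safely inside (-1, 1)
x = 0.0
for _ in range(40):
    x = x - c * (a * x - b)   # equation (2) for f(x) = ax - b: no division anywhere

print(x)                      # matches b/a = 0.42857... to machine precision
```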

This example did not need calculus. Linear equations never do. The key idea is that close to x* the nonlinear equation ƒ(x) = 0 is nearly linear. We apply the tangent approximation. You are seeing how calculus is used, in a problem that doesn't start by asking for a derivative.

The immediate goal is to study the errors x_{n} − x*. They go quickly to zero, if the multiplier is small. To understand x_{n+1} = x_{n} − cƒ(x_{n}), subtract the equation x* = x* − cƒ(x*):

x_{n+1} − x* = x_{n} − x* − c( ƒ(x_{n}) − ƒ(x*)). (6)

Now calculus enters. **When you see a difference of ƒ's think of dƒ/dx.** Replace ƒ(x_{n}) − ƒ(x*) by A(x_{n} − x*), where A stands for the slope dƒ/dx at x*:

x_{n+1} − x* ≈ (1 − cA)(x_{n} − x*). (7)

This is the **error equation**. The new error at step n + 1 is approximately the old error
multiplied by m = 1 − cA. This corresponds to m = 1 − ca in the linear example. We
keep returning to the basic test |m| = |F'(x*)| < 1:

*3K* Starting near x*, the errors x_{n} − x* go to zero if the multiplier |m| < 1. The perfect choice is c = 1/A = 1/ƒ'(x*). Then m = 1 − cA = 0.

There is only one difficulty: We don't know x*. Therefore we don't know the perfect c. It depends on the slope A = ƒ'(x*) at the unknown solution. However we can come close, by using the slope at x_{n}:

Choose c_{n} = 1/ƒ'(x_{n}). Then x_{n+1} = x_{n} − ƒ(x_{n})/ƒ'(x_{n}) = F(x_{n}).

**This is Newton's method.** The multiplier m = 1 − cA is as near to zero as we can make it. By building dƒ/dx into F(x), Newton sped up the convergence of the iteration.
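As a sketch (Python; the sample ƒ is 2x − cos x, which reappears as Example 6 below):

```python
import math

def newton(f, fprime, x, steps):
    """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n), i.e. c_n = 1/f'(x_n)."""
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

f  = lambda x: 2 * x - math.cos(x)   # root near x* = .45018
fp = lambda x: 2 + math.sin(x)
print(newton(f, fp, 0.5, 3))         # about .45018361 after only three steps
```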

[Fig. 3.21]

*EXAMPLE 6* Solve ƒ(x) = 2x − cos x = 0 with different iterations (different c's).

The line y = 2x crosses the cosine curve somewhere near x = 1/2. The intersection point where 2x* = cos x* has no simple formula. We start from x₀ = 1/2 and iterate x_{n+1} = x_{n} − c(2x_{n} − cos x_{n}) with three different choices of c.

Take c = 1 or c = 1/ƒ'(x₀) or update c by Newton's rule c_{n} = 1/ ƒ'(x_{n}):

| x₀ = .50 | c = 1 | c = 1/ƒ'(x₀) | c_{n} = 1/ƒ'(x_{n}) |
|---|---|---|---|
| x₁ = | .38 | .45063 | .45062669 |
| x₂ = | .55 | .45019 | .45018365 |
| x₃ = | .30 | .45018 | .45018361… |

The column with c = 1 is diverging (repelled from x*). The second column shows convergence (attracted to x*). The third column (Newton's method) approaches x* so quickly that .4501836 and seven more digits are exact for x₃.

How does this convergence match the prediction? Note that ƒ'(x) = 2 + sin x so A = 2.435. Look to see whether the actual errors x_{n} − x*, going down each column, are multiplied by the predicted m below that column:

| | c = 1 | c = 1/(2 + sin ½) | c_{n} = 1/(2 + sin x_{n}) |
|---|---|---|---|
| x₀ − x* = | 0.05 | 4.98⋅10⁻² | 4.98⋅10⁻² |
| x₁ − x* = | -0.07 | 4.43⋅10⁻⁴ | 4.43⋅10⁻⁴ |
| x₂ − x* = | 0.10 | 7.88⋅10⁻⁶ | 3.63⋅10⁻⁸ |
| x₃ − x* = | -0.15 | 1.41⋅10⁻⁷ | 2.78⋅10⁻¹⁶ |
| multiplier | m = -1.4 | m = .018 | m → 0 (Newton) |

The first column shows a multiplier below -1. The errors grow at every step. Because m is negative the errors change sign—the cobweb goes outward.

The second column shows convergence with m = .018. It takes one genuine Newton step, then c is fixed. After n steps the error is closely proportional to m^{n} = (.018)^{n} —that is "**linear convergence**" with a good multiplier.

The third column shows the "**quadratic convergence**" of Newton's method. Multiplying the error by m is more attractive than ever, because m → 0. In fact m itself is proportional to the error, so **at each step the error is squared**. Problem 3.8.31 will show that (error)_{n+1} ≤ (error)_{n}². This squaring carries us from 10⁻² to 10⁻⁴ to 10⁻⁸ to "machine ε" in three steps. The number of correct digits is doubled at every step as Newton converges.
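The doubling of digits is visible in a few lines (Python; x* is obtained by running Newton to machine precision first, not assumed):

```python
import math

f  = lambda x: 2 * x - math.cos(x)
fp = lambda x: 2 + math.sin(x)

xstar = 0.5
for _ in range(20):                  # Newton to machine precision: this is x*
    xstar = xstar - f(xstar) / fp(xstar)

x = 0.5
for n in range(4):
    print(n, abs(x - xstar))         # the error exponent roughly doubles each step
    x = x - f(x) / fp(x)
```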

Note 1 The choice c = 1 produces x_{n+1} = x_{n} − ƒ(x_{n}). This is "successive substitution." The equation ƒ(x) = 0 is rewritten as x = x − ƒ(x), and each x_{n} is substituted back to produce x_{n+1}. Iteration with c = 1 does not always fail!

Note 3 Edwards and Penney happened to choose the same example 2x = cos x. But they cleverly wrote it as x_{n+1} = ½ cos x_{n}, which has |F'| = |½ sin x| < 1. This iteration fits into our family with c = ½, and it succeeds. We asked earlier if its limit is ½(.7391). No, it is x* = .450….

Note 4 The choice c = 1/ƒ'(x₀) is "modified Newton." After one step of Newton's method, c is fixed. The steps are quicker, because they don't require a new ƒ'(x_{n}). But we need more steps. Millions of dollars are spent on Newton's method, so speed is important. In all its forms, ƒ(x) = 0 is the central problem of computing.