If Newton's method converges slowly, then in base 2 it is giving you only one extra bit of accuracy at each iteration. That's terrible. Newton's method, when it converges, usually doubles the number of correct digits at each iteration.

Definition 2.9.2 (Superconvergence). Set $x_i = |a_{i+1} - a_i|$. The sequence $a_0, a_1, \dots$ superconverges if, when the $x_i$ are written in base 2, each number $x_i$ starts with $2^i - 1\approx 2^i$ zeros.
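To see superconvergence concretely, here is a small numerical sketch (mine, not the text's): apply Newton's method to $f(x) = x^2 - 2$, which gives the iteration $a_{i+1} = (a_i + 2/a_i)/2$, and count the leading base-2 zeros of each $x_i = |a_{i+1} - a_i|$.

```python
import math
from decimal import Decimal, getcontext

# Newton's method for f(x) = x^2 - 2, starting at a_0 = 2 (my choice of
# example); Decimal gives enough precision to resolve the tiny differences.
getcontext().prec = 60

iterates = [Decimal(2)]
for _ in range(6):
    a = iterates[-1]
    iterates.append((a + Decimal(2) / a) / 2)

zeros = []
for i in range(len(iterates) - 1):
    x_i = float(abs(iterates[i + 1] - iterates[i]))
    # frexp writes x_i = m * 2**e with 0.5 <= m < 1; a number in
    # [2**-(k+1), 2**-k) has k leading binary zeros, so k = -e.
    zeros.append(-math.frexp(x_i)[1])

print(zeros)  # the zero counts roughly double at each step
```

The printed counts grow roughly geometrically with ratio 2, which is exactly the doubling of correct digits described above.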

Theorem 2.9.4 (Newton's method superconverges). Set

$$ k = |\vec f(a_0)||[D\vec f(a_0)]^{-1}|^2 M < \frac{1}{2}\\ c = \frac{1-k}{1-2k}|[D\vec f(a_0)]^{-1}|\frac{M}{2}. $$

If $|\vec h_n| \leq \frac{1}{2c}$, then $|\vec h_{n+m}| \leq \frac{1}{c} \cdot \Big(\frac{1}{2}\Big)^{2^m}$. Since $\vec h_n = a_{n+1} - a_n$, starting at step $n$ and using Newton's method for $m$ more iterations causes the distance between successive approximations to shrink to practically nothing before our eyes; if $m=10$,

$$ |\vec h_{n+m} |\leq \frac{1}{c}\cdot \Big(\frac{1}{2}\Big)^{1024}. $$

Kantorovich's theorem: a stronger version

Definition 2.9.6 (The norm of a linear transformation). Let $A: \R^n \to \R^m$ be a linear transformation. The norm $\|A\|$ of $A$ is

$$ \|A\| = \sup_{\substack{\vec x\in \R^n \\ |\vec x| = 1}} |A\vec x|. $$

Note that it is always true that $\|A\| \leq |A|$. Since the norm never exceeds the length, hypotheses stated in terms of norms are easier to satisfy; this is why using the norm rather than the length makes Kantorovich's theorem stronger.
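The inequality $\|A\| \leq |A|$ is easy to check numerically. In the sketch below (mine, not the text's), the length $|A|$ is the square root of the sum of the squared entries (the Frobenius norm), and the norm $\|A\|$ is the largest singular value; we also estimate the supremum directly by applying $A$ to many random unit vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # an arbitrary test matrix

operator_norm = np.linalg.norm(A, 2)   # ||A||: largest singular value
length = np.linalg.norm(A)             # |A|: Frobenius norm (numpy default)

# Monte Carlo estimate of sup |Ax| over unit vectors x.
x = rng.standard_normal((3, 100_000))
x /= np.linalg.norm(x, axis=0)
sup_estimate = np.linalg.norm(A @ x, axis=0).max()

print(operator_norm, length, sup_estimate)
```

The estimate approaches $\|A\|$ from below, and $\|A\| \leq |A|$ holds, as the note above predicts.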

Theorem 2.9.8 (A stronger version of Kantorovich's theorem). Kantorovich's theorem 2.8.13 still holds if you replace the lengths of matrices by norms of matrices: $|[D\vec f(u_1)] - [D\vec f(u_2)]|$ is replaced by $\|[D\vec f(u_1)] - [D\vec f(u_2)]\|$, and $|[D\vec f(a_0)]^{-1}|^2$ by $\|[D\vec f(a_0)]^{-1}\|^2$.

Proof. Because the triangle inequality and the Cauchy-Schwarz inequality also hold for norms of matrices, the proof of Kantorovich's theorem goes through unchanged.

Example 2.9.9 (Norm of a matrix is harder to compute). The length of a matrix is easy to compute. For $A = \begin{bmatrix}1&1\\0&1\end{bmatrix}$, it is $\sqrt{1^2 + 1^2 + 0^2 + 1^2} = \sqrt{3}$. To compute the norm, let us parametrize the unit vectors of $\R^2$ as $\vec x = (\cos t, \sin t)$ and compute $A\vec x$:

$$ \begin{bmatrix}1&1\\0&1\end{bmatrix}\begin{bmatrix}\cos t\\\sin t\end{bmatrix} = \begin{bmatrix} \cos t + \sin t\\\sin t \end{bmatrix}. $$

To compute the norm, we must calculate $\sup_t \sqrt{(\cos t + \sin t)^2 + \sin^2 t}$.
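The supremum can be checked numerically (a sketch of mine, not part of the text): sampling $t$ densely and taking the maximum agrees with the largest singular value of $A$, which is $\|A\|$ exactly; both work out to $(1+\sqrt 5)/2$, the golden ratio.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])

# Evaluate |A x(t)| = sqrt((cos t + sin t)^2 + sin^2 t) on a fine grid.
t = np.linspace(0.0, 2.0 * np.pi, 200_001)
values = np.sqrt((np.cos(t) + np.sin(t)) ** 2 + np.sin(t) ** 2)
sup_estimate = values.max()

print(sup_estimate)            # ~1.618..., the golden ratio
print(np.linalg.norm(A, 2))    # ||A||: largest singular value of A
```

That $\|A\|^2$ is the largest eigenvalue of $A^{\mathsf T}A = \begin{bmatrix}1&1\\1&2\end{bmatrix}$, namely $(3+\sqrt 5)/2$, confirms the value: $\sqrt{(3+\sqrt 5)/2} = (1+\sqrt 5)/2 \approx 1.618$, noticeably smaller than the length $\sqrt 3 \approx 1.732$.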