Differentiation done correctly: 4. Inverse and implicit functions

Navigation: 1. The derivative | 2. Higher derivatives | 3. Partial derivatives | 4. Inverse and implicit functions | 5. Maxima and minima

Now we’re going to prove the inverse function and implicit function theorems for Banach spaces.

Theorem 32 (Contraction principle). Let \((X,d)\) be a nonempty complete metric space and let \(\varphi:X\to X\) be a map for which there is a constant \(0\le c<1\) such that $$
d(\varphi(x),\varphi(y)) \le cd(x,y)
$$ for all \(x,y\in X\). Then there is exactly one \(x\in X\) for which \(\varphi(x)=x\).

Proof. Choose any \(x_0\in X\) and define \(x_{n+1}=\varphi(x_n)\). For all \(n\ge 1\) we have $$
d(x_{n+1},x_n)=d(\varphi(x_n),\varphi(x_{n-1}))\le cd(x_n,x_{n-1}),
$$ so \(d(x_{n+1},x_n)\le c^n d(x_1,x_0)\) by induction. For all \(m > n\), \begin{align}
d(x_n,x_m) &\le d(x_n,x_{n+1})+\cdots+d(x_{m-1},x_m) \\
&\le (c^n+\cdots+c^{m-1})d(x_1,x_0) \\
&\le c^n(1-c)^{-1}d(x_1,x_0),
\end{align} which shows that \(\{x_n\}\) is a Cauchy sequence. Since \(X\) is complete, \(x_n\to x\) for some \(x\in X\). Furthermore, $$
x=\lim_{n\to\infty}x_{n+1}=\lim_{n\to\infty}\varphi(x_n)=\varphi(x)
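$$ since \(\varphi\) is continuous (it is Lipschitz with constant \(c\)). For uniqueness, if \(\varphi(x)=x\) and \(\varphi(y)=y\) then \(d(x,y)=d(\varphi(x),\varphi(y))\le cd(x,y)\), which forces \(d(x,y)=0\) because \(c<1\). \(\square\)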
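The iteration in the proof is easy to try numerically. The following is a minimal sketch, not part of the original argument, under these assumptions: \(X=[0,1]\) with the usual metric, \(\varphi=\cos\) (which maps \([0,1]\) into itself), and \(c=\sin 1\) as a Lipschitz constant. The helper name `fixed_point` is hypothetical. The sketch also checks the a priori bound \(d(x_n,x)\le c^n(1-c)^{-1}d(x_1,x_0)\), which follows from the Cauchy estimate by letting \(m\to\infty\).

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = phi(x_n); stop once successive iterates differ by < tol.

    Returns the list of all iterates [x_0, x_1, ..., x_N].
    """
    xs = [x0]
    for _ in range(max_iter):
        xs.append(phi(xs[-1]))
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs

# Assumption: phi = cos on X = [0, 1], where |phi'(x)| = |sin x| <= sin(1) < 1.
c = math.sin(1.0)
xs = fixed_point(math.cos, 0.0)
x_star = xs[-1]              # numerical stand-in for the true fixed point
d10 = abs(xs[1] - xs[0])     # d(x_1, x_0)

# A priori bound from the proof: d(x_n, x) <= c^n / (1 - c) * d(x_1, x_0).
# The small slack absorbs the error in using x_star instead of the exact limit.
bound_ok = all(
    abs(x - x_star) <= c**n / (1.0 - c) * d10 + 1e-9
    for n, x in enumerate(xs)
)
```

In this example the observed convergence is faster than the bound suggests, since \(c\) is only a worst-case constant over the whole interval.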

Theorem 33 (Inverse function theorem). Let \(A\subseteq E\) be an open set and let \(f:A\to F\) be of class \(C^p\) (with \(p\ge 1\)). Suppose that \(f'(p)\) is invertible for some \(p\in A\). Then there is a neighborhood \(U\subseteq A\) of \(p\) such that \(f(U)\) is open and \(f|_U:U\to f(U)\) is a \(C^p\) diffeomorphism.

Proof. Let \(\iota:E\to E\) be the identity map. By replacing \(f\) with \(f'(p)^{-1}\circ f\), we may assume that \(E=F\) and \(f'(p)=\iota\). Since \(f'\) is continuous at \(p\), there exists an open ball \(U\subseteq A\) around \(p\) such that \(|f'(x)-\iota| < \frac{1}{2}\) for all \(x\in U\). For each \(y\in E\), define the map \(\varphi_y:U\to E\) by \(\varphi_y(x)=x-f(x)+y\). Note that \(x\) is a fixed point of \(\varphi_y\) if and only if \(f(x)=y\). For every \(y\in E\) we have \(|\varphi'_y(x)|=|f'(x)-\iota|<\frac{1}{2}\) for all \(x\in U\), so by Corollary 16 (recall that the ball \(U\) is convex) we have $$
|\varphi_y(x_1)-\varphi_y(x_2)| \le \frac{1}{2}|x_1-x_2|\tag{*}
$$ for all \(x_1,x_2\in U\). By the uniqueness argument in Theorem 32, each \(\varphi_y\) has at most one fixed point in \(U\), so \(f\) is injective on \(U\) and \(f|_U:U\to f(U)\) is a bijection.
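The map \(\varphi_y(x)=x-f(x)+y\) also gives a practical way to solve \(f(x)=y\) by iteration. Here is a small sketch under illustrative assumptions that are not from the text: \(E=\mathbb{R}\), \(f(x)=x+x^2/4\) and \(p=0\), so that \(f'(0)=1\) and \(|f'(x)-1|<\frac{1}{2}\) on \(U=(-1,1)\). The function name `solve_by_contraction` is hypothetical.

```python
import math

def f(x):
    # Assumed example map: f'(x) = 1 + x/2, so |f'(x) - 1| < 1/2 for |x| < 1.
    return x + 0.25 * x * x

def solve_by_contraction(f, y, x0=0.0, tol=1e-14, max_iter=200):
    """Solve f(x) = y by iterating phi_y(x) = x - f(x) + y starting from x0."""
    x = x0
    for _ in range(max_iter):
        x_next = x - f(x) + y          # one application of phi_y
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")

y = 0.1
x = solve_by_contraction(f, y)

# Closed-form root of x + x^2/4 = y near 0, used only to check the result.
exact = 2.0 * (math.sqrt(1.0 + y) - 1.0)
```

Starting at \(x_0=0\) with \(|y|<\frac{1}{4}\), the iterates stay in the closed ball of radius \(\frac{1}{2}\) around \(0\), which is the kind of invariant ball the next step of the proof constructs.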

Now let \(b\in f(U)\) so that \(b=f(a)\) for some \(a\in U\). Let \(B\) be an open ball with radius \(r\) around \(a\) such that \(\overline{B}\subseteq U\), and let \(B'\) be the open ball of radius \(r/2\) around \(b\). We want to show that \(B'\subseteq f(U)\), thus proving that \(f(U)\) is open. Let \(y\in B'\). Since \(\varphi_y(a)-a=y-f(a)=y-b\), if \(x\in\overline{B}\) then \begin{align}
|\varphi_y(x)-a| &\le |\varphi_y(x)-\varphi_y(a)|+|\varphi_y(a)-a| \\
&\le \frac{1}{2}|x-a|+|y-b| \\
&< \frac{r}{2}+\frac{r}{2} = r,
\end{align} so \(\varphi_y(x)\in B\). This together with (*) shows that \(\varphi_y|_{\overline{B}}:\overline{B}\to\overline{B}\) is a contraction mapping, and since \(\overline{B}\) is complete we can apply Theorem 32 to obtain a fixed point \(x\in\overline{B}\) of \(\varphi_y|_{\overline{B}}\), which implies that \(f(x)=y\) and \(y\in f(U)\).

For the last part of the proof, we denote \(f|_U\) by \(f\) and \((f|_U)^{-1}\) by \(f^{-1}\) for convenience. Let \(y\in f(U)\) and \(y+k\in f(U)\) with \(k\ne 0\); there exist \(x\in U\) and \(x+h\in U\) with \(y=f(x)\) and \(y+k=f(x+h)\), noting that \(h\ne 0\). In fact we have \begin{align}
|h-k| &= |h-f(x+h)+f(x)| \\
&= |\varphi_y(x+h)-\varphi_y(x)| \\
&\le \frac{1}{2}|h|
\end{align} from (*), so \(|h|\le 2|k|\). In particular \(h\to 0\) as \(k\to 0\), so \(f^{-1}\) is continuous. Note also that \(f'(x)\) is invertible since \(|f'(x)-\iota|<\frac{1}{2}\). Then \begin{align}
\frac{|f^{-1}(y+k)-f^{-1}(y)-f'(x)^{-1}k|}{|k|} &= \frac{|f'(x)^{-1}(f(x+h)-f(x))-h|}{|k|} \\
&\le |f'(x)^{-1}|\frac{|f(x+h)-f(x)-f'(x)h|}{|k|} \\
&\le 2|f'(x)^{-1}|\frac{|f(x+h)-f(x)-f'(x)h|}{|h|} \\
&\to 0
\end{align} as \(k\to 0\), because then \(h\to 0\). This proves that $$
(f^{-1})'(y)=f'(x)^{-1}=f'(f^{-1}(y))^{-1},\tag{**}
$$ so \(f^{-1}\) is differentiable on \(f(U)\).

Finally, (**) expresses \((f^{-1})'\) as the composition of \(f^{-1}\), \(f'\) and \(\lambda\mapsto\lambda^{-1}\) (operator inversion). Here \(f'\) is of class \(C^{p-1}\) and operator inversion is of class \(C^p\), so if \(f^{-1}\) is of class \(C^k\) for some \(k<p\), then \((f^{-1})'\) is of class \(C^k\) and \(f^{-1}\) is of class \(C^{k+1}\). Starting from the continuity of \(f^{-1}\), induction on \(k\) shows that \(f^{-1}\) is of class \(C^p\). \(\square\)

Theorem 34 (Implicit function theorem). Let \(A\subseteq E\) and \(B\subseteq F\) be open sets and let \(f:A\times B\to G\) be of class \(C^p\) (with \(p\ge 1\)). Suppose \((a,b)\in A\times B\) is such that \(f(a,b)=0\) and \(D_2 f(a,b):F\to G\) is invertible. Then there exist a neighborhood \(U\subseteq A\) of \(a\) and a \(C^p\) map \(g:U\to B\) with the following properties:

  1. \(g(a)=b\).
  2. \(f(x,g(x))=0\) for all \(x\in U\).
  3. \(g'(a)=-[D_2 f(a,b)]^{-1}\circ D_1 f(a,b)\).

Proof. Let \(\iota:E\to E\) be the identity map. Define \begin{align}
\widetilde{f}:A\times B &\to E\times G \\
(x,y) &\mapsto (x,f(x,y))
\end{align} and compute $$
\widetilde{f}'(a,b) = \begin{bmatrix}
\iota & 0 \\
D_1 f(a,b) & D_2 f(a,b)
\end{bmatrix}.
$$ Then \(\widetilde{f}'(a,b)\) is invertible, with $$
\widetilde{f}'(a,b)^{-1} = \begin{bmatrix}
\iota & 0 \\
-[D_2 f(a,b)]^{-1}\circ D_1 f(a,b) & [D_2 f(a,b)]^{-1}
\end{bmatrix}.\tag{*}
$$ By the inverse function theorem, there exist neighborhoods \(V\subseteq A\times B\) of \((a,b)\) and \(W\subseteq E\times G\) of \((a,0)\) such that \(\widetilde{f}|_V:V\to W\) is a \(C^p\) diffeomorphism. Let \(U=\{x\in E:(x,0)\in W\}\). Then \(U\) is open because \(W\) is open and \(x\mapsto(x,0)\) is continuous, and \(a\in U\) because \((a,0)\in W\), so \(U\) is a neighborhood of \(a\). Define \(g:U\to B\) by \(g=\pi\circ(\widetilde{f}|_V)^{-1}\circ i\) where \(\pi:A\times B\to B\) is the canonical projection and \(i:U\to W\) is given by \(i(x)=(x,0)\). To complete the proof, we check the three required properties. Firstly, $$
g(a)=\pi((\widetilde{f}|_V)^{-1}(a,0))=\pi(a,b)=b
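$$ since \(\widetilde{f}(a,b)=(a,0)\). If \(x\in U\) then \((x,0)\in W\), so \((x,0)=\widetilde{f}(x',y)=(x',f(x',y))\) for a unique \((x',y)\in V\). Comparing first components gives \(x'=x\), hence \((\widetilde{f}|_V)^{-1}(x,0)=(x,y)\) with \(f(x,y)=0\) and $$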
f(x,g(x))=f(x,\pi((\widetilde{f}|_V)^{-1}(x,0)))=f(x,\pi(x,y))=f(x,y)=0.
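$$ Lastly, \(\pi\) and \(i\) are restrictions of continuous linear maps, so the chain rule together with the formula for the derivative of an inverse gives \(g'(a)=\pi\circ\widetilde{f}'(a,b)^{-1}\circ i\). This sends \(h\in E\) to the second component of \(\widetilde{f}'(a,b)^{-1}(h,0)\), so \(g'(a)\) is simply the bottom left entry of (*). \(\square\)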
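As a sanity check on the derivative formula in the theorem, here is a small numerical sketch. Everything in it is an illustrative assumption rather than part of the proof: \(E=F=G=\mathbb{R}\), \(f(x,y)=x^2+y^2-1\) and \((a,b)=(0.6,0.8)\). The implicit function `g` is computed with Newton's method in the \(y\) variable, which is a convenient substitute for the construction in the proof, and \(g'(a)\) is estimated by a central difference.

```python
def f(x, y):
    # Assumed example: the unit circle, f(x, y) = x^2 + y^2 - 1.
    return x * x + y * y - 1.0

def d1f(x, y):
    return 2.0 * x   # partial derivative of f with respect to x

def d2f(x, y):
    return 2.0 * y   # partial derivative of f with respect to y

def g(x, y0=0.8, tol=1e-14, max_iter=50):
    """Solve f(x, y) = 0 for y near y0 using Newton's method in y."""
    y = y0
    for _ in range(max_iter):
        step = f(x, y) / d2f(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

a, b = 0.6, 0.8

# Central-difference estimate of g'(a).
h = 1e-6
fd = (g(a + h) - g(a - h)) / (2.0 * h)

# Formula from the theorem: g'(a) = -[D_2 f(a, b)]^{-1} D_1 f(a, b).
formula = -d1f(a, b) / d2f(a, b)
```

For this choice of \(f\) the implicit function is \(g(x)=\sqrt{1-x^2}\) near \(a\), so both quantities should be close to \(-a/b=-0.75\).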

In the next and final post, we will look at some applications of Taylor’s theorem and the implicit function theorem to finding minima and maxima of maps from Banach spaces to \(\mathbb{R}\).
