C. Hinsley

21 August 2021

Whereas in elementary calculus we often attempt to find extrema of functions by obtaining their critical points — setting, say, $\frac{d}{dx}f(x) = 0$ and solving for the extreme point $x$ — we sometimes find that we should like, instead, to find a function which minimizes some functional. Recall that a functional maps a function to (in our case) a real number.
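As a concrete example (a standard one, supplied here for illustration), consider the arc-length functional, which assigns to each continuously differentiable function $x(t)$ on $[a, b]$ the length of its graph:

$$J(x) = \int_a^b \sqrt{1 + \dot{x}(t)^2} \, dt.$$

Here $J$ takes an entire function $x$ as its input and returns a single real number, so it is a functional rather than an ordinary function.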

Suppose we want to find a function $x^*(t)$ defined on an interval $t \in [a, b]$ for $a, b \in \mathbb{R}$. Our criterion for this function is that it minimizes a functional $J(x)$ such that $J(x^*) \leq J(x)$ for all $x(t)$ defined on $t \in [a, b]$. We call $x^*$ an extremal of $J$ — this is analogous to the familiar idea of an extreme point of a function.

We clearly do not know how to differentiate $J(x)$, so we need some other way to determine an extremal. Suppose we have already found our extremal $x^*(t)$. For some sufficiently small deviation from this function (whatever that might mean in a particular circumstance), we should expect an arbitrarily small change in the value of the functional $J(x)$. Letting that "sufficiently small deviation from $x^*(t)$" be denoted $\delta x(t)$, we may equivalently say that for any small $\epsilon > 0$, there exists some $\delta x(t)$ so that

$$J(x^*(t) + \delta x(t)) - J(x^*(t)) < \epsilon. \tag{1}$$

We refer to $\delta x(t)$ as the variation of $x^*(t)$.

A nice fact from [K-1] is that $\epsilon$ can be shown to directly bound only a chosen norm of $\delta x(t)$; as a result, $\delta x(t)$ can take on any shape, because a function of any given shape can be scaled down by an arbitrarily small coefficient until its norm falls below the bound.
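This scaling argument is easy to see numerically. Below is a minimal sketch (the sup norm on a sampled grid, the sine "shape", and the grid size are all illustrative choices, not taken from [K-1]):

```python
# Sketch: scaling a fixed "shape" delta_x by a small coefficient shrinks its
# norm below any epsilon, without changing the shape itself.
import math

def sup_norm(f, a=0.0, b=1.0, n=1000):
    """Approximate the sup norm of f on [a, b] by sampling n + 1 grid points."""
    return max(abs(f(a + (b - a) * i / n)) for i in range(n + 1))

shape = lambda t: math.sin(2 * math.pi * t)  # an arbitrary fixed shape

epsilon = 1e-3
c = epsilon / (2 * sup_norm(shape))          # coefficient chosen so the norm is < epsilon
delta_x = lambda t: c * shape(t)             # same shape, tiny norm

print(sup_norm(delta_x) < epsilon)  # True
```

The key point is that the scaled function passes the norm bound while remaining a sine wave; no restriction on the shape of $\delta x(t)$ was needed.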

That's great — we now have a rudimentary analog, for functionals, of a critical point of a real function. We will generalize the left-hand side of the inequality above as the increment of $J$, written with a capital delta:

$$\Delta J(x, \delta x) = J(x(t) + \delta x(t)) - J(x(t)). \tag{2}$$

The inequality $(1)$ then becomes $\Delta J(x^*, \delta x) < \epsilon$.
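The increment can be computed numerically for a concrete functional. In the sketch below, the functional $J(x) = \int_0^1 x(t)^2 \, dt$ and its extremal $x^*(t) = 0$ are illustrative assumptions, not taken from the text:

```python
# Sketch: the increment Delta J for J(x) = integral of x(t)^2 over [0, 1],
# whose minimizer over all functions is x*(t) = 0.

def J(x, n=1000):
    """Approximate J(x) = integral of x(t)^2 dt on [0, 1] by a midpoint Riemann sum."""
    h = 1.0 / n
    return sum(x((i + 0.5) * h) ** 2 for i in range(n)) * h

def increment(x, delta_x):
    """Delta J(x, delta_x) = J(x + delta_x) - J(x)."""
    return J(lambda t: x(t) + delta_x(t)) - J(x)

x_star = lambda t: 0.0
for c in (1.0, 0.1, 0.01):
    delta_x = lambda t, c=c: c * t * (1 - t)   # a deviation of fixed shape, scaled by c
    print(c, increment(x_star, delta_x))       # the increment shrinks with the deviation
```

Since $x^*$ minimizes $J$, each increment printed is nonnegative, and it can be made smaller than any $\epsilon$ by shrinking the deviation — exactly the behavior inequality $(1)$ describes.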

# Bounding neighborhoods of functions with norms

We now return to the idea of choosing norms on $\delta x(t)$; it turns out that doing so provides us with a notion of a local region of neighboring functions (for example, requiring $||\delta x|| < \alpha$ for some $\alpha > 0$ allows one to deal with all the functions $x + \delta x$ in some way "close to" the function $x$). The norm of a function $x$ is itself a functional, assigning to each function $x$ some number $||x|| \in \mathbb{R}$. Norms of functions obey three properties:

• Positive-definiteness: $||x|| \geq 0$, where $||x|| = 0$ if and only if $x$ is the constant function $x = 0$.
• Homogeneity: $||c \cdot x|| = |c| \cdot ||x||$ for $c \in \mathbb{R}$.
• Triangle inequality: $||x + y|| \leq ||x|| + ||y||$ for functions $x, y$.

Note that there are multiple functionals that qualify as norms. The norm you select will depend on the problem you are trying to solve; usually, the algebraic mess you find yourself in will signal what to look for in a norm.
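For concreteness, two norms that commonly appear in this setting (standard definitions, supplied here as examples) are the sup norm and the $L^2$ norm:

$$||x||_\infty = \max_{a \leq t \leq b} |x(t)|, \qquad ||x||_2 = \left( \int_a^b x(t)^2 \, dt \right)^{1/2}.$$

Both satisfy the three properties above, yet they measure "closeness" differently: two functions can be close in the $L^2$ norm while differing sharply on a very narrow interval, whereas the sup norm forbids this.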

In order to make use of norms, we first note that we can rewrite the increment $\Delta J$ as