(p.536) Appendix B Convex functions
(p.536) Appendix B Convex functions
Free energies of models of interacting lattice clusters are convex functions (as shown in theorem 3.3). This has certain consequences; namely, that is differentiable almost everywhere (if it is finite), and finite size approximations to the free energy of an infinite system may be shown to converge to the free energy almost everywhere. The convexity of is related to the stability of the thermodynamic equilibrium of a system.
If f is a convex function, then is a concave function, and with appropriate reinterpretation, all results for convex functions apply to concave functions. There are numerous classical and more recent references on convex functions; see, for example, the book by GH Hardy, JE Littlewood and G Polya , or references [43, 497], for more details.
B.1 Convex functions and the midpoint condition
An extended real-valued function is convex on if, for each , and ,
This reduces to the midpoint condition
if . (p.537)
Convexity may also be defined in terms of the epigraph of f. This is the subset in defined by
The function f is closed if is a closed set and is convex if and only if is a convex set in , as may be seen immediately from the definition of convexity in equation (B.1). Observe that, if f is a lower semicontinuous extended real-valued function, then f is necessarily closed. (A function f is lower semicontinuous at a point x0 if ).
Denote the Lebesgue (outer)-measure by μ. In the next theorem convex functions are shown to be continuous and thus Lebesgue measurable.
Suppose that f is convex on a closed interval I. If f is bounded in I, then f is continuous on I, except perhaps at the endpoints of I.
Put . Then f is bounded in , say in .
Let . Choose in , and a small real number, such that . Then
This can be rearranged into
and, if , then
Put and note that , and ;
Next, take and such that , and . Then, by the squeeze theorem for limits, f is left- and right-continuous at x.
Suppose that f satisfies the midpoint condition on a closed interval I. If f is bounded above in some open interval , then f is convex and bounded in J, and in .
The proof proceeds by generalising the midpoint condition through proving the following two claims.
CLAIM: If , and for , then
PROOF OF CLAIM: Put and consider m points in I.
(p.538) Then repeated application of the midpoint condition shows that(B.4)
If this relation is true for , then it will be true for all .
Consider points and put . Then and, by equation (B.4), it follows that
In other words, subtracting from both sides gives
This shows that equation (B.4) is true for any .
If , and , while , then a consequence of equation (B.4) is that(B.5)
for any rational number . This completes the proof of the claim.
If f is continuous, then a limit can be taken through rational numbers, and this will complete the proof. If f is not necessarily continuous, then prove the following claim.
CLAIM: It may be assumed that f is bounded on and infinite on .
PROOF OF CLAIM: Put , put , and assume there is a such that . Without loss of generality, suppose that .
Let and choose integers such that .
By equation (B.5),
Thus, , and put to find bounded in an enlarged interval .
A similar argument shows that can be grown by increasing d if there are points y with , with .
By theorem B.1, f is continuous in J since it is bounded on J.
By theorem B.1 and the last claim, f is continuous in J.
B.2 Derivatives of convex functions
Finite convex functions on open intervals are continuous and hence measurable.
Suppose that f is a finite and convex function on an open interval I. Then f has finite and non-decreasing left- and right-derivatives in I. Moreover, if is the left-derivative of f, and is the right-derivative of f, then .
Put and choose . Let and let small, such that . Then define . It follows that
by convexity of f. This shows that . Repeat the above to obtain a sequence of inequalities:
Choose such that for some . By continuity of f, . Thus is a non-decreasing function of x.
It remains to construct the derivatives of f.
CLAIM: The right-derivative of f exists and is finite in I.
PROOF OF CLAIM: It follows from the above that
Thus, for ,
These sums telescope to
if is substituted. Let and suppose that is a sequence of rational numbers which converges to . Taking and assuming that give(B.6)
by continuity of f and since is non-decreasing with x. This shows that is a non-decreasing function of h. Thus, exists in the extended real numbers, for every .
An argument similar to the above gives, for , and ,
A similar argument to the above proves the existence of the left-derivative , and shows that it is finite and non-decreasing.
(p.540) Finally, since is non-decreasing with x, and for ,
Taking shows that , as required. This proves the claim.
This completes the proof.
Thus, a finite convex function on an open interval has left- and right-derivatives everywhere. These derivatives are non-decreasing and, hence, measurable.
Finite convex functions are differentiable almost everywhere. This follows from a standard theorem in measure theory, namely, that a non-decreasing real-valued function on a closed interval is differentiable almost everywhere in the interval; see for example reference .
The proof that a convex function is differentiable almost everywhere uses Vitali converings. Let be an interval. The length of I is . A set E is covered (in the sense of Vitali) if there is an (infinite) collection of intervals such that and, for every , and , there is an such that and ; see for example reference .
The critical property about Vitali’s coverings is given by Vitali’s theorem.
Theorem B.4 (Vitali’s theorem)
Let E be a finite measure subset of the real line and let be a collection of intervals which cover E in the sense of Vitali. Given an , there is a finite and disjoint collection of intervals in , say , such that
where is Lebesgue outer measure.
To show that finite convex functions are differentiable almost everywhere, it is only necessary to show that non-decreasing functions are differentiable almost everywhere. To see this, suppose that and that f is a convex function. Put . Then , and
If , then replacing by in the last inequality gives .
In other words, if f is non-decreasing in an interval containing a point x, then it remains non-decreasing for larger values of x. A similar argument shows that, if f is non-increasing on an interval containing x, then it is non-increasing for all smaller values of x.
Hence, a convex function is either non-increasing or non-decreasing, or first non-increasing and then non-decreasing.
Let f be a non-decreasing real-valued function on . Then f is differentiable almost everywhere.
Define the following (Dini)-derivatives of f:
Since f is non-decreasing, , and similar relations are valid between the other derivatives.
By theorem B.3, , and .
For each pair of rational numbers , define the set
Then is the set of points where .
CLAIM: The set F is a null set: PROOF OF CLAIM: It is shown that for all pairs .
Put and let . There is an open set such that . By the definition of , there is an hx such that , and , for each .
The set of intervals is a Vitali covering, and there is a finite disjoint set which almost covers : if is the interior of interval Ij, then . Put . Then .
Let and put , the length of the j-th interval. Sum over the intervals:(B.7)
Each is the left endpoint of for some value of n and for small enough k (k can be found such that , by the definition of ).
The collection of intervals is a Vitali covering of A, and there is a finite disjoint collection almost covering A: .
Define so that . By the definition of ,(B.8)
Each of the Jj is contained in some interval Ii. Sum over those and index the Jj by ji to obtain
since f is increasing. Thus, by summing this over all i, one obtains
Take to find that . Since , this implies that , with the result that . This completes the proof of the claim.
By this claim, , and , for almost every x. This shows that f is differentiable almost everywhere (since it is real valued and finite). This completes the proof.
If f is a real-valued non-decreasing function on , then is measurable.
By theorem B.5, is defined almost everywhere in , except on a null set A (a zero-measure set).
Define for and put if , and put if . Then pointwise for almost all x, so that g is measurable. Since almost everywhere in , this shows that is measurable in .
Since a convex function is either non-increasing, non-decreasing or first non-increasing and then non-decreasing, it follows from theorem B.5 and corollary B.6 that, if f is convex on , then f is differentiable almost everywhere in , and is measurable in . Since the Lebesgue measure is σ-finite, these properties extend naturally to all of .
By theorem B.3, if f is a convex function, then is non-decreasing, and so is continuous and differentiable almost everywhere.
If f is a real-valued convex function, then f is differentiable everywhere, except on a countable set of points.
Since f is convex, is non-decreasing.
Thus, is continuous almost everywhere.
To see that the number of points where is not continuous is countable, consider first the case that f is convex and real valued (and so finite) on . Put and define the jump function
If , then is continuous at x. If , then is said to be discontinuous at x.
(p.543) The gap at x is the interval , where the left- and right-limits are . It follows that , the length of this interval.
Since is non-decreasing, the intervals Ix compose a disjoint family.
Put Nk equal to the number of intervals with . Then , or . Thus, Nk is finite for each k. Since the number of points where is discontinuous is less than the number of Ix is countable. Hence, the number of points where is discontinuous is countable. This completes the proof.
Thus, finite real-valued convex functions are differentiable except on a countable sets of points.
If is a sequence of convex functions, then, by Fatou’s lemma, is measurable. If, in addition, pointwise almost everywhere (or in measure), then f is measurable. By Lusin’s theorem, there is a continuous function g such that , except in a set of small Lebesgue measure.
More generally, if is a Cauchy in measure sequence of functions on a finite measure set, then there exists a measurable f such that in measure on the finite measure set. By the σ-finiteness of the Lebesgue measure, these results extend to all of .
Suppose that is a sequence of convex functions converging pointwise to a limit f almost everywhere. Then f is convex.
Without loss of generality the ae condition may be ignored.
Suppose first that fn and f are defined on the closed interval .
Suppose that f is not convex on . That is, there exist points such that .
Put . Since is convex,
For every , there exist Nc, Nd and N in such that
(1) for all , ;
(2) for all , ; and
(3) for all , .
Take . Then, by (1), (2) and (3) above, , or . This shows that , since is arbitrary. This is a contradiction, and hence f is convex on . Since the Lebesgue measure is σ-finite, this extends to .
(p.544) By lemma B.8, the limit of a sequence of convex functions is itself convex. The limit is differentiable almost everywhere. It is also the limit of the sequence of derivatives almost everywhere:
Suppose that is a sequence of convex functions converging pointwise to a limit f almost everywhere. Then f is convex. Moreover, the sequence of derivatives converges to almost everywhere.
Without loss of generality the ae condition may be ignored. Put and .
By lemma B.8, f is convex and differentiable almost everywhere. Without loss of generality, assume that f and fn are differentiable everywhere.
By theorem B.3, fn and f have non-decreasing left- and right-derivatives everywhere.
Next, note by convexity of fn that, for fixed , and ,
Rearrange this to see that
Take on the left-hand side and put . This gives(B.9)
Choose an . Fix . Then , and there is an such that for all ,(B.10)
Take the limit superior of the left-hand side. This gives
Take and then to find that .
A similar argument shows that .
Since f is differentiable almost everywhere by theorem B.5, for almost all x, and thus pointwise almost everywhere. This completes the proof.
(p.545) B.4 The Legendre transform
Suppose that f is an extended real-valued function and that . The essential domain of f is defined by the set
If , then .
The Legendre transform of an extended real-valued function f of one variable is usually defined by the supremum
The transform is necessarily convex. To see this, consider
for any . Take the supremum on the right-hand side to see that .
If f is itself convex, finite and closed, then it can be recovered from its Legendre transform, since , as will be seen below.
The above definition generalised to real-valued functions on is the Legendre-Fenchel transformation .
Instead of using equation (B.12) as a definition, consider the alternative definition, where is defined such that
Integrating the left-hand side with respect to x for fixed p shows that , and the right-hand side with respect to p for fixed x shows . This shows that
a result which follows essentially by inverting the derivatives of f and above.
This result is valid if f and are differentiable almost everywhere, in which case the last equality is true almost everywhere. More generally, definition (B.12) is used and is defined for a more general class of (measurable) functions f.
In the case that f is convex, then it is differentiable almost everywhere, and its derivative is monotone and continuous almost everywhere. This shows that has a solution almost everywhere, which is a convex function in p.
The above can be implemented by considering the subdifferential of f instead of the derivative. The subdifferential of f at x is the set defined by
This defines a map , which is not a bijection but which may be inverted at p by finding an xp such that . The function xp is not necessarily unique.
(p.546) This in particular shows that and so one may choose to maximise the right-hand side. This gives the definition for in equation (B.12) by inverting the subdifferential of f and generalising the arguments following equation (B.14).
The Legendre transform is necessarily closed and convex.
Define . Then is the supremum of over . This shows that . The epigraph of gx is a closed and convex set for each x. Thus, is closed and convex, since it is the intersection of closed and convex sets. Hence, is convex.
Since is convex, it is continuous and also differentiable æ. By theorem B.7, it is in fact differentiable everywhere except on a countable set of points in its essential domain.
Since and is convex, it follows that . Thus, the Legendre transform of (or the biconjugate of f) is defined by
In addition, is convex and therefore continuous and also differentiable almost everywhere, by lemma B.10.
For a measurable function f,
(a) is convex, and ;
(b) if , then f is closed and convex; and
(c) if f is closed and convex, then on the essential domain of f.
The function is convex, and is closed, by lemma B.10. For any , and , ; therefore,
Take the supremum on the left-hand side for to see that .
If , then, by lemma B.10, f is closed and convex.
Fix and assume f is closed and convex. Then , and . Let ; then it follows from equation (B.16) that
Choose y such that (since f is closed, there exists such a y). Then , and this shows that .
Hence, if f is finite and convex, then if f is closed. If f is a convex function on , then it is lower semicontinuous and so closed; thus .