## E.J. Janse van Rensburg

Print publication date: 2015

Print ISBN-13: 9780199666577

Published to Oxford Scholarship Online: August 2015

DOI: 10.1093/acprof:oso/9780199666577.001.0001

Show Summary Details
Page of

PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2017. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a monograph in OSO for personal use (for details see http://www.oxfordscholarship.com/page/privacy-policy). Subscriber: null; date: 26 February 2017

# (p.536) Appendix B Convex functions

Source:
The Statistical Mechanics of Interacting Walks, Polygons, Animals and Vesicles
Publisher:
Oxford University Press

Free energies $F$ of models of interacting lattice clusters are convex functions (as shown in theorem 3.3). This has certain consequences; namely, that $F$ is differentiable almost everywhere (if it is finite), and finite size approximations to the free energy of an infinite system may be shown to converge to the free energy almost everywhere. The convexity of $F$ is related to the stability of the thermodynamic equilibrium of a system.

If f is a convex function, then $−f$ is a concave function, and with appropriate reinterpretation, all results for convex functions apply to concave functions. There are numerous classical and more recent references on convex functions; see, for example, the book by GH Hardy, JE Littlewood and G Polya [274], or references [43, 497], for more details.

# B.1 Convex functions and the midpoint condition

An extended real-valued function $f:R→R∪{∞}$ is convex on $I=[a,b]$ if, for each $0≤λ≤1$, and $a≤x,

(B.1)
$Display mathematics$

This reduces to the midpoint condition

(B.2)
$Display mathematics$

if $λ=12$. (p.537)

Fig. B.1 The epigraph of a function f is the shaded region with boundary f and vertical lines as indicated. The function f is convex if and only if $epif$ is a convex set in the plane.

Convexity may also be defined in terms of the epigraph of f. This is the subset in $R×R$ defined by

(B.3)
$Display mathematics$

The function f is closed if $epif$ is a closed set and is convex if and only if $epif$ is a convex set in $R2$, as may be seen immediately from the definition of convexity in equation (B.1). Observe that, if f is a lower semicontinuous extended real-valued function, then f is necessarily closed. (A function f is lower semicontinuous at a point x0 if ).

Denote the Lebesgue (outer)-measure by μ‎. In the next theorem convex functions are shown to be continuous and thus Lebesgue measurable.

Theorem B.1

Suppose that f is convex on a closed interval I. If f is bounded in I, then f is continuous on I, except perhaps at the endpoints of I.

Proof

Put $I=[a,b]$. Then f is bounded in $(a,b)$, say $f in $(a,b)$.

Let $x∈(a,b)$. Choose $n>m$ in $N$, and $δ>0$ a small real number, such that $x+nδ∈(a,b)$. Then

$Display mathematics$

This can be rearranged into

$Display mathematics$

and, if $δ→−δ$, then

$Display mathematics$

Put $m=1$ and note that $f(x+δ)+f(x−δ)≥2f(x)$, and $f;

$Display mathematics$

Next, take $δ→0$ and $n→∞$ such that $x±nδ∈(c,d)$, and $nδ→0$. Then, by the squeeze theorem for limits, f is left- and right-continuous at x.

Theorem B.2

Suppose that f satisfies the midpoint condition on a closed interval I. If f is bounded above in some open interval $J⊂I$, then f is convex and bounded in J, and $f(x)=+∞$ in $I∖J$.

Proof

The proof proceeds by generalising the midpoint condition through proving the following two claims.

CLAIM: If $λr∈Q$, and for $x,y∈I$, then

$Display mathematics$

PROOF OF CLAIM: Put $m=2n$ and consider m points ${xi}i=1m$ in I.

(p.538) Then repeated application of the midpoint condition shows that

(B.4)
$Display mathematics$

If this relation is true for $m−1$, then it will be true for all $m∈N$.

Consider $m−1$ points ${xi}i=1m−1$ and put $xm=1m−1(x1+x2+⋯+xm−1)$. Then $mxm=(m−1)xm+xm=(x1+x2+⋯+xm)$ and, by equation (B.4), it follows that

$Display mathematics$

In other words, subtracting $f(xm)$ from both sides gives

$Display mathematics$

This shows that equation (B.4) is true for any $m∈N$.

If $p+q=m$, and $x=x1=x2=⋯=xp$, while $y=xp+1=xp+2=⋯=xm$, then a consequence of equation (B.4) is that

(B.5)
$Display mathematics$

for any rational number $λp=pm$. This completes the proof of the claim.

If f is continuous, then a limit $λp→λ$ can be taken through rational numbers, and this will complete the proof. If f is not necessarily continuous, then prove the following claim.

CLAIM: It may be assumed that f is bounded on $J⊆I$ and infinite on $I∖J$.

PROOF OF CLAIM: Put $I=[a,b]$, put $J=(c,d)$, and assume there is a $y∈I∖J$ such that $f(y)<∞$. Without loss of generality, suppose that $a.

Let $x∈(y,c)$ and choose integers $p>q$ such that $χ=y+pq(x−y)∈(c,d)$.

By equation (B.5),

$Display mathematics$

Thus, $f(x)<∞$, and put $c=y$ to find $f(x)$ bounded in an enlarged interval $(c,d)$.

A similar argument shows that $(c,d)$ can be grown by increasing d if there are points y with $d, with $f(y)<∞$.

By theorem B.1, f is continuous in J since it is bounded on J.

By theorem B.1 and the last claim, f is continuous in J.

# B.2 Derivatives of convex functions

Finite convex functions on open intervals are continuous and hence measurable.

Theorem B.3

Suppose that f is a finite and convex function on an open interval I. Then f has finite and non-decreasing left- and right-derivatives in I. Moreover, if $D−f=d−dxf$ is the left-derivative of f, and $D+f=d+dxf$ is the right-derivative of f, then $D−f≤D+f$.

(p.539)

Proof

Put $I=(a,b)$ and choose $x∈I$. Let $n∈N$ and let $h>0$ small, such that $x+(n+1n)h∈I$. Then define $q(x,h)=f(x+h)−f(x)$. It follows that

$Display mathematics$

by convexity of f. This shows that $qx+hn,h≥q(x,h)$. Repeat the above to obtain a sequence of inequalities:

$Display mathematics$

Choose $mn∈Q$ such that $mn→δn$ for some $δ>0$. By continuity of f, $q(x,h)≤q(x+δ,h)$. Thus $q(x,h)$ is a non-decreasing function of x.

It remains to construct the derivatives of f.

CLAIM: The right-derivative of f exists and is finite in I.

PROOF OF CLAIM: It follows from the above that

$Display mathematics$

Thus, for $0,

$Display mathematics$

These sums telescope to

$Display mathematics$

if $q(x,h)=f(x+h)−f(x)$ is substituted. Let $0 and suppose that $mn$ is a sequence of rational numbers which converges to $1hh′$. Taking $n→∞$ and assuming that $x give

(B.6)
$Display mathematics$

by continuity of f and since $q(x,h)$ is non-decreasing with x. This shows that $1h(f(x+h)−f(x))$ is a non-decreasing function of h. Thus, $limh→0+1h(f(x+h)−f(x))$ exists in the extended real numbers, for every $x∈(a,b)$.

Taking $h′→0$ in equation (B.6) with y and h fixed shows that $D+f<∞$. Since $x, taking $h→0+$ in equation (B.6) also shows that $D+f(x)$ is non-decreasing with x.

An argument similar to the above gives, for $0>h′>h$, and $x>y$,

$Display mathematics$

A similar argument to the above proves the existence of the left-derivative $D−f$, and shows that it is finite and non-decreasing.

(p.540) Finally, since $q(x,h)$ is non-decreasing with x, and for $h>0$,

$Display mathematics$

Taking $h→0+$ shows that $D−f≤D+f$, as required. This proves the claim.

This completes the proof.

Thus, a finite convex function on an open interval has left- and right-derivatives everywhere. These derivatives are non-decreasing and, hence, measurable.

Finite convex functions are differentiable almost everywhere. This follows from a standard theorem in measure theory, namely, that a non-decreasing real-valued function on a closed interval is differentiable almost everywhere in the interval; see for example reference [497].

The proof that a convex function is differentiable almost everywhere uses Vitali converings. Let $I=[a,b]$ be an interval. The length of I is $l(I)=b−a$. A set E is covered (in the sense of Vitali) if there is an (infinite) collection of intervals $I={Ii}$ such that $E⊂∪iIi$ and, for every $x∈E$, and $ϵ>0$, there is an $I∈I$ such that $x∈I$ and $l(I)<ϵ$; see for example reference [497].

The critical property about Vitali’s coverings is given by Vitali’s theorem.

Theorem B.4 (Vitali’s theorem)

Let E be a finite measure subset of the real line and let $I$ be a collection of intervals which cover E in the sense of Vitali. Given an $ϵ>0$, there is a finite and disjoint collection of intervals in $I$, say $I1,I2,…,IN$, such that

$Display mathematics$

where $μ∗$ is Lebesgue outer measure.

To show that finite convex functions are differentiable almost everywhere, it is only necessary to show that non-decreasing functions are differentiable almost everywhere. To see this, suppose that $x and that f is a convex function. Put $λ=y−zx−z$. Then $0<λ<1$, and

$Display mathematics$

If $f(x)≤f(y)$, then replacing $f(x)$ by $f(y)$ in the last inequality gives $f(y)≤f(z)$.

In other words, if f is non-decreasing in an interval containing a point x, then it remains non-decreasing for larger values of x. A similar argument shows that, if f is non-increasing on an interval containing x, then it is non-increasing for all smaller values of x.

Hence, a convex function is either non-increasing or non-decreasing, or first non-increasing and then non-decreasing.

Theorem B.5

Let f be a non-decreasing real-valued function on $[a,b]$. Then f is differentiable almost everywhere.

(p.541)

Proof

Define the following (Dini)-derivatives of f:

$Display mathematics$

Since f is non-decreasing, $D+f(x)≥D−f(x)$, and similar relations are valid between the other derivatives.

By theorem B.3, $D+f(x)=D+f(x)$, and $D−f(x)=D−f(x)$.

For each pair of rational numbers $(u,v)$, define the set

$Display mathematics$

Then $F=⋃u,v∈QEu,v$ is the set of points where $D+f(x)>D−f(x)$.

CLAIM: The set F is a null set: $μF=0$ PROOF OF CLAIM: It is shown that $μEu,v=0$ for all pairs $(u,v)∈Q2$.

Put $μEu,v=s$ and let $ϵ>0$. There is an open set $U⊇Eu,v$ such that $μU. By the definition of $D−f(x)$, there is an hx such that $Ix=[x−hx,x]⊂U$, and $f(x)−f(x−hx), for each $x∈U$.

The set of intervals $[x−hx,x]$ is a Vitali covering, and there is a finite disjoint set $I1,I2,…,IN$ which almost covers $Eu,v$: if $Ijo$ is the interior of interval Ij, then $⋃Ijo∩U>s−ϵ$. Put $A=⋃Ijo⊆U$. Then $μA≤μU≤s+ϵ$.

Let $Ij=[xj−hj,xj]$ and put $ℓ(Ij)=hj$, the length of the j-th interval. Sum over the intervals:

(B.7)
$Display mathematics$

Each $y∈A$ is the left endpoint of $Jy=(y,y+k)∈In$ for some value of n and for small enough k (k can be found such that $f(y+k)−f(y)>uk$, by the definition of $D+f(y)$).

The collection of intervals ${Jy}$ is a Vitali covering of A, and there is a finite disjoint collection ${J1,J2,…,Jm}$ almost covering A: $⋃Jj∩A>s−2ϵ$.

Define $Jj=(yj+kj,yj)$ so that $ℓ(Jj)=kj$. By the definition of $D+f(y)$,

(B.8)
$Display mathematics$

Each of the Jj is contained in some interval Ii. Sum over those $Jj⊂Ii$ and index the Jj by ji to obtain

(p.542)

$Display mathematics$

since f is increasing. Thus, by summing this over all i, one obtains

$Display mathematics$

This shows by equations (B.7) and (B.8) that $v(s+ϵ)≥u(s−2ϵ)$.

Take $ϵ→0+$ to find that $vs≥us$. Since $u>v$, this implies that $s=0$, with the result that $μEu,v=0$. This completes the proof of the claim.

By this claim, $μF=0$, and $D+f(x)=D−f(x)$, for almost every x. This shows that f is differentiable almost everywhere (since it is real valued and finite). This completes the proof.

Corollary B.6

If f is a real-valued non-decreasing function on $[a,b]$, then $f′$ is measurable.

Proof

By theorem B.5, $g=f′$ is defined almost everywhere in $[a,b]$, except on a null set A (a zero-measure set).

Define $gn(x)=nf(x+1n)−f(x)$ for $x∈[a,b]∖A$ and put $f(x)=f(a)$ if $x≤a$, and put $f(x)=f(b)$ if $x≥b$. Then $gn(x)→g(x)$ pointwise for almost all x, so that g is measurable. Since $g=f′$ almost everywhere in $[a,b]$, this shows that $f′$ is measurable in $[a,b]$.

Since a convex function is either non-increasing, non-decreasing or first non-increasing and then non-decreasing, it follows from theorem B.5 and corollary B.6 that, if f is convex on $[a,b]$, then f is differentiable almost everywhere in $(a,b)$, and $f′$ is measurable in $[a,b]$. Since the Lebesgue measure is σ‎-finite, these properties extend naturally to all of $R$.

By theorem B.3, if f is a convex function, then $f′$ is non-decreasing, and so $f′$ is continuous and differentiable almost everywhere.

Theorem B.7

If f is a real-valued convex function, then f is differentiable everywhere, except on a countable set of points.

Proof

Since f is convex, $f′$ is non-decreasing.

Thus, $f′$ is continuous almost everywhere.

To see that the number of points where $f′$ is not continuous is countable, consider first the case that f is convex and real valued (and so finite) on $[a,b]$. Put $A=f(b)−f(a)$ and define the jump function

$Display mathematics$

If $J(x)=0$, then $f′$ is continuous at x. If $J(x)>0$, then $f′$ is said to be discontinuous at x.

(p.543) The gap at x is the interval $Ix=f′(x−),f′(x+)$, where the left- and right-limits are $f′(x±)=limh→0+f′(x±h)$. It follows that $J(x)=lengthIx$, the length of this interval.

Since $f′$ is non-decreasing, the intervals Ix compose a disjoint family.

Put Nk equal to the number of intervals with $J(x)≥2−k$. Then $2−kNk≤A$, or $Nk≤2kA<∞$. Thus, Nk is finite for each k. Since the number of points where $f′$ is discontinuous is less than $⋃kNk$ the number of Ix is countable. Hence, the number of points where $f′$ is discontinuous is countable. This completes the proof.

Thus, finite real-valued convex functions are differentiable except on a countable sets of points.

# B.3 Convergence

If $fn$ is a sequence of convex functions, then, by Fatou’s lemma, is measurable. If, in addition, $fn→f$ pointwise almost everywhere (or in measure), then f is measurable. By Lusin’s theorem, there is a continuous function g such that $f=g$, except in a set of small Lebesgue measure.

More generally, if $fn$ is a Cauchy in measure sequence of functions on a finite measure set, then there exists a measurable f such that $fn→f$ in measure on the finite measure set. By the σ‎-finiteness of the Lebesgue measure, these results extend to all of $R$.

Lemma B.8

Suppose that $fn$ is a sequence of convex functions converging pointwise to a limit f almost everywhere. Then f is convex.

Proof

Without loss of generality the ae condition may be ignored.

Suppose first that fn and f are defined on the closed interval $[a,b]$.

Suppose that f is not convex on $[a,b]$. That is, there exist points $c,d∈[a,b]$ such that $f(c)+f(d)<2f(12(c+d))$.

Put $A=2f(12(c+d))−f(c)−f(d)>0$. Since $fn(x)$ is convex,

$Display mathematics$

For every $ϵ>0$, there exist Nc, Nd and N in $N$ such that

1. (1) for all $n>Nc$, $fn(c)−f(c)<ϵ$;

2. (2) for all $n>Nd$, $fn(d)−f(d)<ϵ$; and

3. (3) for all $n>N$, $fn(12(c+d))−f(12(c+d))<ϵ$.

However,

$Display mathematics$

Take $n>max{Nc,Nd,N}$. Then, by (1), (2) and (3) above, $2ϵ≥A−2ϵ$, or $A≤4ϵ$. This shows that $A≤0$, since $ϵ>0$ is arbitrary. This is a contradiction, and hence f is convex on $[a,b]$. Since the Lebesgue measure is σ‎-finite, this extends to $R$.

(p.544) By lemma B.8, the limit of a sequence of convex functions is itself convex. The limit is differentiable almost everywhere. It is also the limit of the sequence of derivatives almost everywhere:

Theorem B.9

Suppose that $fn$ is a sequence of convex functions converging pointwise to a limit f almost everywhere. Then f is convex. Moreover, the sequence of derivatives $fn′$ converges to $f′$ almost everywhere.

Proof

Without loss of generality the ae condition may be ignored. Put $D+≡d+dx$ and $D−≡d−dx$.

By lemma B.8, f is convex and differentiable almost everywhere. Without loss of generality, assume that f and fn are differentiable everywhere.

By theorem B.3, fn and f have non-decreasing left- and right-derivatives everywhere.

Next, note by convexity of fn that, for fixed $x, and $λ∈(0,1)$,

$Display mathematics$

Rearrange this to see that

$Display mathematics$

Take $λ→0+$ on the left-hand side and put $y=x+h$. This gives

(B.9)
$Display mathematics$

Choose an $ϵ>0$. Fix $x∈R$. Then $fn(x)→f(x)$, and there is an $N0∈N$ such that for all $n≥N0$,

(B.10)
$Display mathematics$

Suppose that $n≥N0$. By equations (B.9) and (B.10),

$Display mathematics$

Take the limit superior of the left-hand side. This gives

$Display mathematics$

Take $ϵ→0+$ and then $h→0+$ to find that .

A similar argument shows that .

Since f is differentiable almost everywhere by theorem B.5, $limn→∞ddxfn(x)=f′(x)$ for almost all x, and thus $fn′→f′$ pointwise almost everywhere. This completes the proof.

# (p.545) B.4 The Legendre transform

Suppose that f is an extended real-valued function and that $f>−∞$. The essential domain of f is defined by the set

(B.11)
$Display mathematics$

If $f<∞$, then $domf=R$.

The Legendre transform of an extended real-valued function f of one variable is usually defined by the supremum

(B.12)
$Display mathematics$

The transform $f∗$ is necessarily convex. To see this, consider

(B.13)
$Display mathematics$

for any $x∈domf$. Take the supremum on the right-hand side to see that $f∗(p)+f∗(q)≥2f(12(p+q))$.

If f is itself convex, finite and closed, then it can be recovered from its Legendre transform, since $f∗∗=f$, as will be seen below.

The above definition generalised to real-valued functions on $Rn$ is the Legendre-Fenchel transformation [550].

Instead of using equation (B.12) as a definition, consider the alternative definition, where $f∗$ is defined such that

(B.14)
$Display mathematics$

Integrating the left-hand side with respect to x for fixed p shows that $px=f(x)+Cp$, and the right-hand side with respect to p for fixed x shows $px=f∗(p)+Cx′$. This shows that

(B.15)
$Display mathematics$

a result which follows essentially by inverting the derivatives of f and $f∗$ above.

This result is valid if f and $f∗$ are differentiable almost everywhere, in which case the last equality is true almost everywhere. More generally, definition (B.12) is used and is defined for a more general class of (measurable) functions f.

In the case that f is convex, then it is differentiable almost everywhere, and its derivative is monotone and continuous almost everywhere. This shows that $p=ddxf(x)$ has a solution $x∗=f∗(p)$ almost everywhere, which is a convex function in p.

The above can be implemented by considering the subdifferential of f instead of the derivative. The subdifferential of f at x is the set $∂f(x)$ defined by

(B.16)
$Display mathematics$

This defines a map $x→∂f(x)$, which is not a bijection but which may be inverted at p by finding an xp such that $p∈∂f(xp)$. The function xp is not necessarily unique.

(p.546) This in particular shows that $f(y)−py≥f(xp)−pxp$ and so one may choose $xp∈domf$ to maximise the right-hand side. This gives the definition for $f∗$ in equation (B.12) by inverting the subdifferential of f and generalising the arguments following equation (B.14).

Lemma B.10

The Legendre transform $f∗$ is necessarily closed and convex.

Proof

Define $gx(p)=px−f(x)$. Then $f∗(p)$ is the supremum of $gx(p)$ over $x∈R$. This shows that $epif∗=⋂xepigx$. The epigraph of gx is a closed and convex set for each x. Thus, $epif∗$ is closed and convex, since it is the intersection of closed and convex sets. Hence, $f∗$ is convex.

Since $f∗$ is convex, it is continuous and also differentiable æ. By theorem B.7, it is in fact differentiable everywhere except on a countable set of points in its essential domain.

Since $f>−∞$ and $f∗$ is convex, it follows that $f∗>−∞$. Thus, the Legendre transform of $f∗$ (or the biconjugate of f) is defined by

(B.17)
$Display mathematics$

In addition, $f∗∗$ is convex and therefore continuous and also differentiable almost everywhere, by lemma B.10.

Theorem B.11

For a measurable function f,

1. (a) $f∗∗$ is convex, and $f∗∗≤f$;

2. (b) if $f∗∗=f$, then f is closed and convex; and

3. (c) if f is closed and convex, then $f∗∗=f$ on the essential domain of f.

Proof

The function $f∗∗$ is convex, and $epif∗$ is closed, by lemma B.10. For any $x∈R$, and $p∈domf∗$, $f∗(p)≤px−f(x)$; therefore,

$Display mathematics$

Take the supremum on the left-hand side for $p∈domf∗$ to see that $f∗∗≤f$.

If $f∗∗=f$, then, by lemma B.10, f is closed and convex.

Fix $x∈domf$ and assume f is closed and convex. Then $f(x)<∞$, and $∂f(x)≠∅$. Let $p∈∂f(x)$; then it follows from equation (B.16) that

$Display mathematics$

Choose y such that $f∗(p)=py−f(y)$ (since f is closed, there exists such a y). Then $f∗(p)=py−f(y)≤px−f(x)$, and this shows that $f(x)≤px−f∗(p)≤f∗∗(x)$.

Hence, if f is finite and convex, then $f∗∗=f$ if f is closed. If f is a convex function on $R$, then it is lower semicontinuous and so closed; thus $f∗∗=f$.