L^p Spaces and Convexity
These notes largely follow the treatments in Royden, Real Analysis, and Rudin, Real & Complex Analysis.

1. Convex functions

Let $I \subset \mathbb{R}$ be an interval. For $I$ open, we say a function $\varphi : I \to \mathbb{R}$ is convex if for every $a, b \in I$ and every $\lambda \in (0, 1)$, we have

$$\varphi(\lambda b + (1 - \lambda)a) \le \lambda\varphi(b) + (1 - \lambda)\varphi(a). \tag{1}$$

(Note we do not assume that $\varphi$ is differentiable; for example, $\varphi(x) = |x|$ is convex.) If $I$ is not open, then we say $\varphi : I \to \mathbb{R}$ is convex if (1) is satisfied and $\varphi$ is continuous at any endpoint of $I$. Geometrically, $\varphi$ is convex if every secant line segment lies on or above the graph of $\varphi$.

A convex function $\varphi$ is said to be strictly convex if whenever equality holds in (1) for some $\lambda \in (0, 1)$ and $a, b \in I$, then $a = b$. In other words, $\varphi$ is strictly convex if for every $a \ne b$ in $I$ and $\lambda \in (0, 1)$,

$$\varphi(\lambda b + (1 - \lambda)a) < \lambda\varphi(b) + (1 - \lambda)\varphi(a).$$

Here are a few lemmas about convex functions, whose proofs are left as exercises.

Lemma 1. Let $\varphi$ be a convex function, and let $a, b \in I$ with $a < b$. Assume there is a $\lambda \in (0, 1)$ so that $\varphi(\lambda b + (1 - \lambda)a) = \lambda\varphi(b) + (1 - \lambda)\varphi(a)$. Then the restriction of $\varphi$ to $[a, b]$ is linear.

Lemma 2. A convex function $\varphi$ is strictly convex if and only if its graph contains no line segments.

Lemma 3. Each tangent line to the graph of a differentiable strictly convex function $\varphi$ intersects the graph of $\varphi$ only at the point of tangency.

Lemma 4. Any convex function is continuous.

If $\varphi : I \to \mathbb{R}$ is convex and $x_0 \in I$, then the line given by the graph of $\ell(x) = \varphi(x_0) + m(x - x_0)$ is a supporting line of $\varphi$ at $x_0$ if $\varphi(x) \ge \ell(x)$ for all $x \in I$.

Proposition 5. Let $\varphi : I \to \mathbb{R}$ be convex, and let $x_0 \in I^\circ$. Then there is a supporting line for $\varphi$ at $x_0$.

Proof. For $x \in I - \{x_0\}$, let

$$m(x) = \frac{\varphi(x) - \varphi(x_0)}{x - x_0}.$$

We claim that $m$ is an increasing function of $x$. To prove the claim, first let $x_0 < x' < x$ be points in $I$, and define $\lambda \in (0, 1)$ so that $x' = \lambda x + (1 - \lambda)x_0$. Consider the secant line from $x_0$ to $x$.
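Then the convexity of $\varphi$ implies

$$\varphi(x') = \varphi(\lambda x + (1 - \lambda)x_0) \le \lambda\varphi(x) + (1 - \lambda)\varphi(x_0).$$

Compute

$$\lambda = \frac{x' - x_0}{x - x_0}, \qquad 1 - \lambda = \frac{x - x'}{x - x_0},$$

so that

$$\begin{aligned}
\varphi(x') &\le \frac{x' - x_0}{x - x_0}\,\varphi(x) + \frac{x - x'}{x - x_0}\,\varphi(x_0) \\
&= \frac{x' - x_0}{x - x_0}\,\varphi(x) - \frac{x' - x_0}{x - x_0}\,\varphi(x_0) + \frac{x' - x_0}{x - x_0}\,\varphi(x_0) + \frac{x - x'}{x - x_0}\,\varphi(x_0) \\
&= \frac{x' - x_0}{x - x_0}\,[\varphi(x) - \varphi(x_0)] + \varphi(x_0) \\
&= (x' - x_0)\,m(x) + \varphi(x_0).
\end{aligned}$$

Since $x' - x_0 > 0$, this gives

$$\frac{\varphi(x') - \varphi(x_0)}{x' - x_0} \le m(x), \quad\text{that is,}\quad m(x') \le m(x).$$

The other cases, $x' < x_0 < x$ and $x' < x < x_0$, are similar.

Now since $m(x)$ is increasing on $I - \{x_0\}$, the one-sided limits $m_+ = \lim_{x \to x_0^+} m(x)$ and $m_- = \lim_{x \to x_0^-} m(x)$ exist and satisfy $m_+ \ge m_-$. We claim that if $m_- \le m \le m_+$, then $\ell(x) = \varphi(x_0) + m(x - x_0)$ is a supporting line of $\varphi$ at $x_0$. Since $m \le m_+ \le m(x)$ for all $x > x_0$,

$$\ell(x) = \varphi(x_0) + m(x - x_0) \le \varphi(x_0) + m(x)(x - x_0) = \varphi(x).$$

Similarly, since $m \ge m(x)$ and $x - x_0 < 0$ for all $x < x_0$, we also see $\ell(x) \le \varphi(x)$ for all $x < x_0$, and thus the graph of $\ell$ is a supporting line for $\varphi$ at $x_0$.

Corollary 6. Let $\varphi$ be a differentiable convex function. Let $a, b \in I$. Then $\varphi(b) \ge \varphi(a) + \varphi'(a)(b - a)$. In other words, the graph of $\varphi$ lies above the graph of each tangent line.

For the proof, just recognize $m_+ = m_- = \varphi'(a)$ in this case.

Proposition 7. If $\varphi : I \to \mathbb{R}$ is strictly convex, and the graph of $\ell$ is a supporting line of $\varphi$ at $x_0 \in I^\circ$, then for all $x \in I - \{x_0\}$, $\varphi(x) > \ell(x)$.

Proof. Apply Lemma 2: if $\varphi(x_1) = \ell(x_1)$ for some $x_1 \ne x_0$, then convexity gives $\varphi \le \ell$ on the segment between $x_0$ and $x_1$, while $\varphi \ge \ell$ on all of $I$, so the graph of $\varphi$ would contain a line segment.

Proposition 8. Let $\varphi : I \to \mathbb{R}$ be continuous, and assume $\varphi'' > 0$ on the interior $I^\circ$ of $I$. Then $\varphi$ is strictly convex on $I$.

Proof. Since $\varphi'' > 0$, we see that $\varphi'$ is strictly increasing on $I^\circ$. Let $a < b$ in $I$. Define

$$\psi(t) = \varphi(tb + (1 - t)a) - t\varphi(b) - (1 - t)\varphi(a).$$

We want to show $\psi < 0$ on $(0, 1)$. Note $\psi(0) = \psi(1) = 0$. Now

$$\psi'(t) = \varphi'(tb + (1 - t)a)(b - a) - \varphi(b) + \varphi(a)$$

is strictly increasing on $(0, 1)$. Since $\psi(0) = \psi(1) = 0$, there is a $T \in (0, 1)$ where $\psi'(T) = 0$ (either there is a local extremum point or $\psi$ is constant; this is Rolle's Theorem).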
Since $\psi'$ is strictly increasing, we have $\psi'(t) < 0$ for $t \in (0, T)$ and $\psi'(t) > 0$ for $t \in (T, 1)$. Therefore, $\psi$ is strictly decreasing on $[0, T]$ and strictly increasing on $[T, 1]$. Since $\psi(0) = \psi(1) = 0$, we find $\psi(t) < 0$ for $t \in (0, 1)$.

Corollary 9. For $p \in (1, \infty)$, the function $x \mapsto x^p$ is strictly convex on $[0, \infty)$. The exponential function $\exp x = e^x$ is strictly convex on $(-\infty, \infty)$.

2. The Banach space L^p

Let $(X, \mathcal{M}, \mu)$ be a measure space. For a measurable function $f$, define

$$\|f\|_{L^p} = \left( \int_X |f|^p \, d\mu \right)^{1/p}.$$

Then we define

$$L^p(X) = \{ f : X \to \mathbb{R} \text{ (or } \mathbb{C}) : \|f\|_{L^p} < \infty \} / \sim,$$

where as usual $f \sim g$ if $f = g$ almost everywhere.

Proposition 10. $\| \cdot \|_{L^p}$ is a norm on $L^p(X)$.

Proof.
• It is obvious that $\|f\|_{L^p} \ge 0$ always, and if $\|f\|_{L^p} = 0$, then we have $\int_X |f|^p \, d\mu = 0$, which implies $|f|^p = 0$ a.e. This is equivalent to $f = 0$ a.e.
• It is obvious that if $\alpha$ is a constant, then $\|\alpha f\|_{L^p} = |\alpha| \cdot \|f\|_{L^p}$.
• The Triangle Inequality is harder, and we cover it in Minkowski's Theorem below.

Theorem 1 (Minkowski's Theorem). Let $p \in [1, \infty]$. If $f, g \in L^p(X)$, then

$$\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}. \tag{2}$$

If $p \in (1, \infty)$, then equality can hold only if there are nonnegative constants $\alpha, \beta$, not both zero, so that $\beta f = \alpha g$ almost everywhere. Moreover, if $f, g \ge 0$ are measurable (but not necessarily in $L^p(X)$), then (2) holds.

Proof. We have already addressed the cases of $p = 1, \infty$. So we may assume $p \in (1, \infty)$. Also, if $\|f\|_{L^p} = 0$, then $f = 0$ a.e., and the conclusion is valid; the same holds if $\|g\|_{L^p} = 0$. So now assume $p \in (1, \infty)$, $\alpha = \|f\|_{L^p} > 0$, and $\beta = \|g\|_{L^p} > 0$. Let $f_0 = \alpha^{-1}|f|$ and $g_0 = \beta^{-1}|g|$, so that $\|f_0\|_{L^p} = \|g_0\|_{L^p} = 1$. Let $\lambda = \alpha/(\alpha + \beta)$, so that $1 - \lambda = \beta/(\alpha + \beta)$. Compute

$$\begin{aligned}
|f(x) + g(x)|^p &\le (|f(x)| + |g(x)|)^p \\
&= [\alpha f_0(x) + \beta g_0(x)]^p \\
&= (\alpha + \beta)^p [\lambda f_0(x) + (1 - \lambda) g_0(x)]^p \\
&\le (\alpha + \beta)^p [\lambda f_0(x)^p + (1 - \lambda) g_0(x)^p]
\end{aligned}$$

by the convexity of $\varphi(t) = t^p$. Recall that $p \in (1, \infty)$ implies this last inequality is strict unless $f_0(x) = g_0(x)$.

For $z \in \mathbb{C}$, define $\operatorname{sgn} 0 = 0$ and $\operatorname{sgn} z = z/|z|$ otherwise. Also define $\operatorname{sgn}(\infty) = \operatorname{sgn}(-\infty) = 0$.
For $f(x), g(x)$ finite and nonzero, we see $|f(x) + g(x)|^p = (|f(x)| + |g(x)|)^p$ if and only if $\operatorname{sgn} f(x) = \operatorname{sgn} g(x)$. Thus, when $f(x)$ and $g(x)$ are finite, by considering various cases, we find

$$|f(x) + g(x)|^p \le (\alpha + \beta)^p [\lambda f_0(x)^p + (1 - \lambda) g_0(x)^p] \tag{3}$$

with equality if and only if $\alpha^{-1} f(x) = \beta^{-1} g(x)$.

Integrating both sides of (3) gives

$$\|f + g\|_{L^p}^p \le (\alpha + \beta)^p [\lambda \|f_0\|_{L^p}^p + (1 - \lambda) \|g_0\|_{L^p}^p] = (\alpha + \beta)^p = (\|f\|_{L^p} + \|g\|_{L^p})^p.$$

Therefore $\|f + g\|_{L^p} \le \|f\|_{L^p} + \|g\|_{L^p}$ for $f, g \in L^p$. Moreover, if there is equality, then

$$\int_X \Big( |f(x) + g(x)|^p - (\alpha + \beta)^p [\lambda f_0(x)^p + (1 - \lambda) g_0(x)^p] \Big) \, d\mu = 0,$$

and the integrand is nonpositive almost everywhere by (3). Therefore, the integrand must vanish almost everywhere, and thus $\alpha^{-1} f(x) = \beta^{-1} g(x)$ for almost every $x \in X$.

Finally, the remaining case in which $f, g$ are nonnegative and $\|f\|_{L^p}$ or $\|g\|_{L^p}$ is infinite is trivial.

Let $(V, \| \cdot \|)$ be a normed linear space. In other words, $V$ is a vector space over $\mathbb{R}$ or $\mathbb{C}$ equipped with a norm $\| \cdot \|$. A series $\sum_{n=1}^\infty v_n$ for $v_n \in V$ is convergent if the partial sums converge to a limit in $V$. The series is said to be absolutely convergent if $\sum_{n=1}^\infty \|v_n\| < \infty$.

Proposition 11. Let $V$ be a vector space over the field $\mathbb{R}$ or $\mathbb{C}$ equipped with a norm $\| \cdot \|$. Consider the metric on $V$ with the distance function $\|x - y\|$. Then $V$ is complete if and only if every absolutely convergent series in $V$ is convergent.

Proof. First of all assume $V$ is complete. Let $\sum_{n=1}^\infty v_n$ be an absolutely convergent series. Let $s_n = \sum_{j=1}^n v_j$ be the partial sum. Then if $m > n$, $s_m - s_n = \sum_{j=n+1}^m v_j$ and

$$\|s_m - s_n\| = \Big\| \sum_{j=n+1}^m v_j \Big\| \le \sum_{j=n+1}^m \|v_j\| \le \sum_{j=n+1}^\infty \|v_j\|. \tag{4}$$

But now since $\sum_{n=1}^\infty v_n$ is absolutely convergent, the sum $\sum_{n=1}^\infty \|v_n\|$ converges, and so the tail of the series $\sum_{j=n+1}^\infty \|v_j\| \to 0$ as $n \to \infty$. In other words, for every $\epsilon > 0$, there is an $N$ so that if $n \ge N$, then $\sum_{j=n+1}^\infty \|v_j\| \le \epsilon$. Then (4) shows the sequence of partial sums $s_n$ is a Cauchy sequence.
Since $V$ is complete, it has a limit $s \in V$, which is the sum of the series.

On the other hand, assume every absolutely convergent series in $V$ is convergent. Let $w_n$ be a Cauchy sequence. Define a subsequence $w_{n_k}$ as follows. For each $k \ge 1$, there is an $N_k$ so that if $n, m \ge N_k$, then $\|w_n - w_m\| \le \frac{1}{2^k}$. Let $n_1 = N_1$, and define $n_k$ recursively as $n_k = \max\{n_{k-1} + 1, N_k\}$. By induction, $w_{n_k}$ is a subsequence of $w_n$ so that $\|w_{n_k} - w_{n_{k+1}}\| \le \frac{1}{2^k}$ for all $k$. Now let $v_1 = w_{n_1}$ and $v_k = w_{n_k} - w_{n_{k-1}}$ for $k \ge 2$. By construction, $\|v_k\| \le \frac{1}{2^{k-1}}$ for $k \ge 2$, and so

$$\sum_{k=1}^\infty \|v_k\| \le \|w_{n_1}\| + \sum_{k=2}^\infty \frac{1}{2^{k-1}} = \|w_{n_1}\| + 1 < \infty.$$

Therefore, $\sum_{k=1}^\infty v_k$ is absolutely convergent, and thus is convergent to a sum $s$ by our assumption.

Now we show $w_n \to s$. Let $\epsilon > 0$. Note the partial sum $\sum_{j=1}^k v_j = w_{n_k}$, and so $w_{n_k} \to s$ as $k \to \infty$. So there is a $K$ so that if $k \ge K$, then $\|w_{n_k} - s\| \le \frac{\epsilon}{2}$. Since $w_n$ is Cauchy, there is an $N$ so that if $n, m \ge N$, then $\|w_n - w_m\| \le \frac{\epsilon}{2}$. So choose $L \ge K$ so that $n_L \ge N$, and then for $n \ge n_L$, we have

$$\|w_n - s\| \le \|w_n - w_{n_L}\| + \|w_{n_L} - s\| \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

So $w_n \to s$.

Theorem 2. For $p \in [1, \infty]$, $L^p(X)$ is a Banach space.

Proof. We have already addressed the case of $p = \infty$. Thus we may assume that $p \in [1, \infty)$. We have also proved above that $\| \cdot \|_{L^p}$ is a norm. Thus we only need to prove $L^p(X)$ is complete. We will use the previous proposition to show that absolutely convergent series in $L^p(X)$ are convergent.

Let $f_n \in L^p(X)$ be the terms of an absolutely convergent series, so that $\sum_{n=1}^\infty \|f_n\|_{L^p} = M < \infty$. Define $g_n(x) = \sum_{k=1}^n |f_k(x)|$. By Minkowski's Inequality,

$$\|g_n\|_{L^p} \le \sum_{k=1}^n \|f_k\|_{L^p} \le \sum_{k=1}^\infty \|f_k\|_{L^p} = M.$$

Since $g_n(x)$ is increasing in $n$ for each $x$, it converges to a limit $g(x)$ as $n \to \infty$ (where $g$ may take the value $\infty$). Moreover, $g_n(x)^p \to g(x)^p$. By Fatou's Lemma, we see

$$\int_X g^p \le \liminf_{n \to \infty} \int_X g_n^p = \liminf_{n \to \infty} \|g_n\|_{L^p}^p \le M^p.$$

So $g^p$ is integrable, and $g(x)$ is finite for almost every $x$.

For $x$ with $g(x) < \infty$, $\sum_{n=1}^\infty f_n(x)$ is absolutely convergent, and thus is convergent in $\mathbb{R}$ (or $\mathbb{C}$).
So for almost every $x$, we define $s(x) = \sum_{n=1}^\infty f_n(x)$, and let $s_n(x)$ be the corresponding partial sum. Note $|s_n(x)| \le g(x)$ implies $|s(x)| \le g(x)$ and thus $s \in L^p(X)$. This implies $|s_n(x) - s(x)|^p \le 2^p g(x)^p$, and $2^p g^p$ is integrable. Therefore, the Dominated Convergence Theorem applies, and since $|s_n(x) - s(x)|^p \to 0$ almost everywhere,

$$\|s_n - s\|_{L^p}^p = \int_X |s_n - s|^p \to \int_X 0 = 0.$$

Thus the sum $\sum_{n=1}^\infty f_n$ converges to $s \in L^p(X)$.

Theorem 3. Let $p \in [1, \infty)$, and consider $\mathbb{R}^d$ with Lebesgue measure. Then the following sets of functions are dense in $L^p(\mathbb{R}^d)$:
• Simple functions.
• Step functions.
• Continuous functions with compact support.

The proof is very similar to the case $p = 1$.

3. Hölder's Inequality

For $p \in [1, \infty]$, the conjugate exponent is defined to be $q$ so that $\frac{1}{p} + \frac{1}{q} = 1$. We consider $1, \infty$ to be conjugate exponents.

Theorem 4 (Hölder's Inequality). Let $p, q$ be conjugate exponents. Let $f \in L^p(X)$ and $g \in L^q(X)$. Then

$$\|fg\|_{L^1} = \int_X |fg| \le \|f\|_{L^p} \cdot \|g\|_{L^q}. \tag{5}$$

Moreover, if $p \in (1, \infty)$, equality holds in (5) if and only if there are constants $\alpha, \beta$, not both zero, so that $\alpha |f|^p = \beta |g|^q$ almost everywhere. More generally, if $f, g$ are nonnegative measurable functions, then (5) holds.

Proof. First of all, if $\|f\|_{L^p} = 0$, then $f = 0$ a.e. and the result is trivial. The same is true if $\|g\|_{L^q} = 0$.

If $p = 1$ and $q = \infty$, then $|g(x)| \le \|g\|_{L^\infty}$ for almost all $x$. Therefore,

$$\int_X |fg| \le \|g\|_{L^\infty} \int_X |f| = \|f\|_{L^1} \cdot \|g\|_{L^\infty}.$$

The same is true if $p = \infty$ and $q = 1$.

Thus we assume $p, q \in (1, \infty)$. We may assume $\alpha = \|f\|_{L^p}$ and $\beta = \|g\|_{L^q}$ are positive. Let $f_0 = \alpha^{-1}|f|$ and $g_0 = \beta^{-1}|g|$. Since $\frac{1}{p} + \frac{1}{q} = 1$, the convexity of the exponential function implies that

$$e^{\frac{s}{p} + \frac{t}{q}} \le p^{-1} e^s + q^{-1} e^t.$$

Now for $x$ so that $f_0(x), g_0(x) \in (0, \infty)$, define $s, t$ by $f_0(x) = \exp(\frac{s}{p})$ and $g_0(x) = \exp(\frac{t}{q})$. Therefore,

$$f_0(x) g_0(x) \le p^{-1} f_0(x)^p + q^{-1} g_0(x)^q \tag{6}$$

for every $x \in X$. (The cases where $f_0(x), g_0(x)$ are $0$ or $\infty$ are easy to analyze.)
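
As a quick sanity check of (6), here is a worked instance written as a LaTeX sketch. The symbols $a$ and $b$ stand for the pointwise values $f_0(x)$ and $g_0(x)$; they are shorthand introduced only for this illustration, and the particular exponents and numbers are arbitrary choices.

```latex
% Inequality (6) with a = f_0(x) >= 0 and b = g_0(x) >= 0 (illustrative shorthand).
% Case p = q = 2: the inequality reduces to a perfect square.
\[
  ab \le \tfrac12 a^2 + \tfrac12 b^2
  \quad\Longleftrightarrow\quad
  0 \le \tfrac12 (a - b)^2 ,
\]
% with equality exactly when a = b, that is, when a^p = b^q.
% Case p = 3, q = 3/2 (so 1/p + 1/q = 1/3 + 2/3 = 1), with a = 2 and b = 1:
\[
  ab = 2 \;\le\; \tfrac13 \cdot 2^3 + \tfrac23 \cdot 1^{3/2}
     = \tfrac83 + \tfrac23 = \tfrac{10}{3} .
\]
```

In the second case the inequality is strict, which is consistent with $a^p = 8 \ne 1 = b^q$.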
Moreover, the strict convexity of the exponential function implies that if there is equality in (6), then $s = t$, which implies $f_0(x)^p = g_0(x)^q$, at least in the case when $f_0(x), g_0(x)$ are both finite. Now integrate (6) to see

$$\int_X f_0 g_0 \le p^{-1} \int_X f_0^p + q^{-1} \int_X g_0^q = p^{-1} + q^{-1} = 1.$$

Then the definitions of $f_0, g_0$ imply (5).

If Hölder's Inequality is an equality, then

$$\int_X \Big( f_0 g_0 - (p^{-1} f_0^p + q^{-1} g_0^q) \Big) = 0,$$

while the integrand is nonpositive by (6). Thus we have $f_0(x) g_0(x) = p^{-1} f_0(x)^p + q^{-1} g_0(x)^q$ for almost all $x \in X$. This implies $f_0(x)^p = g_0(x)^q$ for almost all $x$.

One remaining case is that of $f, g \ge 0$ but $\|f\|_{L^p} = \infty$. The inequality is trivially true here. The last remaining case, of $\|g\|_{L^q} = \infty$, is handled the same way.

4. Jensen's Inequality

A measure $\mu$ on a $\sigma$-algebra $\mathcal{M}$ on a set $X$ is called a probability measure if $\mu(X) = 1$.

Proposition 12. Let $(X, \mathcal{M}, \mu)$ be a measure space. Let $f$ be a nonnegative measurable function on $X$. For every $E \in \mathcal{M}$, define $\nu(E) = \int_E f \, d\mu$. Then $\nu$ is a measure on $\mathcal{M}$.

Proof. We need to check countable additivity. So let $E_j$ be a countable disjoint collection of measurable sets. Then

$$\nu\Big( \bigcup_{j=1}^\infty E_j \Big) = \int_X f \cdot \chi_{\bigcup_{j=1}^\infty E_j} \, d\mu = \int_X \sum_{j=1}^\infty f \cdot \chi_{E_j} \, d\mu = \sum_{j=1}^\infty \int_X f \cdot \chi_{E_j} \, d\mu = \sum_{j=1}^\infty \nu(E_j).$$

Here the second equality is by the assumption that the $E_j$ are disjoint, while the third follows from the Monotone Convergence Theorem, since $f \cdot \chi_{E_j} \ge 0$.

This proposition shows how to produce a probability measure from any measure space together with a measurable nonnegative function with integral 1.

Theorem 5 (Jensen's Inequality). Let $(X, \mathcal{M}, \mu)$ be a probability measure space. Let $g$ be an integrable function on $X$ with range in an interval $I \subset \mathbb{R}$. Let $\varphi : I \to \mathbb{R}$ be convex. Then

$$\int_X \varphi \circ g \, d\mu \ge \varphi\Big( \int_X g \, d\mu \Big).$$

Proof. Let $\alpha = \int_X g \, d\mu$. Then we claim $\alpha \in I$. To prove the claim, consider $b = \sup I$. If $b = \infty$, then $\alpha < b$ since $g$ is integrable. On the other hand, if $b$ is finite, then

$$\alpha = \int_X g \, d\mu \le \int_X b \, d\mu = b \, \mu(X) = b.$$

A similar analysis applies to $\inf I$, and this implies $\alpha \in \bar{I}$, the closure of $I$. Moreover, if $b$ is an endpoint of $I$, then $\alpha \ne b$ unless $g(x) = b$ for almost every $x \in X$. (Why?)

Thus there are two cases. In the trivial case, $g(x) = b$ for almost every $x$, where $b$ is an endpoint of $I$. In this case,

$$\int_X \varphi \circ g \, d\mu = \int_X \varphi(b) \, d\mu = \varphi(b) = \varphi\Big( \int_X g \, d\mu \Big).$$

Otherwise, $\alpha \in I^\circ$. By Proposition 5, we may choose $\ell(x) = \varphi(\alpha) + m(x - \alpha) \le \varphi(x)$ for all $x \in I$. Therefore,

$$\int_X \varphi \circ g \, d\mu \ge \int_X \big[ \varphi(\alpha) + m(g - \alpha) \big] \, d\mu = \varphi(\alpha) = \varphi\Big( \int_X g \, d\mu \Big).$$

Corollary 13. On a measure space, let $f$ be a positive measurable function with integral 1, and let $g$ and $\varphi$ satisfy the hypotheses of Jensen's Inequality with respect to the probability measure $f \, d\mu$. Then

$$\int_X (\varphi \circ g) f \, d\mu \ge \varphi\Big( \int_X g f \, d\mu \Big).$$
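
To see Jensen's Inequality in the simplest possible setting, the following LaTeX sketch works it out on a two-point probability space. The space $X = \{1, 2\}$, the equal weights, the values of $g$, and the choices of $\varphi$ are assumptions made purely for illustration and are not part of the notes.

```latex
% Illustrative two-point probability space: \mu(\{1\}) = \mu(\{2\}) = 1/2.
% Take g(1) = 1, g(2) = 3 and the convex function \varphi(t) = t^2.
\[
  \int_X \varphi \circ g \, d\mu = \tfrac12 \cdot 1^2 + \tfrac12 \cdot 3^2 = 5,
  \qquad
  \varphi\Big( \int_X g \, d\mu \Big) = \Big( \tfrac12 \cdot 1 + \tfrac12 \cdot 3 \Big)^2 = 4 ,
\]
% so Jensen's Inequality reads 5 >= 4 in this instance.
% Same space with \varphi = \exp and g(1) = \log x, g(2) = \log y for x, y > 0:
\[
  \sqrt{xy} = \exp\Big( \tfrac12 \log x + \tfrac12 \log y \Big)
  \;\le\; \tfrac12 x + \tfrac12 y ,
\]
% which is the arithmetic-geometric mean inequality for two numbers.
```

The second computation suggests how Jensen's Inequality with the exponential function recovers classical mean inequalities once a suitable probability measure is chosen.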