Real Analysis Course Notes: Measure, Integration, Differentiation, and Banach Spaces, Exams of Calculus


Typology: Exams

2022/2023

Uploaded on 05/11/2023

thecoral
Real Analysis

Course Notes

Contents

1 Measure, integration and differentiation on R
1.1 Real numbers, topology, logic
1.2 Lebesgue measurable sets and functions
1.3 Integration
2 Differentiation and Integration
3 The Classical Banach Spaces
4 Baire Category
5 Topology
6 Banach Spaces
7 Hilbert space
8 General Measure Theory

1 Measure, integration and differentiation on R

Motivation. Suppose f : [0, π] → R is a reasonable function. We define the Fourier coefficients of f by

$$a_n = \frac{2}{\pi} \int_0^{\pi} f(x) \sin(nx) \, dx.$$

Here the factor of 2/π is chosen so that

$$\frac{2}{\pi} \int_0^{\pi} \sin(nx) \sin(mx) \, dx = \delta_{nm}.$$

We observe that if

$$f(x) = \sum_{n=1}^{\infty} b_n \sin(nx),$$

then at least formally $a_n = b_n$ (this is true, for example, for a finite sum). This representation of $f(x)$ as a superposition of sines is very useful for applications. For example, $f(x)$ can be thought of as a sound wave, where $a_n$ measures the strength of the frequency $n$.
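As a numerical sanity check of the orthogonality relation, the following Python sketch approximates the coefficients $a_n$ by the midpoint rule for a finite sine sum. The function names, the sample count, and the particular coefficients $b_1 = 3$, $b_2 = -1$, $b_5 = 0.5$ are illustrative assumptions and do not come from the notes.

```python
import math

def sine_coefficient(f, n, samples=2000):
    """Approximate a_n = (2/pi) * integral_0^pi f(x) sin(n x) dx by the midpoint rule."""
    h = math.pi / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h  # midpoint of the j-th subinterval
        total += f(x) * math.sin(n * x)
    return (2.0 / math.pi) * total * h

def finite_sine_sum(x):
    # Hypothetical test signal with b_1 = 3, b_2 = -1, b_5 = 0.5
    return 3.0 * math.sin(x) - math.sin(2 * x) + 0.5 * math.sin(5 * x)

coefficients = [sine_coefficient(finite_sine_sum, n) for n in range(1, 7)]
print([round(c, 6) for c in coefficients])
```

The recovered values should agree with the chosen $b_n$ up to rounding, in line with the remark that $a_n = b_n$ for a finite sum.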

Now what coefficients $a_n$ can occur? The orthogonality relation implies that

$$\frac{2}{\pi} \int_0^{\pi} |f(x)|^2 \, dx = \sum_{n=1}^{\infty} |a_n|^2.$$

This makes it natural to ask if, conversely, for any $a_n$ such that $\sum |a_n|^2 < \infty$, there exists a function $f$ with these Fourier coefficients. The natural function to try is $f(x) = \sum a_n \sin(nx)$. But why should this sum even exist? The functions $\sin(nx)$ are only bounded by one, and $\sum |a_n|^2 < \infty$ is much weaker than $\sum |a_n| < \infty$.

One of the original motivations for the theory of Lebesgue measure and integration was to refine the notion of function so that this sum really does exist. The resulting function $f(x)$, however, need not be Riemann integrable! To get a reasonable theory that includes such Fourier series, Cantor, Dedekind, Fourier, Lebesgue, etc. were led inexorably to a re-examination of the foundations of real analysis and of mathematics itself. The theory that emerged will be the subject of this course.

Here are a few additional points about this example.

First, we could try to define the required space of functions, called $L^2[0, \pi]$, to simply be the metric completion of, say, $C[0, \pi]$ with respect to $d(f, g) = \left(\int |f - g|^2\right)^{1/2}$. The reals are defined from the rationals in a similar fashion. But the question would still remain: can the limiting objects be thought of as functions?

Second, the set of points $E \subset \mathbb{R}$ where $\sum a_n \sin(nx)$ actually converges is liable to be a very complicated set: not closed or open, or even a countable union or intersection of sets of this form. Thus to even begin, we must have a good understanding of subsets of $\mathbb{R}$.

Finally, even if the limiting function $f(x)$ exists, it will generally not be Riemann integrable. Thus we must broaden our theory of integration to deal with such functions. It turns out this is related to the second point: we must find a good notion for the length or measure $m(E)$ of a fairly general subset $E \subset \mathbb{R}$, since $m(E) = \int \chi_E$.

1.1 Real numbers, topology, logic

The real numbers. Conway: Construction of the real numbers [Con, p.25]. Dedekind: just as a prime is characterized by the ideal of things it divides, so a number is characterized by the things less than it. Brouwer and Euclid: the continuum is not a union of points!

is bounded iff x ∈ Q. A condensation point of E ⊂ R is a point x ∈ R such that every neighborhood of x meets E in an uncountable set. In other words, the condensation points are the points where E is ‘locally uncountable’.

Theorem 1.1 Any uncountable set contains an uncountable collection of condensation points.

The same holds true in any complete, separable metric space. Thus only countably many Y’s can be embedded disjointly in R^2, and only countably many Möbius bands in R^3. Any closed uncountable set F has the cardinality of the continuum. In fact it contains a copy of the Cantor set. (Proof: pick two condensation points, and then two disjoint closed intervals around them. Within each interval, pick two disjoint subintervals containing condensation points, and continue. By ensuring that the lengths of the intervals tend to zero we get a Cantor set.)

How many open sets? Theorem. The set of all open subsets of R is of the same cardinality as R itself. Indeed, the same is true of the set of all Borel sets.

1.2 Lebesgue measurable sets and functions

On $\mathbb{R}$ we will construct a σ-algebra $\mathcal{M}$ containing the Borel sets, and a measure $m : \mathcal{M} \to [0, \infty]$, such that $m(a, b) = b - a$, $m$ is translation-invariant, and $m$ is countably additive.

Definition: the outer measure $m^*(E)$ is the infimum of $\sum \ell(I_i)$ over all coverings $E \subset \bigcup I_i$ by countable unions of intervals.

Basic fact: subadditivity. For any countable collection of sets $A_i$, $m^*(\bigcup A_i) \le \sum m^*(A_i)$.

Basic fact: $m^*[a, b] = b - a$.

Proof. Clearly the outer measure is at most $b - a$. But if $[a, b]$ is covered by $\bigcup I_k$, by compactness we can assume the union is finite, and then

$$b - a = \int \chi_{[a,b]} \le \int \sum \chi_{I_k} = \sum |I_k|.$$
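In contrast with intervals, any countable set $\{x_1, x_2, \ldots\}$ has outer measure zero: cover $x_i$ by an interval of length $\epsilon / 2^i$, so the total length of the cover is less than $\epsilon$. The Python sketch below simply tabulates these lengths with exact rational arithmetic; the function name and the choices of 50 points and $\epsilon = 1/100$ are illustrative assumptions.

```python
from fractions import Fraction

def cover_lengths(num_points, epsilon):
    """Lengths of intervals I_i of length epsilon / 2**i, one for each point x_i, i = 1..num_points."""
    return [Fraction(epsilon) / 2**i for i in range(1, num_points + 1)]

epsilon = Fraction(1, 100)
lengths = cover_lengths(50, epsilon)
total = sum(lengths)  # geometric sum: epsilon * (1 - 2**-50)
print(float(total), total < epsilon)
```

However many points are covered, the total stays below $\epsilon$, which is why the infimum over all covers is zero.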

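Definition: $E \subset \mathbb{R}$ is measurable if

$$m^*(E \cap A) + m^*(\tilde{E} \cap A) = m^*(A)$$

for all sets $A \subset \mathbb{R}$, where $\tilde{E}$ denotes the complement of $E$. Because of subadditivity, only the inequality $\le$ needs to be checked. For example, if $E \subset [0, 1]$ is measurable then $m^*(E) + m^*([0, 1] - E) = 1$.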

Theorem 1.2 E = [a, ∞) is measurable.

Proof. From a good cover $\bigcup I_i$ for $A$ we must construct good covers for $E \cap A$ and $\tilde{E} \cap A$. This is easy because $E$ cuts each interval $I_i$ into two subintervals whose lengths add to that of $I_i$.

Theorem 1.3 The measurable sets form an algebra.

Proof. Closure under complements is by definition. Now suppose $E$ and $F$ are measurable, and we want to show $E \cup F$ is (closure under intersections then follows by taking complements). By the definition of measurability, $E$ cuts $A$ into two sets whose measures add up. Now $F$ cuts $E \cap A$ into two sets whose measures add up, and similarly for the complements. Thus $E$ and $F$ cut $A$ into 4 sets whose measures add up to the outer measure of $A$. Assembling 3 of these to form $A \cap (E \cup F)$ and the remaining one to form $A \cap \widetilde{E \cup F}$, we see $E \cup F$ is measurable.

Theorem 1.4 If $E_i$ are disjoint and measurable, $i = 1, 2, \ldots, N$, then $\sum m^*(E_i \cap A) = m^*(A \cap \bigcup E_i)$.

Proof. By induction, the case N = 1 being the definition of measurability.

Theorem 1.5 The measurable sets form a σ-algebra.

Proof. Suppose $E_i$ is a sequence of measurable sets; we want to show $\bigcup E_i$ is measurable. Since we already have an algebra, we can assume the $E_i$ are disjoint. By the preceding theorem, we have for any finite $N$,

$$\sum_{1}^{N} m^*(E_i \cap A) + m^*\Bigl(A \cap \bigcap_{1}^{N} \tilde{E}_i\Bigr) = m^*(A).$$

The second term is only smaller for an infinite intersection, so letting $N \to \infty$ we get

$$\sum_{1}^{\infty} m^*(E_i \cap A) + m^*\Bigl(A \cap \bigcap_{1}^{\infty} \tilde{E}_i\Bigr) \le m^*(A).$$

Now the first term dominates $m^*(A \cap \bigcup E_i)$ so we are done.

Measurable functions. A function $f : \mathbb{R} \to \mathbb{R}$ is measurable if $f^{-1}(U)$ is measurable whenever $U$ is an open set.

First examples: continuous functions, step functions ($\sum_1^N a_i \chi_{I_i}$, $I_i$ disjoint intervals) and simple functions ($\sum_1^N a_i \chi_{E_i}$, $E_i$ disjoint measurable sets) are all measurable. Note that simple functions are exactly the measurable functions taking only finitely many values.

In general, if $f : A \to B$ is any map, the map $f^{-1} : \mathcal{P}(B) \to \mathcal{P}(A)$ is a σ-algebra homomorphism; indeed it preserves unions over any index set.

Thus measurability of $f$ is equivalent to: (a) $f^{-1}(x, \infty)$ is measurable for all $x \in \mathbb{R}$; or (b) $f^{-1}(B)$ is measurable for any Borel set $B$.

Warning. It is not true that $f^{-1}(M)$ is measurable whenever $M$ is measurable! Thus measurable functions are not closed under composition.

More generally, for a topological space $X$ we say $f : \mathbb{R} \to X$ is measurable if the preimages of open sets are measurable.

Example: if $f, g$ are measurable functions, then $h = (f, g) : \mathbb{R} \to \mathbb{R}^2$ is measurable. Indeed, for any open rectangle $U \times V \subset \mathbb{R}^2$, the preimage $h^{-1}(U \times V) = f^{-1}(U) \cap g^{-1}(V)$ is measurable. Since every open set in $\mathbb{R}^2$ is a union of a countable number of open rectangles, $h$ is measurable.

Similarly, if $h : \mathbb{R}^2 \to \mathbb{R}$ is continuous, then $h(f, g)$ is measurable whenever $f$ and $g$ are. This shows the measurable functions form an algebra: $fg$ and $f + g$ are measurable if $f$ and $g$ are.

Moreover, the measurable functions are closed under limits. Indeed, if $f = \lim f_n$ then

$$f^{-1}(a, \infty) = \{x : \exists k \, \exists N \, \forall n \ge N \; f_n(x) > a + 1/k\} = \bigcup_k \bigcup_N \bigcap_{n \ge N} f_n^{-1}(a + 1/k, \infty).$$

Similarly for lim sup, lim inf etc. If f = g a.e. and f is measurable then so is g.

Theorem 1.8 (Littlewood’s second principle) If $f$ is measurable on $[a, b]$ then $f$ is the limit in measure of continuous functions: there exist continuous $f_n$ such that for all $\epsilon > 0$, $m\{|f - f_n| > \epsilon\} \to 0$.

Proof. Let $E_M = \{|f| > M\}$; then $\bigcap E_M = \emptyset$, so after truncating $f$ on a set of small measure we obtain $f_1$ bounded by $M$. Cutting $[-M, M]$ into finitely many disjoint intervals of length $\epsilon$, and collecting together the values, we see $f_1$ is a uniform limit of simple functions. Any simple function is built from indicator functions $\chi_E$ of measurable sets. By Littlewood’s first principle, $\chi_E$ is approximated in measure by $\chi_J$, where $J$ is a finite union of intervals. Finally $\chi_J$ is a limit in measure of continuous functions.

Theorem 1.9 (Lusin’s Theorem; Littlewood’s 2nd principle) Given a measurable function f on [0, 1], one can find a continuous function g : [0, 1] → R such that g = f outside a set of small measure.

Theorem 1.10 (Egoroff; Littlewood’s 3rd principle) Let $f(x) = \lim f_n(x)$ for each $x \in [0, 1]$, where $f_n, f$ are measurable. Then $f_n \to f$ uniformly outside a set of small measure.

Example: Recall the ‘tent functions’ $f_n$ supported on $[0, 1/n]$ with a triangular graph of height $n$. We have $f_n \to 0$ pointwise but $\int f_n = 1/2$ for every $n$; these $f_n$ do not converge uniformly everywhere.

Proof of the Theorem. For any $k > 0$, consider the sets

$$E_N = \{x : |f_n(x) - f(x)| > 1/k \text{ for some } n > N\}.$$

Since $f_n \to f$, we have $\bigcap E_N = \emptyset$. Since $E_N \subset [0, 1]$, we have $m(E_N) \to 0$. Thus there is an $N(k)$ such that $m(E_{N(k)})$ is as small as we like, say less than $2^{-k}\epsilon$. Let $A = \bigcup_k E_{N(k)}$. Then for $x$ outside $A$, we have $|f_n(x) - f(x)| \le 1/k$ for all $n > N(k)$, and therefore $\sup_{x \notin A} |f_n(x) - f(x)| \to 0$. In other words, $f_n \to f$ uniformly outside the set $A$; and $m(A) \le \epsilon$.
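A concrete instance of Egoroff’s theorem, chosen here for illustration and not taken from the notes, is $f_n(x) = x^n$ on $[0, 1]$, whose pointwise limit is 0 on $[0, 1)$. The convergence is not uniform on $[0, 1)$, but it is uniform on $[0, 1 - \delta]$ for any $\delta > 0$. The Python sketch below evaluates the relevant suprema; the function names and the value $\delta = 0.01$ are assumptions.

```python
def sup_error_outside(delta, n):
    """sup of |x**n - 0| over [0, 1 - delta]; x**n is increasing, so use the right endpoint."""
    return (1.0 - delta) ** n

def near_one_value(n):
    """Value of x**n at x = 1 - 1/n**2, a point of [0, 1) where f_n is still close to 1."""
    return (1.0 - 1.0 / n**2) ** n

delta = 0.01
errors = [sup_error_outside(delta, n) for n in (10, 100, 1000, 10000)]
print(errors)               # decreases to 0: uniform convergence off a set of measure delta
print(near_one_value(1000)) # stays near 1: no uniform convergence on all of [0, 1)
```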

Finitely-additive measures on N. The natural numbers admit a finitely-additive measure defined on all subsets, and vanishing on finite sets. (Such a measure cannot be countably additive.) This construction gives a ‘positive’ use of the Axiom of Choice, to construct a measure rather than to construct a non-measurable set.

Filter: $\mathcal{F} \subset \mathcal{P}(X)$ such that sets in $\mathcal{F}$ are ‘big’:

(1) $\emptyset \notin \mathcal{F}$, (2) $A \in \mathcal{F}$, $A \subset B \implies B \in \mathcal{F}$; and (3) $A, B \in \mathcal{F} \implies A \cap B \in \mathcal{F}$.

Example: the cofinite filter (if $X$ is infinite). Example: the ‘principal’ ultrafilter $\mathcal{F}_x$ of all sets $F$ with $x \in F$. It is an ultrafilter, meaning that if $X = A \sqcup B$ then $A$ or $B$ is in $\mathcal{F}$.
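On a small finite set the filter conditions can be checked by brute force. The Python sketch below does this, reading the conditions as: the empty set is excluded, supersets of members are members, and intersections of members are members. The function names and the four-point universe are illustrative assumptions; on a finite set the cofinite filter is not available, so only principal-type examples appear.

```python
from itertools import combinations

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_filter(family, universe):
    """Check the three filter conditions for a family of subsets of a finite universe."""
    family = set(family)
    if frozenset() in family:
        return False
    for a in family:
        for b in powerset(universe):
            if a <= b and b not in family:  # must contain every superset of a member
                return False
        for b in family:
            if (a & b) not in family:       # must be closed under intersection
                return False
    return True

def is_ultrafilter(family, universe):
    """A filter such that, for every A, exactly one of A and its complement belongs."""
    family = set(family)
    full = frozenset(universe)
    return is_filter(family, universe) and all(
        (a in family) != ((full - a) in family) for a in powerset(universe)
    )

X = {0, 1, 2, 3}
principal = [s for s in powerset(X) if 2 in s]            # all sets containing the point 2
contains_pair = [s for s in powerset(X) if {0, 1} <= s]   # all sets containing {0, 1}
print(is_ultrafilter(principal, X), is_filter(contains_pair, X), is_ultrafilter(contains_pair, X))
```

The second family is a filter but not an ultrafilter: neither {0} nor its complement contains {0, 1}.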

Theorem 1.11 Any filter is contained in an ultrafilter.

enumerate the nonzero values of φ and Ei = {φ = ai} are disjoint sets. The simple functions form a vector space.

Simple integration. For a simple function supported on a set of finite measure, we define

$$\int \phi = \int \sum a_i \chi_{E_i} = \sum a_i m(E_i).$$

We also define $\int_E \phi = \int \phi \chi_E$. Example: $\int \chi_{\mathbb{Q}} = 0$.

Theorem 1.12 Integration is linear on the vector space of simple functions.

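Proof. Clearly $\int a\phi = a \int \phi$. We must prove $\int (\phi + \psi) = \int \phi + \int \psi$. First note that for any representation of $\phi$ as $\sum b_i \chi_{F_i}$ with the sets $F_i$ disjoint, we have $\int \phi = \sum b_i m(F_i)$. Indeed,

$$\int \sum_i b_i \chi_{F_i} = \int \sum_j a_j \chi_{\bigcup_{b_i = a_j} F_i} = \sum_j a_j \sum_{b_i = a_j} m(F_i) = \sum_i b_i m(F_i).$$

Now take the finite collection of sets $F_i$ on which $\phi$ and $\psi$ are both constant, and write $\phi = \sum a_i \chi_{F_i}$ and $\psi = \sum b_i \chi_{F_i}$. Then

$$\int (\phi + \psi) = \sum (a_i + b_i) m(F_i) = \int \phi + \int \psi.$$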
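The common-refinement step in this proof can be sketched in Python for the special case where the sets are half-open intervals (so the simple functions are step functions). This restriction, the triple representation `(value, left, right)`, and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def integrate(pieces):
    """Integral of sum a_i * chi_[l_i, r_i), given as (a_i, l_i, r_i) triples with disjoint intervals."""
    return sum(a * (r - l) for a, l, r in pieces)

def value_at(pieces, x):
    for a, l, r in pieces:
        if l <= x < r:
            return a
    return Fraction(0)

def add(phi, psi):
    """phi + psi written on the common refinement, where both functions are constant."""
    cuts = sorted({p for _, l, r in phi + psi for p in (l, r)})
    result = []
    for l, r in zip(cuts, cuts[1:]):
        total = value_at(phi, l) + value_at(psi, l)  # both are constant on [l, r)
        if total != 0:
            result.append((total, l, r))
    return result

phi = [(Fraction(2), Fraction(0), Fraction(1)), (Fraction(-1), Fraction(1), Fraction(3))]
psi = [(Fraction(5), Fraction(1, 2), Fraction(2))]
print(integrate(add(phi, psi)), integrate(phi) + integrate(psi))
```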

The Lebesgue integral. Now let $E$ be a set of finite measure, let $f : E \to \mathbb{R}$ be a function and assume $|f| \le M$. We define the Lebesgue integral by

$$\int_E f = \inf_{\psi \ge f} \int_E \psi = \sup_{f \ge \phi} \int_E \phi,$$

assuming sup and inf agree. (Here $\phi$ and $\psi$ are required to be simple functions.)

Theorem 1.13 The two definitions of the integral of f above agree iff f is a measurable function.

Proof. Suppose $f$ is measurable. Since $\int \psi \ge \int \phi$, we just need to show the simple functions $\phi$ and $\psi$ can be chosen such that their integrals are arbitrarily close. To this end, cut the interval $[-M, M]$ into $N$ pieces $[a_i, a_{i+1})$ of length less than $\epsilon$. Let $E_i$ be the set on which $f(x)$ lies in $[a_i, a_{i+1})$. Then $\phi = \sum a_i \chi_{E_i}$ and $\psi = \sum a_{i+1} \chi_{E_i}$ satisfy $\phi \le f \le \psi$ and $\int (\psi - \phi) \le \epsilon\, m(E)$, so we are done.

Conversely, if the sup and inf agree, then we can choose simple functions $\phi_n \le f \le \psi_n$ such that $\int (\psi_n - \phi_n) \to 0$. Let $\phi = \sup \phi_n$ and $\psi = \inf \psi_n$. Then $\phi$ and $\psi$ are measurable, and $\phi \le f \le \psi$. We claim $\phi = \psi$ a.e. (and thus $f$ is measurable). Otherwise, there is a set of positive measure $A$ and an $\epsilon > 0$ such that $\psi - \phi > \epsilon$ on $A$. But then $\epsilon \chi_A \le \psi_n - \phi_n$ for all $n$, and thus $\int (\psi_n - \phi_n) \ge \epsilon\, m(A) > 0$, a contradiction.
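The first half of this proof partitions the range of $f$ rather than its domain. As a worked illustration, the Python sketch below carries this out for the assumed example $f(x) = x^2$ on $E = [0, 1]$, where the set on which $f$ lies in $[a_i, a_{i+1})$ is the interval $[\sqrt{a_i}, \sqrt{a_{i+1}})$. Since $f \ge 0$ here, the range $[0, 1]$ is cut instead of $[-M, M]$; the function name and the number of pieces are also assumptions.

```python
import math

def lebesgue_sums(num_pieces):
    """Integrals of the simple functions phi <= f <= psi for f(x) = x**2 on [0, 1]."""
    lower = 0.0
    upper = 0.0
    for i in range(num_pieces):
        a_lo = i / num_pieces
        a_hi = (i + 1) / num_pieces
        measure = math.sqrt(a_hi) - math.sqrt(a_lo)  # m(E_i), where E_i = {a_lo <= f < a_hi}
        lower += a_lo * measure
        upper += a_hi * measure
    return lower, upper

lower, upper = lebesgue_sums(1000)
print(lower, upper, upper - lower)
```

The gap between the two sums is the piece length times $m(E)$, matching the bound $\int (\psi - \phi) \le \epsilon\, m(E)$, and the value $1/3$ lies between them.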

Theorem 1.14 Let $f$ be a bounded function on an interval $[a, b]$, and suppose $f$ is Riemann integrable. Then $f$ is also Lebesgue integrable, and the two integrals agree.

Proof. If $f$ is Riemann integrable then there are step functions $\phi_n \le f \le \psi_n$ with $\int (\psi_n - \phi_n) \to 0$. Since step functions are special cases of simple functions, we see $f$ is Lebesgue integrable.

It is now easy to check that the integral of bounded functions over sets of finite measure satisfies expected properties: The integral is linear. If $f \le g$ then $\int f \le \int g$. In particular $|\int f| \le \int |f|$, and if $A \le f \le B$ then $A\, m(E) \le \int_E f \le B\, m(E)$. For disjoint sets, $\int_{A \cup B} f = \int_A f + \int_B f$.

The most interesting assertion is $\int (f + g) = \int f + \int g$. If $\psi_1 \ge f$ and $\psi_2 \ge g$ then $\psi_1 + \psi_2 \ge f + g$, so by the infimum definition of the integral we get $\int (f + g) \le \int f + \int g$. To get the reverse inequality, use the supremum definition.

Theorem 1.15 (Bounded convergence) Let $f_n \to f$ (pointwise) on a set of finite measure $E$, where $|f_n|, |f| \le M$. Then $\int_E f_n \to \int_E f$.

Proof. We will use Littlewood’s 3rd Principle. Ignoring a set $A$ of small measure, the convergence is uniform. Then

$$\left| \int_{E-A} (f_n - f) \right| \le \int_{E-A} |f_n - f| \le m(E - A) \sup_{E-A} |f_n - f| \to 0.$$

The ignored set contributes at most $\int_A |f_n - f| \le 2M\, m(A)$, which is as small as we like.

Fatou’s Lemma: if $f_n \ge 0$ and $f_n \to f$ pointwise, then $\int f \le \liminf \int f_n$. Monotone Convergence: if $0 \le f_1 \le f_2 \le \ldots$ and $f_n \to f$, then $\int f = \lim \int f_n$.

Proofs: For Fatou’s lemma, let $g$ be a bounded function with bounded support such that $g \le f$ and $(\int f) - \epsilon \le \int g$. Then $g_n = \min(g, f_n) \to g$ and $g_n \le f_n$, so

$$\Bigl(\int f\Bigr) - \epsilon \le \int g = \lim \int g_n \le \liminf \int f_n.$$

Here we have used the Bounded Convergence Theorem to interchange integrals and limits. Letting $\epsilon \to 0$ gives the result.

For monotone convergence: Since $f \ge f_n$ for all $n$, we have $\int f \ge \limsup \int f_n$, while $\int f \le \liminf \int f_n$ by Fatou’s Lemma.
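The inequality in Fatou’s lemma can be strict. The tent functions of height $n$ supported on $[0, 1/n]$ converge pointwise to 0, yet each has integral $1/2$. The Python sketch below checks both facts numerically; the function names, the midpoint-rule integration, and the sample point $x = 0.3$ are illustrative assumptions. The grid is chosen so that the kinks of the tent fall on cell boundaries, where the midpoint rule is exact for piecewise linear functions.

```python
def tent(n, x):
    """Triangle of height n supported on [0, 1/n], peaking at x = 1/(2n)."""
    return n * max(0.0, 1.0 - abs(2.0 * n * x - 1.0))

def integral_on_unit_interval(n, cells_per_half=50):
    """Midpoint rule on [0, 1] with cell boundaries at the kinks 0, 1/(2n), 1/n."""
    cells = 2 * n * cells_per_half
    h = 1.0 / cells
    return sum(tent(n, (j + 0.5) * h) for j in range(cells)) * h

integrals = [integral_on_unit_interval(n) for n in (1, 2, 5, 10)]
values_at_fixed_point = [tent(n, 0.3) for n in (1, 2, 5, 10)]
print(integrals)              # each approximately 1/2
print(values_at_fixed_point)  # eventually 0 once 1/n < 0.3
```

The tent functions are not monotone in $n$, so this does not conflict with monotone convergence.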

Theorem 1.16 (Modulus of integrability) Let $f \ge 0$ be integrable. Then for any $\epsilon > 0$ there is a $\delta > 0$ such that $m(E) < \delta \implies \int_E f < \epsilon$.

Corollary 1.17 The function $F(t) = \int_{-\infty}^{t} f(x)\, dx$ is uniformly continuous on $\mathbb{R}$.

Proof of the Theorem. Let $f_M = \min(M, f)$. Then $f_M \to f$ monotonically as $M \to \infty$, and thus $\int (f - f_M) \to 0$. Choose $M$ large enough that $\int (f - f_M) < \epsilon/2$. Then for $m(E) < \delta = \epsilon/(2M)$, we have

$$\int_E f \le \int_E (f - f_M) + M\, m(E) < \epsilon.$$

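Dominated convergence. Let $f_n \to f$, with $|f_n|, |f| \le g$ and $\int g < \infty$. Then $\int f_n \to \int f$.

Proof. Given $\epsilon > 0$ there is a $\delta > 0$ such that $\int_A g < \epsilon$ whenever $m(A) < \delta$. We can also choose $M$ such that $\int g < \epsilon$ outside $[-M, M]$. Then by Littlewood’s 3rd principle, there is a set $A \subset [-M, M]$ with $m(A) < \delta$ outside of which $f_n \to f$ uniformly. Since $|f_n - f| \le 2g$, we get

$$\limsup \left| \int (f_n - f) \right| \le 2 \left( \int_{\mathbb{R} - [-M, M]} g + \int_A g \right) \le 4\epsilon.$$

Since $\epsilon$ was arbitrary, $\int f_n \to \int f$.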

Derivatives. Even if $f'(x)$ exists everywhere, the behavior of $f'(x)$ can be very wild, e.g. not integrable. For example, if $f(x)$ is any function smooth away from $x = 0$, and $|f(x)| \le |x|^2$, then $f$ is differentiable at 0; but we can make $f'(x)$ wild, e.g. look at $f(x) = x^2 \sin(e^{1/x^2})$. In particular, $f'(x)$ need not be integrable. Here is an easy theorem illustrating the preceding results.

Theorem 1.18 Suppose $f(x)$ is differentiable on $\mathbb{R}$, vanishes outside $[0, 1]$ and $|f'(x)| \le M$. Then $\int_0^t f'(x)\, dx = f(t)$.

Proof. Since $f$ is differentiable it is continuous, and $f_n(x) = n(f(x + 1/n) - f(x)) \to f'(x)$ pointwise. By the mean-value theorem, $|f_n(x)| = |f'(y)| \le M$ for some $y \in [x, x + 1/n]$. Thus $\int f_n \to \int f'$ by bounded convergence. But

$$\int_0^t f_n(x)\, dx = n \int_t^{t + 1/n} f(x)\, dx - n \int_0^{1/n} f(x)\, dx \to f(t) - f(0) = f(t)$$

by continuity of $f$, since $f(0) = 0$.

Convergence in measure. All the theorems about pointwise convergence also hold for convergence in measure. This can be proved using the following useful fact.

Theorem 1.19 If $f_n \to f$ in measure, then there is a subsequence $f_{n_k}$ such that $f_{n_k} \to f$ pointwise a.e.

As a warm-up to this fact, we prove the easy part of the Borel-Cantelli lemma.

Lemma 1.20 If $\sum m(E_n) < \infty$, then $\limsup E_n$, the set of points $x$ that belong to $E_n$ for infinitely many $n$, has measure zero.

Remark: $\chi_{\limsup E_n} = \limsup \chi_{E_n}$.

Proof. For any $N > 0$, we have

$$m(\limsup E_n) \le m\Bigl(\bigcup_{N}^{\infty} E_n\Bigr) \le \sum_{N}^{\infty} m(E_n) \to 0$$

as $N \to \infty$.

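A nowhere differentiable function. Let $f(x) = \sum_1^{\infty} a_n \sin(b_n x)$, where $\sum a_n$ converges quickly but $b_n a_n \to \infty$ rapidly. For concreteness, we take $a_n = 10^{-n}$, $b_n = 10^{6n}$. Write $\Delta g = g(x + t) - g(x)$ for the change in a function $g$ when $x$ increases by $t$. Then for any $n$, we can choose $t \approx 1/b_n$ such that $\Delta\, a_n \sin(b_n x) \asymp a_n$. For $k < n$, we have

$$\sum_{k < n} \Delta\, a_k \sin(b_k x) \le \sum_{k < n} a_k b_k / b_n \asymp a_{n-1} b_{n-1} / b_n \ll a_n,$$

and for $k > n$ we have

$$\sum_{k > n} \Delta\, a_k \sin(b_k x) \le 2 \sum_{k > n} a_k \ll a_n.$$

Thus $\Delta f / \Delta x \asymp a_n / t \asymp a_n b_n \to \infty$, so $f'(x)$ does not exist.

Riemann’s ‘example’. Riemann thought that the function

$$f(x) = \sum \exp(2\pi i n^2 x)/n^2$$

was nowhere differentiable. This is almost true; however, it turns out that $f'(x)$ actually does exist at certain rational points.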

Monotone functions. We say $f : [a, b] \to \mathbb{R}$ is increasing if $x \le y \implies f(x) \le f(y)$. If $f$ or $-f$ is increasing then $f$ is monotone.

Example: write $\mathbb{Q} = \{q_1, q_2, \ldots\}$ and set $f(x) = \sum_{q_i < x} 2^{-i}$. Then $f : \mathbb{R} \to \mathbb{R}$ is monotone increasing, and $f$ has a dense set of points of discontinuity.

Theorem 2.1 A monotone function f : [a, b] → R is differentiable almost everywhere.

Thus the oscillations of the preceding example are necessary to produce nowhere differentiability. Gleason has remarked that this property of monotone functions helped lead him to his proof of Hilbert’s 5th problem (which topological groups are Lie groups?). The proof of the Theorem will use the Vitali covering lemma.

Vitali coverings. Here we use an important covering argument based on the ‘greedy algorithm’. Let $K$ be a compact subset of a metric space $(X, d)$. A collection of balls $\mathcal{B}$ forms a Vitali covering of $K$ if for every $x \in K$ and $r > 0$ there is a $B \in \mathcal{B}$ with $x \in B \subset B(x, r)$. We can be rather loose about the boundary of $B$: it is only necessary that $B(y, s) \subset B \subset \overline{B(y, s)}$ for some open ball $B(y, s)$. In the case of the real numbers, this means $B$ can be any interval except a degenerate one $[a, a]$.

Theorem 2.2 For any Vitali covering $\mathcal{B}$ of $K$, there is a sequence of disjoint balls $\langle B(y_i, r_i) \rangle$ in $\mathcal{B}$ such that $K \subset \bigcup B(y_i, 3r_i)$. In fact for any $N > 0$ we have

$$K \subset \bigcup_{1}^{N} B(y_i, r_i) \cup \bigcup_{N+1}^{\infty} B(y_i, 3r_i).$$

Proof. Since $K$ is compact, we can assume $\mathcal{B}$ is a countable set of balls whose diameters tend to zero. (For each $n$, extract from $\mathcal{B}$ a finite subcover $\mathcal{B}_n$ of $K$ by balls of diameter $< 1/n$, and replace $\mathcal{B}$ with $\bigcup_1^{\infty} \mathcal{B}_n$; it is still a cover in the sense of Vitali.)

To construct the disjoint balls, we use the greedy algorithm. Let $B(y_1, r_1)$ be the largest ball in $\mathcal{B}$, and define $B(y_{n+1}, r_{n+1})$ inductively as one of the largest balls among those in $\mathcal{B}$ disjoint from the ones already chosen, $B(y_1, r_1), \ldots, B(y_n, r_n)$.

We claim $K \subset \bigcup B(y_i, 3r_i)$. Indeed, if $x \in K$ then $x$ belongs to some ball $B(y, r) \in \mathcal{B}$. If $B(y, r)$ belongs to the sequence of chosen balls $B(y_i, r_i)$, then we are done: $x$ is covered. Otherwise, consider the first $i$ for which $r_i < r$. Since $B(y, r)$ was not chosen at the $i$th stage in the inductive definition, it must meet one of the earlier balls, say $B(y_j, r_j)$, with $j < i$. But then we have $r_j \ge r$, and since they meet, $B(y_j, 3r_j)$ contains $B(y, r)$. In particular, it contains $x$.

Now suppose we have $N > 0$ and $x \in K - \bigcup_1^N B(y_i, r_i)$. Then since the union of the first $N$ balls is closed, there is a ball $B(y, r) \in \mathcal{B}$ disjoint from the first $N$ balls and containing $x$. Once again, by the nature of the greedy algorithm $B(y, r)$ must meet $B(y_i, r_i)$ for some $i$ with $r_i \ge r$; but this time by our choice of $B(y, r)$ we can ensure that $i > N$. Since $r_i \ge r$ we have $x \in \bigcup_{N+1}^{\infty} B(y_i, 3r_i)$.
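The greedy selection can be sketched in Python for a finite family of closed intervals on the line, each given as a (center, radius) pair. Finiteness is an assumption made so that a largest ball always exists; the function names and the sample family with integer data are also illustrative.

```python
def greedy_disjoint(balls):
    """Repeatedly pick the largest ball disjoint from those already chosen."""
    chosen = []
    for y, r in sorted(balls, key=lambda b: -b[1]):  # largest radius first
        # closed intervals [y - r, y + r] and [yc - rc, yc + rc] are disjoint
        # exactly when the centers are more than r + rc apart
        if all(abs(y - yc) > r + rc for yc, rc in chosen):
            chosen.append((y, r))
    return chosen

def covered_by_tripled(ball, chosen):
    """True if the ball lies inside B(yc, 3 * rc) for some chosen ball."""
    y, r = ball
    return any(abs(y - yc) + r <= 3 * rc for yc, rc in chosen)

balls = [(0, 4), (5, 2), (9, 1), (3, 1), (12, 3), (20, 1)]
chosen = greedy_disjoint(balls)
print(chosen)
print(all(covered_by_tripled(b, chosen) for b in balls))
```

A ball that is skipped meets an earlier chosen ball of radius at least its own, which is why tripling the chosen radii recovers every ball in the family.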

Theorem 2.3 (Vitali covering lemma) For any Vitali covering $\mathcal{B}$ of a set $E \subset \mathbb{R}$ of finite measure, and $\epsilon > 0$, there is a finite collection of disjoint balls $B_1, \ldots, B_n$ in $\mathcal{B}$ with $m(E \,\triangle\, \bigcup_1^n B_i) < \epsilon$.

Proof. Since $m(E)$ is finite, we can find a compact $K$ and an open $U$ such that $K \subset E \subset U$ and $m(K)$, $m(E)$ and $m(U)$ all differ by at most $\epsilon$. Remove from $\mathcal{B}$ any balls that are not contained in $U$; then $\mathcal{B}$ is still a Vitali covering of $E$, and hence of $K$. Now extract a sequence of disjoint balls $\langle B_i = B(y_i, r_i) \rangle$ from $\mathcal{B}$ by the greedy algorithm. Then by Vitali’s Lemma, we have $m(\bigcup B_i) = \sum 2r_i \le$

Letting $\epsilon \to 0$ we find $v\, m(E) \le u\, m(E)$ and thus $m(E) = 0$.

Theorem 2.4 (Integral of the derivative) If $f : [a, b] \to \mathbb{R}$ is monotone increasing, then $\int_a^b f'(x)\, dx \le f(b) - f(a)$.

Proof. Define $f_n(x) = n(f(x + 1/n) - f(x)) \ge 0$. Then $f_n(x) \to f'(x)$ a.e., so by Fatou’s lemma we have $\int f' \le \liminf \int f_n$. But $\int f_n$ is, for $n$ large, the difference between the averages of $f$ over two disjoint intervals, so it is less than or equal to the maximum variation $f(b) - f(a)$.

Figure 1. Cantor’s function: the devil’s staircase.

Singular functions. A monotone function is singular if $f'(x) = 0$ a.e.

An example is the Cantor function or ‘devil’s staircase’,

$$f(0.a_1 a_2 a_3 \ldots) = \sum \{2^{-i} : a_i \ge 1 \text{ and } a_j \ne 1,\ 1 \le j < i\},$$

where $x = 0.a_1 a_2 a_3 \ldots$ in base 3. This monotone function has the amazing property that it is continuous, and it climbs from 0 to 1, but $f'(x) = 0$ a.e.

On the other hand, $f'(x)$ does not exist (or equals infinity) for $x$ in the Cantor set (in fact $f$ stretches intervals of length $3^{-n}$ to length $2^{-n}$), and so even for a monotone function $f'(x)$ can fail to exist on an uncountable set (necessarily of measure zero).

There is a more sophisticated example, due to Whitney, of a function $f(x, y)$ on the plane whose derivatives exist everywhere, but which is not constant on its critical set. This function describes the topography of a hill with a (fractal) road running from top to bottom passing only along the level or flat parts of the hillside.
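The Cantor function can be evaluated from ternary digits: each digit 1 or 2 at position $i$ contributes $2^{-i}$, and the expansion is cut off after the first digit 1. The Python sketch below implements this rule with exact rationals; the function name and the truncation to 40 digits are assumptions, so values at points with infinite expansions are only approximations.

```python
from fractions import Fraction

def cantor_function(x, digits=40):
    """Approximate the Cantor function at x in [0, 1] from the first `digits` ternary digits of x."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    total = Fraction(0)
    for i in range(1, digits + 1):
        x *= 3
        digit = int(x)  # next ternary digit a_i (0, 1 or 2)
        x -= digit
        if digit >= 1:
            total += Fraction(1, 2**i)
        if digit == 1:
            break       # digits after the first 1 do not contribute
    return total

print(cantor_function(Fraction(1, 3)), cantor_function(Fraction(7, 9)), cantor_function(Fraction(1, 9)))
print(float(cantor_function(Fraction(1, 4))))
```

For instance the interval $[0, 1/9]$ of length $3^{-2}$ is stretched to $[0, 1/4]$ of length $2^{-2}$.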

Bounded variation. We note that if $f = g - h$ where $g$ and $h$ are both monotone, then $f'(x)$ also exists a.e. So it is desirable to characterize the full vector space of functions spanned by the monotone functions.

A function $f : [a, b] \to \mathbb{R}$ has bounded variation if

$$\sup \sum_{1}^{n} |f(a_i) - f(a_{i-1})| = \|f\|_{BV} < \infty.$$

Here the sup is over all finite dissections of $[a, b]$ into subintervals, $a = a_0 < a_1 < \ldots < a_n = b$. This supremum is called the total variation of $f$ over $[a, b]$.

Theorem 2.5 A function f is of bounded variation iff f (x) = g(x) − h(x) where g and h are monotone increasing.

Proof. Clearly $\|f\|_{BV} = f(b) - f(a)$ if $f$ is monotone increasing, and thus $f$ has bounded variation if it is a difference of monotone functions. For the converse, define

$$f_+(x) = \sup \sum_{1}^{n} \max(0, f(a_i) - f(a_{i-1})),$$

over all partitions $a = a_0 < \ldots < a_n = x$, and similarly

$$f_-(x) = \sup \sum_{1}^{n} \max(0, -f(a_i) + f(a_{i-1})).$$

Clearly $f_+$ and $f_-$ are monotone increasing, and they are bounded since the total variation of $f$ is bounded. We claim $f(x) = f(a) + f_+(x) - f_-(x)$. To see this, note that if we refine our dissection of $[a, x]$, then both sums increase. Thus for any $\epsilon > 0$, we can find a dissection for which both sums are within $\epsilon$ of their suprema.
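For a single fixed dissection, the positive and negative sums already satisfy the identity behind this claim: their difference telescopes to $f(a_n) - f(a_0)$, while their sum is the variation sum over that dissection. The Python sketch below checks this on a list of sample values; the function name and the sample data are hypothetical.

```python
def variation_sums(values):
    """Positive and negative variation sums over one dissection, given f(a_0), ..., f(a_n)."""
    steps = list(zip(values, values[1:]))
    positive = sum(max(0, b - a) for a, b in steps)
    negative = sum(max(0, a - b) for a, b in steps)
    return positive, negative

samples = [0, 3, 1, 4, 4, 2]  # hypothetical values f(a_0), ..., f(a_5)
positive, negative = variation_sums(samples)
print(positive, negative)
print(positive - negative == samples[-1] - samples[0])
print(positive + negative == sum(abs(b - a) for a, b in zip(samples, samples[1:])))
```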