6. Topology
In this chapter, we will see how topology, the study of limits and continuity, is formalized in mathlib. This topic is rather well developed and involves quite a few layers of mathematical structure. The first layer is naive set theory already covered in Chapter 4. The next one is the theory of filters, which then allows to develop topological spaces, metric spaces and a slightly more exotic intermediate notion called a uniform space.
This chapter will contain quite a bit more mathematical explanations than the previous ones since filters in particular are not systematically taught in universities, even to future mathematicians, and are crucial in formalizations of topology in all major libraries of formalized mathematics.
This discrepancy is easy to explain. Consider limits of a function
f : ℝ → ℝ
. We can consider
the limit of f x
when x
goes to some x₀
or ±∞
, or
x₀⁺
or x₀⁻
, as well as variations where x
is not allowed to be
x₀
. We could also ask that only rational values of x
are
considered, or other constraints, but let’s stick to those 13 cases.
Then the behavior of f x
could also be anyone of those 13
possibilities. So we get 13 × 13 = 169 definitions, including
“f x
tends to +∞
when x
tends to x₀
from the right without
being equal to x₀
” etc. That’s already a lot of definitions, and they
don’t even include limits of sequences for instance.
Then come the lemmas. For instance you can compose limits. If
f x
tends to y₀
when x
tends to x₀
and
g y
tends to z₀
when y
tends to y₀
then
g ∘ f x
tends to z₀
when x
tends to x₀
.
It is important to keep track of eack kind of limit here. For instance,
if the second assumption is only a pointed limit (ie.
g y
tends to z₀
when y
tends to y₀
and y ≠ y₀
)
then the conclusion may not hold.
So we now need 13 × 13 × 13 = 2197 lemmas.
In informal maths this is not a big issue, we can prove two or three of those 2197 lemmas and let the others as exercises for the reader. This trick is not available in formalized mathematics, hence we will need a bit more mathematics: Bourbaki’s theory of filters.
6.1. Filters
Filters on a type X
are collections of sets of X
satisfying three conditions. They are mostly used to
abstract two related kinds of ideas:
limits, including including all kinds of limites discussed above, in particular finite or infinite limits of sequences, finite or infinite limits of functions at a point or at infinity, etc…
things happening eventually, including things happening for large enough
n : ℕ
, or near enough a pointx
, or for close enough pairs of points, or things happening almost everywhere in the sense of measure theory. Dually, filters can also express the idea of things happening often: for arbitrarily largen
, or at a point in any neighborhood of given a point etc…
The examples of filters appearing in these descriptions will be described in details later, but we can already name them:
(at_top : filter ℕ)
: made of sets ofℕ
containing{n | n ≥ N}
for someN
𝓝 x
: made of neighborhoods ofx
in a topological space𝓤 X
: made of entourages of a uniform space (those space are generalizations of metric spaces and topological groups)μ.a_e
: made of sets whose complement has zero measure with respect to a measureμ
.
The general definition is as follows. A filter F : filter X
is a
collection of sets F.sets : set (set X)
such that
F.univ_sets : univ ∈ F.sets
(the set of all inhabitants ofX
belongs toF.sets
)F.sets_of_superset : ∀ {U V}, U ∈ F.sets → U ⊆ V → V ∈ F.sets
(ifU
belong toF.sets
then anything larger also belongs)F.inter_sets : ∀ {U V}, U ∈ sets → V ∈ sets → U ∩ V ∈ sets
(F.sets
is stable under intersection)
In mathlib, F
is defined as a structure bundling F.sets
and its
three properties, but the properties carry no additional data, so we
really want to blur the distinction between F
and F.sets
. This
is done by defining U ∈ F
to mean U ∈ F.sets
.
However the word sets still appears in some lemma names.
One can see from the axioms above that each filter defines a notion of “sufficiently large” set: the first
condition says that univ
is sufficiently large, the second one says that a set containing a sufficiently
large set is sufficient large and the third one says that the intersection of two sufficiently large sets
is sufficiently large. However a more useful way to see filters is that
filters on a type X
can be seen as generalized elements of set X
. For instance at_top
is the
“set of very large numbers”, and 𝓝 x₀
is the “set of points very close to x₀
”.
A first precise statement is that one can turn any s : set X
into the so-called principal filter
associated to s
. This definition is already in mathlib and has a notation 𝓟
(localized in filter
),
but here we use as an opportunity to construct a filter.
open filter
def principal {α : Type*} (s : set α) : filter α :=
{ sets := {t | s ⊆ t},
univ_sets := sorry,
sets_of_superset := sorry,
inter_sets := sorry}
For our second example, we’ll use the filter at_top : filter ℕ
(we could use any type with a preorder instead of ℕ
).
example : filter ℕ :=
{ sets := {s | ∃ a, ∀ b, a ≤ b → b ∈ s},
univ_sets := sorry,
sets_of_superset := sorry,
inter_sets := sorry }
We can also directly define the filter 𝓝 x
of neighborhoods of any x : ℝ
.
A neighborhood of x
is a set containing an open interval Ioo (x - ε) (x + ε)
(of course this is only a special case of a more general construction in mathlib).
With those examples, we can already define what is means for a function f : X → Y
to converges with respect to some G : filter Y
along some F : filter X
as
def tendsto₁ {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :=
∀ V ∈ G, f ⁻¹' V ∈ F
When X = ℕ
, Y = ℝ
then tendsto₁ u at_top (𝓝 x)
is equivalent to asking that the sequence u : ℕ → ℝ
converges to the real number x
. When both X
and Y
are ℝ
, tendsto f (𝓝 x₀) (𝓝 y₀)
is equivalent to \(\lim_{x \to x₀} f(x) = y₀\). All other kinds of limits mentioned in the introduction are
also equivalent to tendsto₁
for some filters on source and target.
The definition quoted above is definitionaly equivalent to what is in mathlib, but it does not appear in this way,
hence the subscript in the name.
The issue with the above definition is that it exposes a quantifier and elements of G
, and it hides the intuition
coming from seeing filters as generalized sets. It is more efficient to
hide the ∀ V
and expose the intuition using more algebraic structure.
The first ingredient is the push-forward operation \(f_*\) associated to any map f : X → Y
:
filter.map f
. Given a filter F
on X
, we get filter.map f F : filter Y
and, definitionnaly,
V ∈ filter.map f F ↔ f ⁻¹' V ∈ F
. In this file we’ve opened the filter namespace, so
filter.map
can be written as map
. This allows to rewrite the definition of tendsto
using
the order relation on filter Y
which is reversed inclusion of the set of members:
given G H : filter Y
, G ≤ H ↔ ∀ V : set Y, V ∈ H → V ∈ G
.
def tendsto₂ {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :=
map f F ≤ G
example {X Y : Type*} (f : X → Y) (F : filter X) (G : filter Y) :
tendsto₂ f F G ↔ tendsto₁ f F G := iff.rfl
One could argue that the order relation on filters seems backward. But recall that we can see filters on X
as
generalized elements of set X
, through the inclusion of 𝓟 : set X → filter X
given by principal filters.
This inclusion is order preserving so the order relation on filter can indeed be seen as the “inclusion relation
between generalized sets”. In this analogy, push-forward is analogous to the direct image.
And indeed map f (𝓟 s) = 𝓟 (f '' s)
.
We can now understand intuitively why a sequence u : ℕ → ℝ
converges to
a point x₀
if and only if map u at_top ≤ 𝓝 x₀
: this inequality means the “direct image under u” of
“the set of very big natural numbers” is “included” in “the set of points very close to x₀
”.
As promised, the above formulation exhibits no quantifier and no set. It also allows to leverage algebraic property
of the push-forward operation. First each filter.map f
is monotone. And then filter.map
is compatible with
composition.
#check @filter.map_mono -- ∀ {α β} {m : α → β}, monotone (map m)
#check @filter.map_map -- ∀ {α β γ} {f : filter α} {m : α → β} {m' : β → γ}, map m' (map m f) = map (m' ∘ m) f
Together these two properties allow to prove that limits compose, covering in one shot all 2197 variants from the introduction. You can practice proving this statement using either the definition involving the quantifier or the algebraic definition together with the two above lemmas.
example {X Y Z : Type*} {F : filter X} {G : filter Y} {H : filter Z} {f : X → Y} {g : Y → Z}
(hf : tendsto₁ f F G) (hg : tendsto₁ g G H) : tendsto₁ (g ∘ f) F H :=
sorry
The push-forward construction uses a map to push filters from the map source to the map target.
There also a pull-back operation going in the other direction: filter.comap
and generalizing the
preimage operation on sets. For any given map f
,
filter.map f
and filter.comap f
form a Galois connexion:
filter.map_le_iff_le_comap : filter.map f F ≤ G ↔ F ≤ filter.comap f G
.
In particular this operation could be used to provided another formulation of tendsto
that would be propositionally
(but not definitionaly) equivalent to the existing one.
This operation can be used to restrict filters to a subtype. For instance suppose we have f : ℝ → ℝ
,
x₀ : ℝ
and y₀ : ℝ
and we want to state that f x
goes to y₀
when x
goes to x₀
among rational numbers.
We can pull-back the filter 𝓝 x₀
to ℚ
using the coercion map
coe : ℚ → ℝ
and state tendsto (f ∘ coe : ℚ → ℝ) (comap coe (𝓝 x₀)) (𝓝 y₀)
.
variables (f : ℝ → ℝ) (x₀ y₀ : ℝ)
#check comap (coe : ℚ → ℝ) (𝓝 x₀)
#check tendsto (f ∘ coe) (comap (coe : ℚ → ℝ) (𝓝 x₀)) (𝓝 y₀)
The pull-back operation is also compatible with composition, but contravariant.
#check @comap_comap
Let’s now move up to the plane ℝ × ℝ
and try to understand how neighborhoods of some point
(x₀, y₀)
are related to 𝓝 x₀
and 𝓝 y₀
. There is product operation
filter.prod : filter X → filter Y → filter (X × Y)
, denoted by ×ᶠ
which answers this question:
example : 𝓝 (x₀, y₀) = 𝓝 x₀ ×ᶠ 𝓝 y₀ := nhds_prod_eq
This operation is defined in term of the pull-back operation and the inf
operation:
F ×ᶠ G = (comap prod.fst F) ⊓ (comap prod.snd G)
.
Here the inf
operation refers to the lattice structure on filter X
for any type X
:
F ⊓ G
is the greatest filter that is smaller than bother F
and G
, generalizing the notion
of the intersection of sets.
A lot of proofs in mathlib use all this structure (map
, comap
, inf
, sup
, prod
)
to give algebraic proofs about convergence without ever referring to members of filters.
You can practice this on the following lemma, unfolding the definition of tendsto
and filter.prod
if needed.
#check le_inf_iff
example (f : ℕ → ℝ × ℝ) (x₀ y₀ : ℝ) :
tendsto f at_top (𝓝 (x₀, y₀)) ↔
tendsto (prod.fst ∘ f) at_top (𝓝 x₀) ∧ tendsto (prod.snd ∘ f) at_top (𝓝 y₀) :=
sorry
The ordered type filter X
is actually a complete lattice: there is a bottom element and a top element and
any set of filters on X
have an Inf
and a Sup
.
Note that in the definition of filters, given the second property
(if U
belong to F
then anything larger also belongs), the first property
(the set of all inhabitants of X
belongs to F
) is
equivalent to asking that F
is not the empty collection of sets.
This shouldn’t be confused with a more subtle question. The
definition does not prohibit ∅ ∈ F
. If ∅ ∈ F
then
∀ U : set X, U ∈ F
, hence F
is a
rather trivial filter. This trivial filter is precisely the bottom element of the complete
lattice filter X
.
This contrasts with the definition of filters in
Bourbaki who doesn’t allow filters containing the empty set.
Because of this choice, we need an
extra assumption of non-triviality in some lemmas, but gain
nicer global formal properties for the theory. We’ve seen already that it gives us a
bottom element. It also allowed us to define principal : set X → filter X
without
precondition (this map sends ∅
to ⊥
).
And it allowed to get the pull-back operation without precondition.
Indeed it can happen that comap f F = ⊥
although F ≠ ⊥
. For instance,
given x₀ : ℝ
and s : set ℝ
, the pull-back of 𝓝 x₀
under the coercion
from the subtype corresponding to s
is non-trivial if and only if x₀
belongs to the
closure of s
.
In order to easily manage lemmas that do need to assume some filter is non-trivial, there is
a type class filter.ne_bot
, so you sometimes see lemmas assuming
(F : filter X) [F.ne_bot]
. The instance database knows for instance that (at_top : filter ℕ).ne_bot
and knows that pushing forward a non-trivial filter gives a non-trivial filter, so a lemma
assuming [F.ne_bot]
will automatically apply to map u at_top
for any sequence u
.
Our tour of algebraic properties of filters and their relations to limit is essentially done,
but careful readers may have noticed that our claim that we get back the usual notions
of limits is not so clear. Superficially, it may look like tendsto u at_top (𝓝 x₀)
is stronger than the usual condition because we ask that every neighborhood of x₀
has a preimage belonging to at_top
whereas the usual definition only asks
this for the standard neighborhoods Ioo (x₀ - ε) (x₀ + ε)
.
The key is that, by definition, every neighborhood contains such a standard one.
This idea leads to the notion of a filter basis. Given F : filter X
,
a family of sets s : ι → set X is a basis for F
if
∀ U : set X, U ∈ F ↔ ∃ i, s i ⊆ U
. Actually it is more flexible to also consider
a predicate on ι
to select only some values i
. In the case of
𝓝 x₀
, we want ι
to be ℝ
, i = ε
and the predicate should select
positive ε
. So the actual basis statement is:
example (x₀ : ℝ) : has_basis (𝓝 x₀) (λ ε : ℝ, 0 < ε) (λ ε, Ioo (x₀ - ε) (x₀ + ε)) :=
nhds_basis_Ioo_pos x₀
Of course we also have a nice basis for at_top
, and the lemma filter.has_basis.tendsto_iff
allows
to reformulate a tendsto f F G
given bases for F
and G
.
example (u : ℕ → ℝ) (x₀ : ℝ) :
tendsto u at_top (𝓝 x₀) ↔ ∀ ε > 0, ∃ N, ∀ n ≥ N, u n ∈ Ioo (x₀ - ε) (x₀ + ε) :=
begin
have : at_top.has_basis (λ n : ℕ, true) Ici := at_top_basis,
rw this.tendsto_iff (nhds_basis_Ioo_pos x₀),
simp
end
We now turn to the way filters allow to efficiently handle properties that holds for sufficiently large numbers
or sufficiently close to a given point. If you did the exercices in Section 3.6,
you have very often been in the situation where some property P n
holds for sufficiently large n
and some
other property Q n
holds for sufficiently large n
. After two `cases
this gave you
N_P
and N_Q
such that ∀ n ≥ N_P, P n
and ∀ n ≥ N_Q, Q n
. Then you set N := max N_P N_Q
and
patiently proved that ∀ n ≥ N, P n ∧ Q n
. After doing this a couple of time this feels very tiresome.
The way to solve this issue is to note that “P n
and Q n
hold for large enough n
” means
{n | P n} ∈ at_top
and {n | Q n} ∈ at_top
.
Then the fact at_top
is a filter means in particular that the intersection of two elements of at_top
is again in at_top
, so {n | P n ∧ Q n} ∈ at_top
. However writing {n | P n} ∈ at_top
is not very nice.
In this situation, we use the more suggestive notation ∀ᶠ n in at_top, P n
.
Note the superscript f
which stands for filter.
The notation reads: for all n
in “the set of very large numbers”, P n
holds. Technically, this notation
stands for filter.eventually
, and the lemma filter.eventually.and
uses the intersection property of filters to do
what we described above:
example (P Q : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n) :
∀ᶠ n in at_top, P n ∧ Q n := hP.and hQ
This notation is so convenient and intuitive to use that we also have specializations
when P
is an equality or inequality statement. For instance, let u
and v
be
two sequences of real numbers. Let us state that if
u n
and v n
coincide for large enough n
then u
tends to x₀
if and only
if v
tends to x₀
. First we’ll use the generic eventually
and then the one
specialized eventually_eq
for the equality predicate. Note that both statement are
definitionaly equivalent so the same proof will be used.
example (u v : ℕ → ℝ) (h : ∀ᶠ n in at_top, u n = v n) (x₀ : ℝ) :
tendsto u at_top (𝓝 x₀) ↔ tendsto v at_top (𝓝 x₀) :=
tendsto_congr' h
example (u v : ℕ → ℝ) (h : u =ᶠ[at_top] v) (x₀ : ℝ) :
tendsto u at_top (𝓝 x₀) ↔ tendsto v at_top (𝓝 x₀) :=
tendsto_congr' h
It instructive to review the definition of filters in terms of this use with eventually
.
Given F : filter X
, for any predicates P
and Q
on X
,
the condition
univ ∈ F
ensures that(∀ x, P x) → ∀ᶠ x in F, P x
.the condition
U ∈ F → U ⊆ V → V ∈ F
ensures that(∀ᶠ x in F, P x) → (∀ x, P x → Q x) → ∀ᶠ x in F, Q x
the condition
U ∈ F → V ∈ F → U ∩ V ∈ F
ensures that(∀ᶠ x in F, P x) → (∀ᶠ x in F, Q x) → ∀ᶠ x in F, P x ∧ Q x
, as we already discussed.
#check @eventually_of_forall
#check @eventually.mono
#check @eventually.and
The item above, corresponding to eventually.mono
is also a very nice use of filters, especially combined
with eventually.and
. There is also a tactic filter_upwards
that allows to nicely combines these.
Compare:
example (P Q R : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n)
(hR : ∀ᶠ n in at_top, P n ∧ Q n → R n) :
∀ᶠ n in at_top, R n :=
begin
apply (hP.and (hQ.and hR)).mono,
rintros n ⟨h, h', h''⟩,
exact h'' ⟨h, h'⟩
end
example (P Q R : ℕ → Prop) (hP : ∀ᶠ n in at_top, P n) (hQ : ∀ᶠ n in at_top, Q n)
(hR : ∀ᶠ n in at_top, P n ∧ Q n → R n) :
∀ᶠ n in at_top, R n :=
begin
filter_upwards [hP, hQ, hR],
intros n h h' h'',
exact h'' ⟨h, h'⟩
end
Readers who know about measure theory will note that the filter μ.ae
of sets whose complement has measure zero
(aka “the set of almost every point”) is not very useful as the source or target of tendsto
but very often
used with eventually
, when saying a property holds for almost every point.
There is dual version of ∀ᶠ x in F, P x
which is much less useful, but still exists: ∃ᶠ x in F, P x
means
{x | ¬P x} ∉ f
. For instance ∃ᶠ n in at_top, P n
means there are arbitrarily large n
such that P n
.
This notation stands for filter.frequently
.
As a more sophisticated example, the statement: “if a sequence u
converges to
some x
and u n
belongs to a set M
for n
large enough then x
is in the closure of
M
” is formalized as: tendsto u at_top (𝓝 x) → (∀ᶠ n in at_top, u n ∈ M) → x ∈ closure M
,
which is a special case of mem_closure_of_tendsto
from the topology library.
See if you can prove it using the quoted lemmas, knowing that cluster_pt x F
means (𝓝 x ⊓ F).ne_bot
.
#check mem_closure_iff_cluster_pt
#check le_principal_iff
#check ne_bot_of_le
example (u : ℕ → ℝ) (M : set ℝ) (x : ℝ)
(hux : tendsto u at_top (𝓝 x)) (huM : ∀ᶠ n in at_top, u n ∈ M) : x ∈ closure M :=
sorry